EP 3: CTRL+BREAK — Inside the Tech Behind Spotify Wrapped Archive 2025

Tech Behind Spotify Wrapped Archive 2025

Here’s the third article for CTRL+BREAK:

https://engineering.atspotify.com/2026/3/inside-the-archive-2025-wrapped

And as always, huge thanks to the original authors — this one was a masterclass in large-scale AI systems, evaluation, data modeling, and launch engineering.

SPOTIFY WRAPPED ARCHIVE 2025 — SYSTEM BREAKDOWN
STAGE 1 — FIND REMARKABLE DAYS
350M users × 365 days of listening feed a heuristic ranker, ordered by narrative potential + statistical strength. Candidate day types: Biggest Music Day, Top Artist Day, Top Genre Day, Biggest Podcast Day, Top Podcast Day, Discovery Day, Most Nostalgic, Most Unusual, Birthday, and New Year’s Day. The top 5 days per user land in object storage, and a message queue hands them to the next stage.
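The day-selection step above can be sketched as a simple scoring pass. This is a hypothetical illustration, not Spotify's actual heuristics: each candidate day gets a score combining an assumed narrative weight for its day type with a statistical-strength measure, one candidate survives per calendar date, and the top five days per user win.

```python
from dataclasses import dataclass

# Hypothetical narrative weights per day type (not Spotify's real values).
NARRATIVE_WEIGHT = {
    "biggest_music_day": 1.0,
    "top_artist_day": 0.9,
    "discovery_day": 0.8,
    "most_nostalgic": 0.7,
    "most_unusual": 0.6,
    "birthday": 0.5,
}

@dataclass
class CandidateDay:
    date: str
    day_type: str
    stat_strength: float  # e.g. z-score of listening vs. the user's baseline

def top_remarkable_days(candidates, k=5):
    """Rank candidate days by narrative potential x statistical strength."""
    def score(day):
        return NARRATIVE_WEIGHT.get(day.day_type, 0.1) * day.stat_strength
    # One story per calendar date: keep only the best candidate for each date.
    best_per_date = {}
    for day in candidates:
        if day.date not in best_per_date or score(day) > score(best_per_date[day.date]):
            best_per_date[day.date] = day
    return sorted(best_per_date.values(), key=score, reverse=True)[:k]
```

Running this per user over all 365 days is embarrassingly parallel, which is what makes the 350M-user scale tractable.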
STAGE 2 — BUILD AI WRITER
A frontier model (expensive, high quality) and a human-curated gold dataset are distilled into a small model that is fast + cheap to run (trained with DPO). The system prompt is data-driven and sets a brand-safe tone; the user prompt carries logs + stats + country, plus the user’s previous reports so the model avoids repeating itself.
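The two-layer prompting could be assembled roughly as below. This is a minimal sketch; the prompt wording, field names, and JSON encoding are all my assumptions, not the article's actual prompts.

```python
import json

# Hypothetical system prompt; the real one is data-driven and brand-safe.
SYSTEM_PROMPT = (
    "You write short, upbeat stories about a listener's remarkable day. "
    "Use only the statistics provided. Stay on-brand and family-friendly."
)

def build_user_prompt(day_stats: dict, country: str, previous_reports: list) -> str:
    """Assemble the user prompt: logs + stats + country, plus the stories
    already written so the model can avoid repeating angles across a
    user's five reports."""
    parts = [
        f"Country: {country}",
        f"Day statistics: {json.dumps(day_stats, sort_keys=True)}",
    ]
    if previous_reports:
        parts.append("Already written (do not repeat these angles):")
        parts.extend(f"- {r}" for r in previous_reports)
    return "\n".join(parts)
```

Keeping facts in the user prompt and tone in the system prompt is what lets the same distilled model serve every user and country.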
STAGE 3 — GENERATE REPORTS
A message queue drives the generation engine: reports are generated in sequence per user but with massive parallelism across users. Output is written to a distributed KV database whose column-based layout permits safe parallel writes.
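A toy version of that concurrency model, assuming a worker pool draining a queue and a dict standing in for the distributed KV store: row key = user id, column key = report index, so concurrent writers never contend on the same cell. The function names and message shape are illustrative, not Spotify's API.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the distributed KV store. Keying each write by
# (user_id, report_index) mirrors the column-based data model:
# no two workers ever write the same cell.
kv_store = {}

def generate_report(user_id: str, report_index: int) -> str:
    # Placeholder for the real model call.
    return f"story for {user_id}, day #{report_index}"

def handle_message(msg):
    user_id, report_index = msg
    kv_store[(user_id, report_index)] = generate_report(user_id, report_index)

# Each queue message names one (user, day) pair; the pool drains it.
queue = [(u, i) for u in ("alice", "bob") for i in range(5)]
with ThreadPoolExecutor(max_workers=8) as pool:
    list(pool.map(handle_message, queue))
```

The design choice worth noting: making writes disjoint at the data-model level removes the need for locks or transactions in the hot path, which is what "Concurrency = Data Modeling" in the lessons below is getting at.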
STAGE 4 — QUALITY CONTROL
All 1.4B reports pass through automated evaluation: an LLM judge checks accuracy, safety, tone, and format. Reports that pass are done ✓; issues found enter a remediation loop: trace → fix → replay.
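The trace → fix → replay loop can be sketched as below. The judge here is a toy rule-based stand-in for the real LLM judge, and the issue categories are my assumptions; the shape of the loop (bounded regeneration until the judge passes) is the part that matters.

```python
def judge(report: str):
    """Toy stand-in for the LLM judge: returns a list of issues found."""
    issues = []
    if len(report) > 280:
        issues.append("format: too long")
    if "guaranteed" in report.lower():
        issues.append("tone: overclaiming")
    return issues

def remediate(report: str, issues, regenerate):
    """trace -> fix -> replay: re-run generation with the judge's findings
    attached, bounded so a pathological input can't loop forever."""
    for _ in range(3):
        if not issues:
            return report          # all good: ship it
        report = regenerate(report, issues)  # replay with the trace
        issues = judge(report)     # re-judge the fixed report
    return report
```

In production the "replay" step would re-invoke the generation engine from Stage 3 with the failure trace in the prompt; here a caller-supplied `regenerate` callback stands in for it.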
STAGE 5 — LAUNCH
Pre-scale everything ahead of the launch spike: compute pods, database nodes, and model capacity, all validated by load testing. The payoff: a global big-bang launch with no cold start ✓.
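The pre-scaling arithmetic is simple but worth making explicit. A minimal sketch, with entirely made-up numbers: provision for peak launch traffic plus headroom, rounding up to whole pods, so nothing scales up lazily when the spike hits.

```python
import math

def pods_needed(peak_qps: float, qps_per_pod: float, headroom: float = 0.3) -> int:
    """Capacity to pre-provision for a big-bang launch: expected peak
    traffic plus a safety margin, so there is no cold start when the
    whole world arrives at once."""
    return math.ceil(peak_qps * (1 + headroom) / qps_per_pod)
```

The same calculation applies per tier (compute pods, database nodes, model capacity), and load testing exists to validate the `qps_per_pod` number before launch day rather than during it.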
KEY LESSONS
Less Prompt = More Creativity
Prompting Needs Evaluation
Concurrency = Data Modeling
Isolation Starts in Architecture
LLM Call is Easy; Scale is Hard

Short Summary

Spotify’s Wrapped Archive 2025 identifies up to five “remarkable days” from a listener’s entire year using a ranked set of heuristics like Biggest Music Day, Discovery Day, Most Nostalgic Day, and deviations from personal taste. After computing these days for ~350M users, Spotify generated 1.4 billion AI-written stories, each created by a fine-tuned, optimized model distilled from a frontier LLM.

The pipeline includes heuristic ranking, a two-layer prompting system, a fully parallel report generator, a column-oriented storage model for safe concurrent writes, and automated LLM-based evaluation for quality control. Wrapped’s global “big bang” launch required aggressive pre-scaling and synthetic load testing across regions to ensure no cold starts.

Wrapped Archive demonstrates how AI storytelling, data engineering, infra scaling, and safety loops come together to ship a feature for hundreds of millions of users.
