EP 3: CTRL+BREAK — Inside the Tech Behind Spotify Wrapped Archive 2025

Tech Behind Spotify Wrapped Archive 2025

Here’s the third article for CTRL+BREAK:

https://engineering.atspotify.com/2026/3/inside-the-archive-2025-wrapped

And as always, huge thanks to the original authors — this one was a masterclass in large-scale AI systems, evaluation, data modeling, and launch engineering.

SPOTIFY WRAPPED ARCHIVE 2025 — SYSTEM BREAKDOWN
STAGE 1 — FIND REMARKABLE DAYS
350M users × 365 days of listening feed a heuristic ranker, ordered by narrative potential + statistical strength. Candidate day types: Biggest Music Day, Top Artist Day, Top Genre Day, Biggest Podcast Day, Top Podcast Day, Discovery Day, Most Nostalgic, Most Unusual, Birthday, and New Year’s Day. The top 5 days per user land in object storage, and a message queue hands them to the next stage.
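The day-selection step above can be sketched as a simple scoring pass. This is a hypothetical illustration, not Spotify's actual heuristics: each candidate day gets a score combining an assumed narrative weight for its day type with a statistical-strength measure, one candidate survives per calendar date, and the top five days per user win.

```python
from dataclasses import dataclass

# Hypothetical narrative weights per day type (not Spotify's real values).
NARRATIVE_WEIGHT = {
    "biggest_music_day": 1.0,
    "top_artist_day": 0.9,
    "discovery_day": 0.8,
    "most_nostalgic": 0.7,
    "most_unusual": 0.6,
    "birthday": 0.5,
}

@dataclass
class CandidateDay:
    date: str
    day_type: str
    stat_strength: float  # e.g. z-score of listening vs. the user's baseline

def top_remarkable_days(candidates, k=5):
    """Rank candidate days by narrative potential x statistical strength."""
    def score(day):
        return NARRATIVE_WEIGHT.get(day.day_type, 0.1) * day.stat_strength
    # One story per calendar date: keep only the best candidate for each date.
    best_per_date = {}
    for day in candidates:
        if day.date not in best_per_date or score(day) > score(best_per_date[day.date]):
            best_per_date[day.date] = day
    return sorted(best_per_date.values(), key=score, reverse=True)[:k]
```

Running this per user over all 365 days is embarrassingly parallel, which is what makes the 350M-user scale tractable.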
STAGE 2 — BUILD AI WRITER
A frontier model (expensive, high quality) and a human-curated gold dataset are distilled into a small model that is fast + cheap to run (trained with DPO). The system prompt is data-driven and sets a brand-safe tone; the user prompt carries logs + stats + country, plus the user’s previous reports so the model avoids repeating itself.
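The two-layer prompting could be assembled roughly as below. This is a minimal sketch; the prompt wording, field names, and JSON encoding are all my assumptions, not the article's actual prompts.

```python
import json

# Hypothetical system prompt; the real one is data-driven and brand-safe.
SYSTEM_PROMPT = (
    "You write short, upbeat stories about a listener's remarkable day. "
    "Use only the statistics provided. Stay on-brand and family-friendly."
)

def build_user_prompt(day_stats: dict, country: str, previous_reports: list) -> str:
    """Assemble the user prompt: logs + stats + country, plus the stories
    already written so the model can avoid repeating angles across a
    user's five reports."""
    parts = [
        f"Country: {country}",
        f"Day statistics: {json.dumps(day_stats, sort_keys=True)}",
    ]
    if previous_reports:
        parts.append("Already written (do not repeat these angles):")
        parts.extend(f"- {r}" for r in previous_reports)
    return "\n".join(parts)
```

Keeping facts in the user prompt and tone in the system prompt is what lets the same distilled model serve every user and country.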
STAGE 3 — GENERATE REPORTS
A message queue drives the generation engine: reports are generated in sequence per user but with massive parallelism across users. Output is written to a distributed KV database whose column-based layout permits safe parallel writes.
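A toy version of that concurrency model, assuming a worker pool draining a queue and a dict standing in for the distributed KV store: row key = user id, column key = report index, so concurrent writers never contend on the same cell. The function names and message shape are illustrative, not Spotify's API.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the distributed KV store. Keying each write by
# (user_id, report_index) mirrors the column-based data model:
# no two workers ever write the same cell.
kv_store = {}

def generate_report(user_id: str, report_index: int) -> str:
    # Placeholder for the real model call.
    return f"story for {user_id}, day #{report_index}"

def handle_message(msg):
    user_id, report_index = msg
    kv_store[(user_id, report_index)] = generate_report(user_id, report_index)

# Each queue message names one (user, day) pair; the pool drains it.
queue = [(u, i) for u in ("alice", "bob") for i in range(5)]
with ThreadPoolExecutor(max_workers=8) as pool:
    list(pool.map(handle_message, queue))
```

The design choice worth noting: making writes disjoint at the data-model level removes the need for locks or transactions in the hot path, which is what "Concurrency = Data Modeling" in the lessons below is getting at.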
STAGE 4 — QUALITY CONTROL
All 1.4B reports pass through automated evaluation: an LLM judge checks accuracy, safety, tone, and format. Reports that pass are done ✓; issues found enter a remediation loop: trace → fix → replay.
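The trace → fix → replay loop can be sketched as below. The judge here is a toy rule-based stand-in for the real LLM judge, and the issue categories are my assumptions; the shape of the loop (bounded regeneration until the judge passes) is the part that matters.

```python
def judge(report: str):
    """Toy stand-in for the LLM judge: returns a list of issues found."""
    issues = []
    if len(report) > 280:
        issues.append("format: too long")
    if "guaranteed" in report.lower():
        issues.append("tone: overclaiming")
    return issues

def remediate(report: str, issues, regenerate):
    """trace -> fix -> replay: re-run generation with the judge's findings
    attached, bounded so a pathological input can't loop forever."""
    for _ in range(3):
        if not issues:
            return report          # all good: ship it
        report = regenerate(report, issues)  # replay with the trace
        issues = judge(report)     # re-judge the fixed report
    return report
```

In production the "replay" step would re-invoke the generation engine from Stage 3 with the failure trace in the prompt; here a caller-supplied `regenerate` callback stands in for it.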
STAGE 5 — LAUNCH
Pre-scale everything ahead of the launch spike: compute pods, database nodes, and model capacity, all validated by load testing. The payoff: a global big-bang launch with no cold start ✓.
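The pre-scaling arithmetic is simple but worth making explicit. A minimal sketch, with entirely made-up numbers: provision for peak launch traffic plus headroom, rounding up to whole pods, so nothing scales up lazily when the spike hits.

```python
import math

def pods_needed(peak_qps: float, qps_per_pod: float, headroom: float = 0.3) -> int:
    """Capacity to pre-provision for a big-bang launch: expected peak
    traffic plus a safety margin, so there is no cold start when the
    whole world arrives at once."""
    return math.ceil(peak_qps * (1 + headroom) / qps_per_pod)
```

The same calculation applies per tier (compute pods, database nodes, model capacity), and load testing exists to validate the `qps_per_pod` number before launch day rather than during it.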
KEY LESSONS
Less Prompt = More Creativity
Prompting Needs Evaluation
Concurrency = Data Modeling
Isolation Starts in Architecture
LLM Call is Easy; Scale is Hard

Short Summary

Spotify’s Wrapped Archive 2025 identifies up to five “remarkable days” from a listener’s entire year using a ranked set of heuristics like Biggest Music Day, Discovery Day, Most Nostalgic Day, and deviations from personal taste. After computing these days for ~350M users, Spotify generated 1.4 billion AI-written stories, each created by a fine-tuned, optimized model distilled from a frontier LLM.

The pipeline includes heuristic ranking, a two-layer prompting system, a fully parallel report generator, a column-oriented storage model for safe concurrent writes, and automated LLM-based evaluation for quality control. Wrapped’s global “big bang” launch required aggressive pre-scaling and synthetic load testing across regions to ensure no cold starts.

Wrapped Archive demonstrates how AI storytelling, data engineering, infra scaling, and safety loops come together to ship a feature for hundreds of millions of users.
