Case: Discord's Move from Cassandra to ScyllaDB for Trillions of Messages
Era: 2015 to 2022 · Author / source: Discord Engineering blog, "How Discord Stores Trillions of Messages" (Bo Ingram, March 2023) · Read alongside: wide-column stores, garbage collection pauses, shard-per-core architectures, data-services pattern
The situation
Discord's message store is the heart of the product. Every voice ping, every server announcement, every screenshot in a community of millions, all of it eventually becomes a row in a database. By 2022 the system was holding trillions of messages.
Discord's first scale-out store was MongoDB, then Cassandra. In 2017 they ran twelve Cassandra nodes holding billions of messages. By early 2022 they were running 177 Cassandra nodes storing trillions of messages. The cluster was the largest single piece of state Discord operated. It was also, according to Discord's own engineering write-up, "high-toil," meaning the on-call team was paged frequently and node-level intervention had become routine.
Specifically: GC pauses on the JVM-based Cassandra nodes "would cause significant latency spikes," and hot partitions, where a small number of very active channels concentrated traffic, "resulted in unbounded concurrency, leading to cascading latency." Compaction backlogs forced Discord to perform what they called a "gossip dance," taking nodes out of rotation so they could compact without traffic, then putting them back. Each of these symptoms was tolerable in isolation; together they made the cluster unpredictable.
The options on the table
Discord considered several paths forward:
- Scale Cassandra harder. Add more nodes, tune JVM flags, fight the GC. Familiar pain, predictable cost, no upper bound on toil.
- Move to a managed wide-column service (DynamoDB, Bigtable). Removes operational burden but introduces vendor lock-in, per-request economics that get ugly at trillions of rows, and migration risk.
- Build on top of a row-store like CockroachDB or YugabyteDB. Strong consistency, SQL ergonomics, but unproven at Discord's write rate and unclear behavior under hot-partition load.
- Migrate to ScyllaDB. Cassandra-compatible wire protocol, but written in C++ with a shard-per-core architecture and no garbage collector.
- Build their own store. Discord's writeup acknowledges this as theoretically possible but practically dismissed. The team did not want to be the company that owns a database.
What they chose, and why
ScyllaDB. The reasoning Discord articulated:
- No JVM, no GC. ScyllaDB is "void of a garbage collector, since it's written in C++ rather than Java." The latency tail Discord was fighting was, at root, a GC tail.
- Shard-per-core architecture. ScyllaDB's design pins data shards to CPU cores, which contains the blast radius of a hot partition to a single core rather than letting it consume an entire node's resources.
- Wire compatibility with Cassandra. The CQL surface area meant Discord could keep their query patterns mostly intact; the migration was a storage swap, not a redesign of how messages are read.
The decision was paired with a second architectural choice: a layer of intermediary services Discord calls data services. As Discord describes them, these are "intermediary services that sit between our API monolith and database clusters," written in Rust. Two responsibilities matter most:
- Request coalescing. When many users open the same channel at the same time, the data service collapses concurrent reads of the same partition into a single underlying query and fans the result back out. This eliminates a class of thundering-herd problems before they reach the database.
- Consistent-hash-based routing. The data services route requests so that the same partition's traffic is concentrated on the same data-service instance, which makes coalescing effective and reduces redundant load.
Discord's writeup is explicit that this layer mattered as much as the database swap. The data store is only as good as the access pattern in front of it.
What they gave up
- Operational familiarity. ScyllaDB is a smaller community than Cassandra. Fewer Stack Overflow answers, fewer engineers who have run it at scale, fewer vendors to call.
- Years of tribal Cassandra knowledge. Every tuning lesson learned in the prior six years was now partially obsolete.
- Migration cost. Moving trillions of rows is not free. Discord designed a migration pipeline capable of moving "up to 3.2 million per second" sustained, and even at that rate the cutover took serious calendar time.
- Bet exposure to a single vendor's roadmap. Cassandra is broadly governed. ScyllaDB is principally driven by ScyllaDB Inc. Discord is now meaningfully dependent on that company's continued investment.
How it played out
The numbers are concrete. Per Discord's writeup:
- Node count: 177 Cassandra nodes -> 72 ScyllaDB nodes, a 59% reduction.
- Historical message fetch latency: p99 dropped from 40 to 125 ms (depending on workload) to a steady 15 ms.
- Message insert latency: p99 dropped from 5 to 70 ms to a steady 5 ms.
- Operational character: Discord described the post-migration cluster as "quiet, well-behaved."
The migration completed in May 2022. The cluster has since handled the World Cup spike of 2022, which Discord cited as a moment that would historically have demanded all-hands incident response, and instead passed without drama.
The data-services pattern outlasted the database swap. It is now a general Discord pattern for any high-throughput data domain, not just messages.
Where it ties to this bank's patterns
- [[wide-column-stores]]: the underlying primer on Cassandra-style data models.
- [[garbage-collection-tail-latency]]: the failure mode that drove the choice. GC tails are not a bug; they are a property.
- [[hot-partition-mitigation]]: the architectural pain that pure database scaling cannot fix.
- [[data-services-pattern]] (a.k.a. caching aggregator, request coalescing layer): the equally important second decision.
- Problem links: any system-design problem involving high-write chat, social timelines, or feed storage at scale.
What a candidate should take away
- The database is half the story; the access pattern is the other half. Discord's gains came from ScyllaDB plus the Rust data services, not from ScyllaDB alone.
- Latency tails are usually GC tails. If your tail is dominated by occasional 200 ms spikes on an otherwise 5 ms p50, suspect the runtime before you suspect the schema.
- Hot partitions are inherent to social products. Some channels are 1000x more active than the median. Any design that treats partitions as uniformly loaded will fail.
- Wire compatibility is a migration accelerator. Discord did not have to rewrite query logic; they got to swap storage. That alone made the migration tractable.
- Toil is a real cost. "High-toil" was the phrase Discord used for the Cassandra cluster. If your on-call burden is rising linearly with usage, the system is structurally wrong, not just under-tuned.
What an AI agent would not have got right
- An AI prompted to "design a chat message store at Discord's scale" will reach for Cassandra by default, because that is what the long-form blog corpus contains. It will not warn about the GC tail or the hot-partition problem.
- It will treat the database as the entire architecture and skip the request-coalescing layer. The data-services pattern is the genuinely architect-grade decision, and AI text generation rarely surfaces it without prompting.
- It will overestimate the difficulty of migration and underestimate the leverage of wire compatibility. AI advice tends toward "you cannot move trillions of rows; live with what you have," which is the wrong lesson.
- It will not flag that hot partitions in chat are inevitable. The default suggestion is "use a good hash function," which solves a problem Discord did not have.
Sources
- Discord Engineering blog, "How Discord Stores Trillions of Messages" (March 2023): https://discord.com/blog/how-discord-stores-trillions-of-messages