The Scale Out Disaster
You have 3 Redis servers (A, B, C) handling massive traffic. They are running at 90% CPU. You decide to add a 4th server (D) to help. You deploy the config change. Instantly, your database crashes.
Why? You fell into the Modulo Trap.
1. The Problem: Hash(key) % N
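A common way to spread keys across servers is simple modulo math: server_index = hash(user_id) % number_of_servers
To keep the arithmetic visible, the examples below use the user ID itself as the hash, with index 0 = Server A, 1 = Server B, and so on.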
With 3 Servers:
User 10: 10 % 3 = 1 (Server B)
User 11: 11 % 3 = 2 (Server C)
With 4 Servers (You added Server D):
User 10: 10 % 4 = 2 (Server C) [MOVED!]
User 11: 11 % 4 = 3 (Server D) [MOVED!]
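When you changed N from 3 to 4, you didn't just route some new users to Server D. You remapped about 75% of the existing keys to different servers: a key stays put only if its hash gives the same remainder mod 3 and mod 4, which is true for just 3 of every 12 hash values. Suddenly, about 75% of your cache lookups missed, and all that traffic hit the database at once.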
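To see the 75% figure for yourself, here is a minimal sketch. It assumes, like the example above, that the user ID itself stands in for the hash. The function names and the range of user IDs are made up for illustration.

```python
def server_index(user_id: int, num_servers: int) -> int:
    # Modulo placement. The user ID stands in for hash(user_id) (assumption).
    return user_id % num_servers


def moved_fraction(user_ids, old_n: int, new_n: int) -> float:
    # Fraction of keys whose server index changes when N goes from old_n to new_n.
    moved = sum(
        1 for u in user_ids
        if server_index(u, old_n) != server_index(u, new_n)
    )
    return moved / len(user_ids)


user_ids = range(12_000)  # hypothetical user IDs
fraction = moved_fraction(user_ids, 3, 4)
print(f"{fraction:.0%} of keys changed servers")  # prints "75% of keys changed servers"
```

Every one of those moved keys is a cache miss the first time it is requested after the change.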
2. The Solution: Consistent Hashing (The Ring)
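To fix this, we stop mapping keys directly to a server index. Instead, we map both Keys and Servers onto a Ring. Picture a circle from 0 to 360 degrees; real implementations use a large integer hash space that wraps around, but degrees are easier to draw.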
Place Servers on the Ring: Hash the server IP to pick a spot (e.g., Server A is at 0°, Server B at 120°, Server C at 240°).
Place Keys on the Ring: Hash the User ID to pick a spot (e.g., User 10 is at 45°).
The Rule: To find where a key lives, walk clockwise on the ring until you hit a server.
User 10 (45°) walks clockwise -> Hits Server B (120°).
The Magic: When you add Server D at 60°:
User 10 (45°) walks clockwise -> Hits Server D (60°).
Keys at 200° still hit Server C (240°).
Keys at 300° still hit Server A (0°).
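Only the keys between Server A (0°) and Server D (60°) move, from Server B to Server D. Everything else stays put. With servers spread evenly around the ring, adding a server moves only about 1/N of the keys on average, where N is the new server count. That is roughly the minimum you must move to give the new server its share. Your cache survives.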
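The clockwise walk can be sketched in a few lines. This is a toy version that takes positions in degrees directly, exactly as in the example above, instead of hashing anything. The class name DegreeRing and its methods are invented for illustration.

```python
from bisect import bisect_left, insort


class DegreeRing:
    """Toy ring: servers sit at positions in degrees (0 to 360)."""

    def __init__(self):
        self._positions = []  # sorted server positions
        self._servers = {}    # position -> server name

    def add_server(self, name: str, position: float) -> None:
        position %= 360
        if position not in self._servers:
            insort(self._positions, position)
        self._servers[position] = name

    def lookup(self, key_position: float) -> str:
        if not self._positions:
            raise LookupError("ring has no servers")
        # Walk clockwise: first server at or after the key's position.
        i = bisect_left(self._positions, key_position % 360)
        if i == len(self._positions):
            i = 0  # walked past the last server, wrap around to the first
        return self._servers[self._positions[i]]


ring = DegreeRing()
ring.add_server("A", 0)
ring.add_server("B", 120)
ring.add_server("C", 240)

key_positions = (45, 200, 300)
before = {p: ring.lookup(p) for p in key_positions}
ring.add_server("D", 60)
after = {p: ring.lookup(p) for p in key_positions}

print(before)  # {45: 'B', 200: 'C', 300: 'A'}
print(after)   # {45: 'D', 200: 'C', 300: 'A'}
```

Only the key at 45° changes owner. The keys at 200° and 300° never notice that Server D exists.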
3. THE CEREBRAL GYM: Solution & New Puzzle
Yesterday's solution (The Ring)
The puzzle was: What algorithm allows you to add servers without reshuffling all keys?
The Answer: Consistent Hashing (as described above). This is used by Amazon DynamoDB, Cassandra, and Discord to spread data and connections across nodes at massive scale.
Today's puzzle (Probabilistic Data Structures)
Monday is for Algorithms.
You are building a signup form. You need to check if the username "cool_guy_123" is already taken. You have 1 billion users. Querying the SQL database on every keystroke is too slow. You want a data structure that sits in memory and tells you:
"No, this username is definitely NOT taken."
"Yes, this username MIGHT be taken." (False positives are okay; false negatives are forbidden.)
The Question: What is the specific name of this probabilistic bit-array structure?
(Reply with the name!)
4. THE PULSE: Tools of the Day
Discord's Hash Ring Article
Discord wrote the definitive guide on this. They handle billions of messages. They realized that standard Consistent Hashing created "lumpy" distribution (some servers got hot), so they added Virtual Nodes (putting each server on the ring 100 times) to smooth it out.
Link: discord.com/blog/how-discord-scaled-elixir
Ketama (The Standard Lib)
If you need to implement this, don't write it from scratch. Ketama is the industry standard algorithm for Consistent Hashing. There are wrappers for Python (hash_ring), Go, and Node.
Link: pypi.org/project/hash_ring
HAProxy
You don't always need to code this. HAProxy supports hash-type consistent out of the box. If you are load balancing WebSockets or sticky sessions, just flip this config switch.
Link: haproxy.com/blog/load-balancing-affinity-persistence-sticky-sessions-what-you-need-to-know
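If you are curious what Virtual Nodes look like in code, here is a rough sketch for learning purposes. It is not Ketama and not Discord's implementation. The class name HashRing, the "server#i" labelling scheme, the choice of MD5, and the replicas parameter are all assumptions made for illustration. For production, reach for one of the tools above.

```python
import hashlib
from bisect import bisect_left, insort


def ring_hash(value: str) -> int:
    # MD5 is used only as a stable, well-mixed hash here (assumption).
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


class HashRing:
    def __init__(self, replicas: int = 100):
        self.replicas = replicas  # virtual nodes per server
        self._points = []         # sorted hash points on the ring
        self._owner = {}          # hash point -> server name

    def add(self, server: str) -> None:
        # Place the server on the ring once per virtual node.
        for i in range(self.replicas):
            point = ring_hash(f"{server}#{i}")
            if point not in self._owner:
                insort(self._points, point)
            self._owner[point] = server

    def remove(self, server: str) -> None:
        self._points = [p for p in self._points if self._owner[p] != server]
        self._owner = {p: s for p, s in self._owner.items() if s != server}

    def lookup(self, key: str) -> str:
        if not self._points:
            raise LookupError("ring has no servers")
        # First virtual node at or after the key's hash, wrapping at the end.
        i = bisect_left(self._points, ring_hash(key))
        return self._owner[self._points[i % len(self._points)]]


ring = HashRing()
for name in ("A", "B", "C"):
    ring.add(name)

keys = [f"user:{n}" for n in range(10_000)]  # hypothetical keys
before = {k: ring.lookup(k) for k in keys}
ring.add("D")
after = {k: ring.lookup(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved) / len(keys):.1%} of keys moved, all of them to Server D")
```

Calling ring.remove("D") sends those keys back to their previous owners and leaves every other key alone, which is the scale-down behavior you want when a node dies.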
5. THE LATENT SPACE
"A system that cannot scale down is just as broken as a system that cannot scale up."
We obsess over growing. But sometimes, nodes die. Cloud instances vanish. If losing one server causes your entire cluster to reshuffle and choke, you don't have a distributed system. You have a fragile monolith split into pieces.
Build for the failure.
See you tomorrow.
Harsh Kathiriya - Query & Context

