Query & Context

The Sunday Commit

On Saturday, we introduced the "Justin Bieber Problem."

You have a perfectly designed database. You shard your data by User_ID. For 99.9% of users, this is perfect. User A (Shard 1) has 500 rows. User B (Shard 2) has 600 rows.

Then Justin Bieber joins. He is User Z (Shard 4). He has 100 million rows (followers/comments). Every time he posts, Shard 4 catches fire. The CPU spikes to 100%. Meanwhile, Shards 1, 2, and 3 are sitting idle.

This is called Data Skew (or a Hot Key). It breaks distributed systems because the "Average" doesn't matter. The "Outlier" kills you.
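A quick way to see the skew is to simulate it. This is a minimal sketch, not anyone's production setup: the shard count of 4, the CRC32 hash, and the names `shard_for` and `rows_per_shard` are all assumptions made for illustration.

```python
import zlib
from collections import Counter

NUM_SHARDS = 4  # assumed shard count, for illustration only


def shard_for(user_id: str) -> int:
    """Map a user ID to a shard with a stable hash (CRC32 is an assumption)."""
    return zlib.crc32(user_id.encode("utf-8")) % NUM_SHARDS


def rows_per_shard(row_counts: dict[str, int]) -> Counter:
    """Sum the rows that land on each shard when sharding by user ID."""
    load = Counter({shard: 0 for shard in range(NUM_SHARDS)})
    for user_id, rows in row_counts.items():
        load[shard_for(user_id)] += rows
    return load


# 1,000 ordinary users with 500 rows each, plus one celebrity.
row_counts = {f"user_{i}": 500 for i in range(1000)}
row_counts["celebrity"] = 100_000_000

load = rows_per_shard(row_counts)
hot_shard = shard_for("celebrity")

for shard in range(NUM_SHARDS):
    marker = "  <- hot" if shard == hot_shard else ""
    print(f"shard {shard}: {load[shard]:>12,} rows{marker}")
```

All ordinary users combined add up to 500,000 rows, so whichever shard receives the celebrity holds at least 200x more data than the other three put together.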

1. The Solution: Salting keys

You cannot store Justin Bieber on one shard. He is too big. You must break him into pieces.

The strategy is called Key Salting (or Sharding the Shard).

Instead of storing all his data under User_ID: 12345, we append a random suffix (a salt) to his ID when we write data.

Write: When a new comment comes in for Bieber, we randomly assign it to one of ten salted keys: 12345_1, 12345_2, ..., 12345_10.
Distribute: Now, the hashing algorithm sees these as different keys.
- 12345_1 goes to Shard A.
- 12345_2 goes to Shard B.
- 12345_3 goes to Shard C.
Read: When we need to fetch his data, the application knows Bieber is a "Celebrity." It queries all the salted variants (12345_1 to 12345_10) in parallel and merges the results.

For normal users, we don't do this. We only apply this complexity to the "Hot" keys.
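The Write / Distribute / Read steps above can be sketched in a few lines. Treat this as an illustration under stated assumptions: the in-memory `shards` list stands in for real database shards, `HOT_KEYS` is a hypothetical registry of celebrity IDs, and the CRC32 hash and the salt count of 10 are picked only to match the example.

```python
import random
import zlib
from collections import defaultdict

NUM_SHARDS = 4        # assumed shard count
NUM_SALTS = 10        # matches the 12345_1 ... 12345_10 example
HOT_KEYS = {"12345"}  # hypothetical registry of "Celebrity" user IDs

# In-memory stand-in for the real shards: one key -> rows map per shard.
shards = [defaultdict(list) for _ in range(NUM_SHARDS)]


def shard_for(key: str) -> int:
    """Stable hash of the (possibly salted) key; CRC32 is an assumption."""
    return zlib.crc32(key.encode("utf-8")) % NUM_SHARDS


def write(user_id: str, row: str) -> str:
    """Store a row, salting the key only for hot users. Returns the key used."""
    key = user_id
    if user_id in HOT_KEYS:
        key = f"{user_id}_{random.randint(1, NUM_SALTS)}"
    shards[shard_for(key)][key].append(row)
    return key


def read(user_id: str) -> list[str]:
    """Fetch a user's rows, fanning out over every salted key for hot users."""
    if user_id in HOT_KEYS:
        keys = [f"{user_id}_{salt}" for salt in range(1, NUM_SALTS + 1)]
    else:
        keys = [user_id]
    rows: list[str] = []
    # Sequential here for simplicity; a real system would query in parallel.
    for key in keys:
        rows.extend(shards[shard_for(key)].get(key, []))
    return rows


for i in range(1000):
    write("12345", f"comment-{i}")
write("777", "hello")

print(len(read("12345")), len(read("777")))
```

The cost is on the read side: every read for a hot key turns into `NUM_SALTS` lookups plus a merge, which is why the salting is limited to keys that are known to be hot.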

2. Yesterday’s Solution

The puzzle was exactly this scenario: What is the specific name for the problem where one partition works much harder than others?

The Answer: Hot Spotting (or Data Skew). And the strategy to fix it is Salting (as described above).

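Another common fix in stream processing (for example, Kafka-based pipelines) is local pre-aggregation, sometimes called "Local Key By". Instead of shuffling every record immediately, each node aggregates its own data for a few seconds and sends only the partial results to the "Hot" reducer. For a very hot key this can cut network traffic dramatically, because thousands of records collapse into a single partial result per node.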
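Here is a rough sketch of that local aggregation step, assuming the job is a simple count per key. The function names and the three node buffers are made up for illustration; a real stream processor would do this inside its own windowing API.

```python
from collections import Counter


def local_aggregate(events: list[str]) -> Counter:
    """Collapse one node's buffered events into per-key counts before the shuffle."""
    return Counter(events)


def reduce_partials(partials: list[Counter]) -> Counter:
    """What the downstream reducer does: merge the partial counts from each node."""
    total: Counter = Counter()
    for partial in partials:
        total.update(partial)
    return total


# A few seconds of buffered events on three nodes (hypothetical data).
node_buffers = [
    ["bieber"] * 1000 + ["alice"] * 2,
    ["bieber"] * 800 + ["bob"] * 3,
    ["bieber"] * 1200,
]

partials = [local_aggregate(buffer) for buffer in node_buffers]

records_without = sum(len(buffer) for buffer in node_buffers)  # one per raw event
records_with = sum(len(partial) for partial in partials)       # one per key per node
totals = reduce_partials(partials)

print(records_without, records_with, totals["bieber"])
```

In this toy run the shuffle shrinks from 3,005 records to 5, and the reducer still arrives at the same total of 3,000 for the hot key. The actual saving depends on how skewed the data is and how long each node buffers.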

3. The Sunday Whiteboard

Let's look at a new problem for the week ahead.

The Scenario: You are building a URL shortener (like Bit.ly). You need to convert a long URL into a short, unique 7-character string (e.g., bit.ly/AbC123x). You have 2 database servers. You don't want to use a central "Counter" (like Redis INCR) because it's a single point of failure. You want both servers to generate IDs independently without ever colliding.

The Question: If Server A and Server B both generate random strings, they might collide. How do you configure the ID generation algorithm (Snowflake IDs) so that Server A mathematically cannot generate an ID that Server B generates?

(Reply with the concept!)
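To put a number on "they might collide," here is a rough birthday-bound estimate. It assumes the 7 characters come from a 62-symbol alphabet (a-z, A-Z, 0-9), which the scenario does not actually specify, and it uses the standard approximation p ≈ 1 - exp(-n(n-1) / 2N). It does not give away the answer to the puzzle.

```python
import math

ALPHABET_SIZE = 62  # assumption: a-z, A-Z, 0-9
ID_LENGTH = 7
KEYSPACE = ALPHABET_SIZE ** ID_LENGTH  # number of distinct 7-character strings


def collision_probability(num_ids: int, keyspace: int = KEYSPACE) -> float:
    """Birthday approximation: chance that at least two random IDs are equal."""
    exponent = num_ids * (num_ids - 1) / (2 * keyspace)
    return -math.expm1(-exponent)  # equals 1 - exp(-exponent), numerically stable


print(f"keyspace: {KEYSPACE:,}")
for n in (1_000, 100_000, 1_000_000, 10_000_000):
    print(f"{n:>10,} random IDs -> collision probability {collision_probability(n):.4f}")
```

The keyspace is about 3.5 trillion strings, yet under these assumptions the chance of at least one collision is already above 10% at a million random IDs. That is why the question asks for a scheme where collisions are impossible rather than merely unlikely.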

4. THE PULSE: Industry Signals

"The Big Ideas Behind Reliable, Scalable, and Maintainable Systems": This is a specific chapter in the Amazon Builders' Library. It explains how Amazon handles "Prime Day" traffic spikes, which is essentially the "Justin Bieber Problem" applied to an entire shopping cart system.

5. THE LATENT SPACE

"Uniformity is a myth."

In school, we learn about "Uniform Distributions." In the real world, load tends to follow a Power Law. One user will be 1000x bigger than the rest. One query will be 1000x more frequent. One file will be 1000x larger.

If you design for the average, the outlier will crush you. Design for the outlier.

See you on Monday.

Harsh Kathiriya - Query & Context
