Database Sharding
When a single database server can no longer handle the load, sharding distributes data across multiple servers. It is the most complex scalability technique for databases — understanding it conceptually is required for senior system design interviews.
Types of Partitioning:
Vertical Sharding: split by table/feature
DB1: users, sessions, auth
DB2: posts, comments, likes
DB3: orders, payments, invoices
Easy but limited — eventually each DB grows too large
Horizontal Sharding: split rows of same table
Shard 1: users where id % 4 = 0 (users 4, 8, 12...)
Shard 2: users where id % 4 = 1 (users 1, 5, 9...)
Shard 3: users where id % 4 = 2 (users 2, 6, 10...)
Shard 4: users where id % 4 = 3 (users 3, 7, 11...)
Shard Key Selection
import hashlib
class ShardRouter:
def __init__(self, shard_count: int, shards: list):
self.shard_count = shard_count
self.shards = shards # list of DB connections
def get_shard(self, key: str) -> object:
# Consistent hashing: distribute evenly
hash_val = int(hashlib.md5(str(key).encode()).hexdigest(), 16)
shard_id = hash_val % self.shard_count
return self.shards[shard_id]
router = ShardRouter(4, [db1, db2, db3, db4])
# All of user 42's data goes to same shard
shard = router.get_shard(user_id=42)
shard.query('INSERT INTO posts (user_id, title) VALUES (?, ?)', [42, 'Guide'])
Problems with Sharding
1. Cross-shard queries are expensive
SELECT * FROM posts ORDER BY views DESC LIMIT 10
→ Must query ALL shards, merge results in app
2. Cross-shard transactions are complex
→ Requires distributed transactions (2PC) or eventual consistency
3. Hot shards (uneven distribution)
→ Some users are 100x more active than others
→ Solution: consistent hashing with virtual nodes
4. Resharding is painful
→ Adding a 5th shard means moving ~20% of data
→ Solution: consistent hashing minimizes data movement
5. JOINs across shards
→ Cannot JOIN posts table (shard 1) with users table (shard 3)
→ Solution: denormalize, or route to same shard by user_id
Q: When should you shard vs other scaling strategies?
Try these first (much simpler): add read replicas, add Redis caching, optimize queries and indexes, use connection pooling, upgrade server hardware. Shard only when a single database cannot handle write throughput even with optimization, or when data volume exceeds what one server can store (typically 5-10 TB+). Twitter, Facebook, and WhatsApp all shard — but only after exhausting simpler options.
Comments (0)
No comments yet. Be the first!
Leave a Comment