Distributed Systems
Consistent Hashing
A hashing technique that minimizes key redistribution when the number of nodes changes in a distributed system.
**Consistent hashing** maps both keys and nodes onto a circular hash space (ring).
**How it works:**
1. Hash both servers and keys onto a ring (0 to 2^32 - 1)
2. Each key is assigned to the first server clockwise from its position
3. When a server is added or removed, only the keys between that server and its predecessor on the ring are reassigned; all other keys keep their current server
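The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the `HashRing` class, the `ring_position` helper, and the choice of MD5 as the hash function are all assumptions made for the example.

```python
import bisect
import hashlib

RING_SIZE = 2**32  # ring positions run from 0 to 2^32 - 1 (step 1)


def ring_position(value: str) -> int:
    """Map a string to a ring position. MD5 is an illustrative choice only."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % RING_SIZE


class HashRing:
    """Hypothetical minimal ring: one position per server, no virtual nodes."""

    def __init__(self, servers=()):
        self._positions = []  # sorted ring positions of all servers
        self._owner = {}      # ring position -> server name
        for server in servers:
            self.add_server(server)

    def add_server(self, server: str) -> None:
        pos = ring_position(server)
        if pos in self._owner:
            raise ValueError(f"ring position collision for {server!r}")
        bisect.insort(self._positions, pos)
        self._owner[pos] = server

    def remove_server(self, server: str) -> None:
        pos = ring_position(server)
        self._positions.remove(pos)
        del self._owner[pos]

    def lookup(self, key: str) -> str:
        """Return the first server at or clockwise from the key (step 2)."""
        if not self._positions:
            raise LookupError("ring is empty")
        idx = bisect.bisect_left(self._positions, ring_position(key))
        # Wrap around to the first server when the key is past the last one.
        return self._owner[self._positions[idx % len(self._positions)]]
```

With a ring built as `HashRing(["cache-a", "cache-b", "cache-c"])`, a call such as `ring.lookup("user:42")` returns one of the three server names. Adding a fourth server only changes the result for keys that now fall on the new server's arc (step 3).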
**Virtual nodes:**
- Each physical server gets multiple positions on the ring (typically 100-200 virtual nodes per server)
- Ensures more even distribution of keys
- Without virtual nodes, a small number of servers can lead to unbalanced load
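A rough way to see the effect of virtual nodes is to count how many keys each server receives for different virtual node counts. The sketch below assumes MD5 as the hash and a hypothetical `server#i` naming scheme for virtual nodes; `build_ring`, `lookup`, and `load_per_server` are names invented for this example.

```python
import bisect
import hashlib
from collections import Counter


def ring_position(value: str) -> int:
    """Hash a string to a position in 0 .. 2^32 - 1 (MD5 is illustrative)."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def build_ring(servers, vnodes_per_server):
    """Place each server on the ring vnodes_per_server times."""
    points = []
    for server in servers:
        for i in range(vnodes_per_server):
            # "server#i" is an assumed label format for the i-th virtual node.
            points.append((ring_position(f"{server}#{i}"), server))
    points.sort()
    positions = [pos for pos, _ in points]
    owners = [server for _, server in points]
    return positions, owners


def lookup(positions, owners, key):
    """Return the physical server owning the first virtual node clockwise."""
    idx = bisect.bisect_left(positions, ring_position(key)) % len(positions)
    return owners[idx]


def load_per_server(servers, vnodes_per_server, num_keys=10_000):
    """Count how many of num_keys synthetic keys land on each server."""
    positions, owners = build_ring(servers, vnodes_per_server)
    return Counter(
        lookup(positions, owners, f"key-{i}") for i in range(num_keys)
    )
```

Comparing `load_per_server(servers, 1)` with `load_per_server(servers, 150)` for a handful of servers should usually show a much narrower spread of counts in the second case, although the exact numbers depend on the hash function and the server names.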
**Key advantage:** When adding/removing a server, only about K/N keys need redistribution on average (K=total keys, N=total servers), compared to nearly all keys with modular hashing (`hash(key) % N`), where changing N changes the result for most keys.
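The difference can be measured directly by growing a cluster from 4 to 5 servers and counting how many keys change owner under each scheme. This is a sketch under assumptions: MD5 as the hash, 100 virtual nodes per server, synthetic keys, and helper names (`modular_owner`, `make_ring`, `ring_owner`, `moved_fraction`) invented for the example.

```python
import bisect
import hashlib


def h(value: str) -> int:
    """Hash a string to 0 .. 2^32 - 1 (MD5 is an illustrative choice)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:4], "big")


def modular_owner(key: str, n_servers: int) -> int:
    """Modular hashing: the owner is simply hash(key) % N."""
    return h(key) % n_servers


def make_ring(n_servers: int, vnodes: int = 100):
    """Build a ring with `vnodes` points per server (assumed label format)."""
    points = sorted(
        (h(f"server-{s}#{v}"), s) for s in range(n_servers) for v in range(vnodes)
    )
    return [pos for pos, _ in points], [srv for _, srv in points]


def ring_owner(ring, key: str) -> int:
    """Consistent hashing: first virtual node clockwise from the key."""
    positions, owners = ring
    return owners[bisect.bisect_left(positions, h(key)) % len(positions)]


def moved_fraction(owner_before, owner_after, keys) -> float:
    """Fraction of keys whose owner differs between two assignments."""
    moved = sum(1 for k in keys if owner_before(k) != owner_after(k))
    return moved / len(keys)


keys = [f"key-{i}" for i in range(10_000)]
ring4, ring5 = make_ring(4), make_ring(5)

modular_moved = moved_fraction(
    lambda k: modular_owner(k, 4), lambda k: modular_owner(k, 5), keys
)
ring_moved = moved_fraction(
    lambda k: ring_owner(ring4, k), lambda k: ring_owner(ring5, k), keys
)
print(f"modular: {modular_moved:.0%} moved, ring: {ring_moved:.0%} moved")
```

Under these assumptions, modular hashing should move roughly four out of five keys (a key stays only when `h % 4 == h % 5`), while the ring should move roughly one in five, the share taken over by the new server. The exact percentages vary with the hash function and key set.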
**Used in:** DynamoDB, Cassandra, Memcached clients (e.g. Ketama), CDNs, load balancers.
Common Use Cases
- Distributed cache node selection (Memcached)
- Database sharding (DynamoDB, Cassandra)
- CDN request routing
- Distributed hash tables (DHTs)
Advantages
- Minimal redistribution on node changes (K/N keys)
- Virtual nodes enable even distribution
- Simple and efficient to implement
- Works well for horizontal scaling
Disadvantages
- Can still have hotspots without enough virtual nodes
- Slightly more complex than simple modular hashing
- Virtual nodes increase the memory needed for the routing table
- Servers with non-uniform capacity need virtual node counts proportional to their capacity
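The last point, proportional virtual nodes for servers of different capacity, might look like the sketch below. The capacity units, the `vnodes_per_unit` parameter, the server names, and the function names are assumptions for illustration, and MD5 is again an arbitrary hash choice.

```python
import bisect
import hashlib
from collections import Counter


def ring_position(value: str) -> int:
    """Hash a string to 0 .. 2^32 - 1 (MD5 is an illustrative choice)."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def build_weighted_ring(capacities, vnodes_per_unit=100):
    """Give each server capacity * vnodes_per_unit positions on the ring.

    `capacities` maps server name -> integer capacity in arbitrary units.
    """
    points = []
    for server, capacity in capacities.items():
        for i in range(capacity * vnodes_per_unit):
            points.append((ring_position(f"{server}#{i}"), server))
    points.sort()
    return [pos for pos, _ in points], [srv for _, srv in points]


def lookup(positions, owners, key):
    """Return the server owning the first virtual node clockwise from the key."""
    idx = bisect.bisect_left(positions, ring_position(key)) % len(positions)
    return owners[idx]


# Hypothetical cluster: one server with three times the capacity of the others.
capacities = {"big": 3, "small-a": 1, "small-b": 1}
positions, owners = build_weighted_ring(capacities)
load = Counter(lookup(positions, owners, f"key-{i}") for i in range(10_000))
```

Here `big` holds three fifths of the ring positions, so it should receive roughly three times as many keys as either small server. The routing table also grows with the total capacity, which is the memory cost noted above.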