Raft Consensus Algorithm
A consensus algorithm that enables a cluster of nodes to agree on a sequence of values, even if some nodes fail. Designed to be more understandable than Paxos.
**Raft** ensures all nodes in a cluster agree on the same log of commands, enabling replicated state machines.
**Three roles:**
- **Leader**: Handles all client requests, replicates to followers
- **Follower**: Passive, responds to leader's log entries
- **Candidate**: Trying to become leader during an election
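The three roles form a small state machine: followers become candidates on an election timeout, candidates become leaders on winning a majority, and any node steps down to follower on seeing a higher term. A minimal Python sketch of these transitions (the event names are illustrative, not from any particular implementation):

```python
from enum import Enum

class Role(Enum):
    FOLLOWER = "follower"
    CANDIDATE = "candidate"
    LEADER = "leader"

# Raft's role transitions, keyed by (current role, event).
# Event names are made up for this sketch.
TRANSITIONS = {
    (Role.FOLLOWER, "election_timeout"): Role.CANDIDATE,
    (Role.CANDIDATE, "won_majority"): Role.LEADER,
    (Role.CANDIDATE, "election_timeout"): Role.CANDIDATE,  # split vote: start a new election
    (Role.CANDIDATE, "saw_higher_term"): Role.FOLLOWER,
    (Role.LEADER, "saw_higher_term"): Role.FOLLOWER,
}

def step(role: Role, event: str) -> Role:
    """Return the next role; events with no defined transition are ignored."""
    return TRANSITIONS.get((role, event), role)
```

Note that a leader never transitions directly to candidate: it only steps down when it observes a higher term, which is how a partitioned stale leader gets demoted.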
**Key mechanisms:**
1. **Leader Election**: If a follower doesn't hear from the leader (heartbeat timeout), it becomes a candidate and requests votes. Majority vote wins.
2. **Log Replication**: The leader appends entries to its log and sends them to followers. Once a majority of nodes acknowledge an entry, it is committed.
3. **Safety**: A node grants its vote only if the candidate's log is at least as up-to-date as its own. Because winning requires a majority, the new leader's log is guaranteed to contain every committed entry, so committed entries are never lost.
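The two comparisons at the heart of these mechanisms are small enough to sketch directly: the "up-to-date" check used when granting a vote (compare last terms, then last indexes), and the leader's rule for advancing the commit index (the highest index replicated on a majority). A Python sketch, with illustrative function names:

```python
def log_is_up_to_date(cand_last_term, cand_last_index,
                      my_last_term, my_last_index):
    """Vote-granting check: is the candidate's log at least as
    up-to-date as mine? Terms are compared first; on a term tie,
    the longer log wins."""
    if cand_last_term != my_last_term:
        return cand_last_term > my_last_term
    return cand_last_index >= my_last_index

def commit_index(match_index, n):
    """Leader's commit rule: the highest log index stored on a
    majority is the quorum-th highest value in match_index (which
    here includes the leader's own last index). Real Raft adds one
    more condition: the entry at that index must be from the
    leader's current term."""
    return sorted(match_index, reverse=True)[n // 2]
```

For example, with 5 nodes whose highest replicated indexes are [10, 9, 9, 5, 5], the commit index is 9: three nodes (a majority) hold entry 9, but only one holds entry 10.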
**Properties:**
- Tolerates up to floor((N-1)/2) node failures in a cluster of N nodes
- Strong consistency (linearizable reads from leader)
- Typical cluster: 3 or 5 nodes
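The fault-tolerance arithmetic is worth making concrete, since it explains the 3-or-5 sizing. A sketch:

```python
def quorum(n: int) -> int:
    """Votes/acks needed for election and commit: floor(n/2) + 1."""
    return n // 2 + 1

def max_failures(n: int) -> int:
    """Node failures tolerated while a quorum can still form."""
    return (n - 1) // 2
```

A 3-node cluster needs 2 for quorum and tolerates 1 failure; a 5-node cluster needs 3 and tolerates 2. Going from 3 to 4 nodes raises the quorum to 3 without tolerating any extra failures, which is why even-sized clusters are rarely used.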
**Used by**: etcd, CockroachDB, TiKV, Consul, RabbitMQ Quorum Queues.
**Common Use Cases**
- Distributed configuration management (etcd for Kubernetes)
- Leader election for database primary selection
- Replicated state machines
- Distributed lock services
**Advantages**
- Strong consistency guarantees
- Well-understood and battle-tested
- Clear leader makes reasoning about state easier
- Automatic leader recovery on failure
**Disadvantages**
- Requires a majority of nodes to be available (floor(N/2) + 1)
- Leader is a bottleneck for all writes
- Cross-region latency for writes (must wait for a majority)
- Not suitable for large clusters (typically 3-5 nodes)