
Raft Consensus Algorithm

A consensus algorithm that enables a cluster of nodes to agree on a sequence of values, even if some nodes fail. Designed to be more understandable than Paxos.

**Raft** ensures all nodes in a cluster agree on the same log of commands, enabling replicated state machines.

**Three roles:**

  • **Leader**: Handles all client requests and replicates entries to followers
  • **Follower**: Passive; responds to the leader's log entries and heartbeats
  • **Candidate**: Trying to become leader during an election

**Key mechanisms:**

  1. **Leader Election**: If a follower doesn't hear from the leader within a (randomized) heartbeat timeout, it becomes a candidate and requests votes. A majority of votes wins.
  2. **Log Replication**: The leader appends entries to its log and sends them to followers. Once a majority acknowledges an entry, it is committed.
  3. **Safety**: A node grants its vote only to a candidate whose log is at least as up-to-date as its own, so any winning candidate's log is at least as up-to-date as a majority's. This guarantees committed entries are never lost.

**Properties:**

  • Tolerates up to (N-1)/2 failures in a cluster of N nodes
  • Strong consistency (linearizable reads from the leader)
  • Typical cluster: 3 or 5 nodes

**Used by**: etcd, CockroachDB, TiKV, Consul, RabbitMQ Quorum Queues.
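The two quorum rules above — granting a vote only after the up-to-date check, and committing an entry once a majority has replicated it — can be sketched in a few lines. This is an illustrative sketch, not a real implementation; all function and field names here are hypothetical:

```python
# Illustrative sketch of two Raft quorum rules (not a full implementation).

def log_up_to_date(cand_last_term, cand_last_index, my_last_term, my_last_index):
    """RequestVote check: candidate's log is at least as up-to-date as the
    voter's — compare last entry's term first, then log length."""
    if cand_last_term != my_last_term:
        return cand_last_term > my_last_term
    return cand_last_index >= my_last_index

def grant_vote(voter, cand_term, cand_id, cand_last_term, cand_last_index):
    """A node grants at most one vote per term, and only to a candidate
    whose log passes the up-to-date check."""
    if cand_term < voter["term"]:
        return False
    if cand_term > voter["term"]:        # saw a newer term: reset our vote
        voter["term"] = cand_term
        voter["voted_for"] = None
    if voter["voted_for"] not in (None, cand_id):
        return False                     # already voted for someone else
    if not log_up_to_date(cand_last_term, cand_last_index,
                          voter["last_term"], voter["last_index"]):
        return False
    voter["voted_for"] = cand_id
    return True

def commit_index(match_index, n):
    """Leader commit rule: an entry is committed once a majority of the n
    nodes have it. match_index[i] is the highest log index known to be
    replicated on node i (the leader counts itself)."""
    return sorted(match_index, reverse=True)[n // 2]

# 5-node cluster: leader and one follower at index 7, others at 6, 3, 2.
# The majority (3 nodes) has everything up to index 6, so 6 is committed.
print(commit_index([7, 7, 6, 3, 2], 5))  # → 6
```

The `commit_index` trick works because the value at the majority position of the descending-sorted match indices is, by construction, replicated on at least ⌊N/2⌋ + 1 nodes.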

Common Use Cases

  • Distributed configuration management (etcd for Kubernetes)
  • Leader election for database primary selection
  • Replicated state machines
  • Distributed lock services

Advantages

  • Strong consistency guarantees
  • Well understood and battle-tested
  • Clear leader makes reasoning about state easier
  • Automatic leader recovery on failure

Disadvantages

  • Requires a majority of nodes (⌊N/2⌋ + 1) to be available
  • Leader is a bottleneck for all writes
  • Cross-region latency for writes (must wait for a majority)
  • Not suitable for large clusters (typically 3-5 nodes)
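The quorum arithmetic behind the first point (and the earlier (N-1)/2 fault-tolerance claim) can be checked directly; `majority` and `tolerated_failures` are just illustrative names:

```python
def majority(n):
    """Smallest quorum: more than half of n nodes."""
    return n // 2 + 1

def tolerated_failures(n):
    """Nodes that can fail while a majority still remains: (n - 1) // 2."""
    return (n - 1) // 2

for n in (3, 4, 5):
    print(f"{n} nodes: quorum {majority(n)}, tolerates {tolerated_failures(n)}")
# 3 nodes: quorum 2, tolerates 1 failure
# 4 nodes: quorum 3, tolerates 1 failure
# 5 nodes: quorum 3, tolerates 2 failures
```

Note that 4 nodes tolerate no more failures than 3 but require a larger quorum, which is why Raft clusters are deployed with odd sizes.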