
Raft Consensus Algorithm

A consensus algorithm that enables a cluster of nodes to agree on a sequence of values, even if some nodes fail. Designed to be more understandable than Paxos.

**Raft** ensures all nodes in a cluster agree on the same log of commands, enabling replicated state machines.

**Three roles:**

  • **Leader**: Handles all client requests and replicates entries to followers
  • **Follower**: Passive; responds to the leader's log entries and heartbeats
  • **Candidate**: Trying to become leader during an election

**Key mechanisms:**

  1. **Leader Election**: If a follower doesn't hear from the leader within a (randomized) heartbeat timeout, it becomes a candidate and requests votes. A majority of votes wins.
  2. **Log Replication**: The leader appends entries to its log and sends them to followers. Once a majority acknowledges an entry, it is committed.
  3. **Safety**: A node grants its vote only to a candidate whose log is at least as up-to-date as its own, so any winning candidate's log is at least as up-to-date as a majority's. This guarantees committed entries are never lost.

**Properties:**

  • Tolerates up to (N-1)/2 failures in a cluster of N nodes
  • Strong consistency (linearizable reads from the leader)
  • Typical cluster: 3 or 5 nodes

**Used by**: etcd, CockroachDB, TiKV, Consul, RabbitMQ Quorum Queues.
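The two quorum rules above — granting a vote only after the up-to-date check, and committing an entry once a majority has replicated it — can be sketched in a few lines. This is an illustrative sketch, not a real implementation; all function and field names here are hypothetical:

```python
# Illustrative sketch of two Raft quorum rules (not a full implementation).

def log_up_to_date(cand_last_term, cand_last_index, my_last_term, my_last_index):
    """RequestVote check: candidate's log is at least as up-to-date as the
    voter's — compare last entry's term first, then log length."""
    if cand_last_term != my_last_term:
        return cand_last_term > my_last_term
    return cand_last_index >= my_last_index

def grant_vote(voter, cand_term, cand_id, cand_last_term, cand_last_index):
    """A node grants at most one vote per term, and only to a candidate
    whose log passes the up-to-date check."""
    if cand_term < voter["term"]:
        return False
    if cand_term > voter["term"]:        # saw a newer term: reset our vote
        voter["term"] = cand_term
        voter["voted_for"] = None
    if voter["voted_for"] not in (None, cand_id):
        return False                     # already voted for someone else
    if not log_up_to_date(cand_last_term, cand_last_index,
                          voter["last_term"], voter["last_index"]):
        return False
    voter["voted_for"] = cand_id
    return True

def commit_index(match_index, n):
    """Leader commit rule: an entry is committed once a majority of the n
    nodes have it. match_index[i] is the highest log index known to be
    replicated on node i (the leader counts itself)."""
    return sorted(match_index, reverse=True)[n // 2]

# 5-node cluster: leader and one follower at index 7, others at 6, 3, 2.
# The majority (3 nodes) has everything up to index 6, so 6 is committed.
print(commit_index([7, 7, 6, 3, 2], 5))  # → 6
```

The `commit_index` trick works because the value at the majority position of the descending-sorted match indices is, by construction, replicated on at least ⌊N/2⌋ + 1 nodes.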

Common Use Cases

  • Distributed configuration management (etcd for Kubernetes)
  • Leader election for database primary selection
  • Replicated state machines
  • Distributed lock services

Advantages

  • Strong consistency guarantees
  • Well understood and battle-tested
  • Clear leader makes reasoning about state easier
  • Automatic leader recovery on failure

Disadvantages

  • Requires a majority of nodes (⌊N/2⌋ + 1) to be available
  • Leader is a bottleneck for all writes
  • Cross-region latency for writes (must wait for a majority)
  • Not suitable for large clusters (typically 3-5 nodes)
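The quorum arithmetic behind the first point (and the earlier (N-1)/2 fault-tolerance claim) can be checked directly; `majority` and `tolerated_failures` are just illustrative names:

```python
def majority(n):
    """Smallest quorum: more than half of n nodes."""
    return n // 2 + 1

def tolerated_failures(n):
    """Nodes that can fail while a majority still remains: (n - 1) // 2."""
    return (n - 1) // 2

for n in (3, 4, 5):
    print(f"{n} nodes: quorum {majority(n)}, tolerates {tolerated_failures(n)}")
# 3 nodes: quorum 2, tolerates 1 failure
# 4 nodes: quorum 3, tolerates 1 failure
# 5 nodes: quorum 3, tolerates 2 failures
```

Note that 4 nodes tolerate no more failures than 3 but require a larger quorum, which is why Raft clusters are deployed with odd sizes.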