← All Concepts
Distributed Systems
Gossip Protocol
A peer-to-peer communication mechanism where nodes periodically exchange state with random peers, spreading information like an epidemic.
**Gossip protocols** (also called epidemic protocols) are used to disseminate information across a distributed cluster without a central coordinator.
**How it works:**
1. Each node periodically picks a random peer
2. They exchange their state (membership, heartbeats, metadata)
3. Both nodes merge the information
4. Repeat — information spreads exponentially
**Key properties:**
- **Convergence**: All nodes eventually get the same info in O(log N) rounds
- **Fault tolerant**: No single point of failure
- **Scalable**: Each node only talks to a few peers per round
- **Eventually consistent**: Takes time for updates to propagate
**Failure detection:**
- Each node maintains a heartbeat counter
- If a node's heartbeat isn't incremented after timeout → suspected failure
- Multiple nodes must agree before marking a node as failed
**Common tools**: Cassandra (gossip for membership), Consul, SWIM protocol, Serf.
Common Use Cases
- Cluster membership and failure detection (Cassandra, DynamoDB)
- Disseminating configuration changes across nodes
- Decentralized database replication metadata
- Peer-to-peer network coordination
Advantages
- +No single point of failure
- +Scales to thousands of nodes
- +Simple to implement
- +Resilient to network partitions
Disadvantages
- -Eventually consistent — not immediate
- -Bandwidth overhead from periodic messages
- -Convergence time grows with cluster size
- -Difficult to debug in large clusters