← All Concepts
Distributed Systems
Vector Clocks
A mechanism for tracking causality in distributed systems. Each node maintains a vector of logical timestamps to determine the ordering of events.
**Vector clocks** solve the problem of ordering events in distributed systems where there's no global clock.
**How it works:**
- Each node maintains a vector [N1: t1, N2: t2, N3: t3] with one counter per node
- When a node does work, it increments its own counter
- When sending a message, attach the full vector
- When receiving, take element-wise max and increment own counter
**Comparing events:**
- Event A **happened before** B if all of A's counters ≤ B's counters
- Events are **concurrent** if neither happened before the other
- Concurrent events = potential conflict → application must resolve
**Example (3 nodes: A, B, C):**
```
A writes x=1 → [A:1, B:0, C:0]
B writes x=2 → [A:0, B:1, C:0] (concurrent with A!)
A sends to C → C: [A:1, B:0, C:1]
```
**Limitations:** Vector size grows with number of nodes. Use dotted version vectors or interval tree clocks for large clusters.
Common Use Cases
- Conflict detection in multi-master replication (DynamoDB, Riak)
- Determining causal ordering of events
- Optimistic concurrency control in distributed systems
- Version tracking in collaborative editing
Advantages
- +Accurately detects causality and concurrency
- +No need for synchronized clocks
- +Enables conflict detection without data loss
- +Works across network partitions
Disadvantages
- -Vector size grows linearly with number of nodes
- -Storage overhead per data item
- -Application must handle conflict resolution
- -Complex to implement correctly