← All Concepts
Distributed Systems

Vector Clocks

A mechanism for tracking causality in distributed systems. Each node maintains a vector of logical timestamps to determine the ordering of events.

**Vector clocks** solve the problem of ordering events in distributed systems where there's no global clock. **How it works:** - Each node maintains a vector [N1: t1, N2: t2, N3: t3] with one counter per node - When a node does work, it increments its own counter - When sending a message, attach the full vector - When receiving, take element-wise max and increment own counter **Comparing events:** - Event A **happened before** B if all of A's counters ≤ B's counters - Events are **concurrent** if neither happened before the other - Concurrent events = potential conflict → application must resolve **Example (3 nodes: A, B, C):** ``` A writes x=1 → [A:1, B:0, C:0] B writes x=2 → [A:0, B:1, C:0] (concurrent with A!) A sends to C → C: [A:1, B:0, C:1] ``` **Limitations:** Vector size grows with number of nodes. Use dotted version vectors or interval tree clocks for large clusters.

Common Use Cases

  • Conflict detection in multi-master replication (DynamoDB, Riak)
  • Determining causal ordering of events
  • Optimistic concurrency control in distributed systems
  • Version tracking in collaborative editing

Advantages

  • +Accurately detects causality and concurrency
  • +No need for synchronized clocks
  • +Enables conflict detection without data loss
  • +Works across network partitions

Disadvantages

  • -Vector size grows linearly with number of nodes
  • -Storage overhead per data item
  • -Application must handle conflict resolution
  • -Complex to implement correctly