Consensus

Categories
Systems
Sources
Designing Data-Intensive Applications

Consensus means getting several nodes to agree on a single value or a single ordering of events, even though some nodes may fail and the network is unreliable. A surprising range of problems turns out to be equivalent to consensus: leader election, atomic commit across nodes, uniqueness constraints, and totally ordered message delivery.

Why it Matters

Strong guarantees in a distributed system ultimately reduce to consensus, so recognizing that a problem "is really consensus" tells you it needs a proven algorithm and a majority (quorum), not an ad hoc fix. It also explains why such guarantees are expensive and become unavailable when a majority cannot be reached.

Signals

  • Needing exactly one leader, never two.
  • Needing all nodes to commit a change or none.
  • "We'll just pick the value with the latest timestamp" for something that must be globally agreed.
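
The timestamp signal can be illustrated with a minimal sketch. It assumes two nodes that stamp writes with their own wall clocks, one of which runs fast; the `Write` type, the `last_write_wins` helper, and the clock values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    value: str
    timestamp: float  # wall-clock time as seen by the writing node


def last_write_wins(writes):
    """Pick the write with the highest timestamp (no agreement involved)."""
    return max(writes, key=lambda w: w.timestamp)


# Assumed scenario: node A's clock runs 5 seconds fast, node B's is accurate.
skew_a = 5.0
real_time_a = 100.0  # A writes first in real time
real_time_b = 102.0  # B writes 2 seconds later in real time

write_a = Write("leader=A", real_time_a + skew_a)  # stamped 105.0
write_b = Write("leader=B", real_time_b)           # stamped 102.0

winner = last_write_wins([write_a, write_b])
print(winner.value)  # the earlier write wins because of clock skew
```

Here the write that happened later in real time is silently discarded, and nothing in the scheme makes the nodes agree on which value was chosen. That is the kind of lost update a consensus algorithm is meant to rule out.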

Benefits

A single, agreed source of truth for ordering and decisions, and correct behavior despite a minority of failures.

Risks

Rolling your own agreement protocol and hitting split brain or lost updates; demanding consensus where eventual convergence would suffice, and so paying in availability for guarantees you do not need.

Tensions

Consensus needs a majority to make progress, so it trades availability under partition for correctness, and it adds latency and operational complexity.
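
The majority requirement comes down to simple arithmetic. The sketch below assumes a fixed cluster size and a strict-majority quorum; the function names `majority` and `can_make_progress` are hypothetical.

```python
def majority(cluster_size: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return cluster_size // 2 + 1


def can_make_progress(cluster_size: int, reachable: int) -> bool:
    """A partition side can decide only if it holds a majority of the cluster."""
    return reachable >= majority(cluster_size)


# A 5-node cluster partitioned into sides of 3 and 2 nodes:
# only the 3-node side can still reach a quorum.
five_node_split = [can_make_progress(5, 3), can_make_progress(5, 2)]

# A 4-node cluster partitioned 2 / 2: neither side has a majority,
# so the whole cluster is unavailable for decisions.
four_node_split = [can_make_progress(4, 2), can_make_progress(4, 2)]

print(five_node_split, four_node_split)
```

Because two majorities of the same cluster always overlap in at least one node, two sides of a partition can never both decide. The price is that the minority side, or every side in an even split, has to stop making progress.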

Examples

Electing a single primary so two nodes never both act as leader; an atomic commit where a transaction must apply on all participating nodes or none.
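
The single-primary example can be sketched as a vote in numbered terms. This is a deliberately simplified illustration, not Raft or Paxos: it assumes each node grants at most one vote per term and a candidate needs a majority of the whole cluster, and it leaves out timeouts, log comparison, and persistence. The names `Voter`, `request_vote`, and `run_election` are hypothetical.

```python
class Voter:
    """A node that grants at most one vote per term."""

    def __init__(self):
        self.voted_for = {}  # term -> candidate that received this node's vote

    def request_vote(self, term: int, candidate: str) -> bool:
        # Record the first candidate seen for this term; later requests in
        # the same term succeed only if they come from that same candidate.
        granted_to = self.voted_for.setdefault(term, candidate)
        return granted_to == candidate


def run_election(reachable, cluster_size: int, term: int, candidate: str) -> bool:
    """Candidate wins only with votes from a majority of the full cluster."""
    votes = sum(v.request_vote(term, candidate) for v in reachable)
    return votes >= cluster_size // 2 + 1


cluster = [Voter() for _ in range(5)]

# Candidate A reaches nodes 0, 1, 2; candidate B reaches nodes 2, 3, 4.
# Node 2 has already voted for A in term 1, so B collects only two votes.
a_wins = run_election(cluster[0:3], cluster_size=5, term=1, candidate="A")
b_wins = run_election(cluster[2:5], cluster_size=5, term=1, candidate="B")

print(a_wins, b_wins)
```

Any two majorities share at least one node, and that node votes only once per term, so two candidates cannot both win the same term.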