CAP Theorem Explained
The CAP theorem, conjectured by Eric Brewer in 2000 and formally proven by Seth Gilbert and Nancy Lynch in 2002, states that a distributed data store can provide at most two of three guarantees: Consistency, Availability, and Partition Tolerance. Understanding it changes how you reason about database selection and system failure modes.
The Three Properties
Consistency (C)
Every read receives the most recent write or an error. All nodes in the cluster see the same data at the same time. If you write a value to node A, reading from node B immediately after must return that same value.
Availability (A)
Every request receives a non-error response — but without a guarantee that it contains the most recent write. The system stays online and responsive even if some nodes are unreachable.
Partition Tolerance (P)
The system continues operating even when network partitions occur — when messages between nodes are arbitrarily dropped or delayed. A network partition means some nodes can't communicate with others.
Why Only Two?
Network partitions are not optional in distributed systems. Physical hardware fails. Cables are cut. Cloud availability zones lose connectivity. Packets get dropped. Because P is non-negotiable in any real distributed deployment, the practical choice reduces to: when a partition happens, do you sacrifice C or A?
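The choice can be made concrete with a toy model. The sketch below is not any real database, just two hypothetical replica objects: when a partition hits, a CP-style replica refuses to answer, while an AP-style replica answers with possibly stale data.

```python
# Toy model of the partition-time choice, not a real database:
# a CP replica errors out during a partition, an AP replica
# answers anyway (and may return stale data).

class Replica:
    def __init__(self, mode):
        self.mode = mode          # "CP" or "AP"
        self.value = "v1"         # last value this replica has seen
        self.partitioned = False  # can it reach its peers?

    def read(self):
        if self.partitioned and self.mode == "CP":
            # Sacrifice availability: refuse rather than risk staleness.
            raise RuntimeError("unavailable: cannot confirm latest value")
        # Sacrifice consistency: answer, possibly with stale data.
        return self.value

cp, ap = Replica("CP"), Replica("AP")
cp.partitioned = ap.partitioned = True  # the network splits

try:
    cp.read()
except RuntimeError as e:
    print("CP:", e)       # errors until the partition heals
print("AP:", ap.read())   # responds, but "v1" may be stale
```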
CP Systems
CP systems choose consistency over availability during a partition. If a node can't confirm it has the latest data, it refuses to serve the request.
- ZooKeeper — Distributed coordination service. Uses the Zab consensus protocol; the leader must be reachable for writes. Used for distributed locks, leader election, and configuration.
- etcd — Key-value store powering Kubernetes. Uses Raft consensus. Refuses reads and writes if quorum is lost.
- HBase — Wide-column store on top of HDFS. Provides strong consistency for reads/writes within a row.
Use CP systems when correctness is non-negotiable: financial transactions, distributed locks, configuration management.
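The mechanism behind this refusal is, at a very high level, a majority quorum, which is what consensus protocols like Raft and Zab build on. A minimal sketch (the node counts are illustrative):

```python
# Majority-quorum sketch: an operation succeeds only if a strict
# majority of the cluster acknowledges it; the minority side of a
# partition rejects requests, choosing consistency over availability.

def has_quorum(reachable, cluster_size):
    """True if the reachable nodes form a strict majority."""
    return reachable > cluster_size // 2

def write(value, reachable, cluster_size=5):
    if not has_quorum(reachable, cluster_size):
        raise RuntimeError("no quorum: write rejected")
    return f"committed {value} on {reachable}/{cluster_size} nodes"

print(write("x=1", reachable=3))   # 3 of 5: majority, commit succeeds
try:
    write("x=2", reachable=2)      # 2 of 5: minority side of a partition
except RuntimeError as e:
    print(e)
```

Note that two partitions can never both hold a majority, which is why quorum systems avoid split-brain writes.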
AP Systems
AP systems choose availability over consistency during a partition. All nodes stay online and serve requests, but they may return stale data. They use eventual consistency — given enough time without new writes, all nodes converge to the same value.
- Cassandra — Wide-column store with tunable consistency. Default is eventual consistency but you can configure quorum reads/writes for stronger guarantees at the cost of latency.
- DynamoDB — AWS managed key-value store. Offers eventually consistent reads by default with strongly consistent reads as an option.
- CouchDB — Document database with multi-master replication and conflict resolution.
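One simple way eventual consistency can converge is last-write-wins (LWW): each replica tags values with a timestamp, and anti-entropy keeps the newest one. The sketch below uses made-up values and logical timestamps; real systems layer vector clocks or CRDTs on top of this idea.

```python
# Last-write-wins (LWW) convergence sketch: each replica stores
# (value, timestamp); merging keeps the entry with the later timestamp.

def merge(a, b):
    """Return the entry with the later timestamp (last write wins)."""
    return a if a[1] >= b[1] else b

# Two replicas accept conflicting writes during a partition.
replica1 = ("cart=[book]", 100)      # (value, logical timestamp)
replica2 = ("cart=[book,pen]", 105)  # later write on the other side

# When the partition heals, both replicas merge and converge.
replica1 = replica2 = merge(replica1, replica2)
print(replica1)  # both now hold ("cart=[book,pen]", 105)
```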
PACELC Extension
CAP only addresses trade-offs during a partition. PACELC, proposed by Daniel Abadi and formalized in a 2012 IEEE Computer article, adds a second trade-off: if a Partition (P) occurs, trade Availability (A) against Consistency (C); Else (E), when the system is running normally, trade Latency (L) against Consistency (C).
For example, DynamoDB is PA/EL (available during partition, low latency during normal operation). HBase is PC/EC (consistent during partition, consistent during normal operation).
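The "else" half of PACELC can be illustrated with replication strategy. In this sketch, the per-replica delays are made-up numbers: waiting for every replica (EC) pays the latency of the slowest one, while acking after the fastest replica (EL) is quick but lets slower replicas lag.

```python
# Toy illustration of the PACELC latency/consistency trade-off
# during normal (partition-free) operation. Latencies are invented
# for illustration only.

replica_latencies_ms = [5, 40, 120]  # hypothetical per-replica write delays

def write_latency(mode):
    if mode == "EC":  # wait for all replicas: consistent, slow
        return max(replica_latencies_ms)
    if mode == "EL":  # ack after the fastest replica: fast, maybe stale
        return min(replica_latencies_ms)
    raise ValueError(mode)

print(write_latency("EC"))  # 120: a read anywhere sees the write
print(write_latency("EL"))  # 5: slower replicas may still hold old data
```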
Practical Implications
In practice, modern databases offer tunable consistency. Cassandra lets you set consistency levels per query: ONE, QUORUM, or ALL. DynamoDB offers strongly consistent reads. These give you a dial rather than a binary choice — but the CAP trade-off still governs what's achievable under network partitions.
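The rule that makes these dials meaningful is quorum overlap: with N replicas, a read of R replicas and a write to W replicas must intersect whenever R + W > N, so every read sees at least one replica holding the latest write. A minimal sketch of that arithmetic:

```python
# Quorum-overlap rule behind Dynamo/Cassandra-style tunable
# consistency: any R-replica read intersects any W-replica write
# exactly when R + W > N.

def read_sees_latest_write(n, r, w):
    """True if every read quorum must overlap every write quorum."""
    return r + w > n

N = 3
print(read_sees_latest_write(N, r=2, w=2))  # QUORUM/QUORUM: True
print(read_sees_latest_write(N, r=1, w=1))  # ONE/ONE: False, stale reads possible
```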