CAP Theorem
A distributed system can only guarantee two of three: Consistency, Availability, and Partition Tolerance.
The Three Properties
- Consistency (C): Every read receives the most recent write or an error.
- Availability (A): Every request receives a (non-error) response, without guarantee it's the latest data.
- Partition Tolerance (P): The system continues operating despite dropped or delayed messages between nodes.
The Real Constraint
Network partitions are unavoidable in any distributed system. Therefore the real trade-off is C vs A during a partition. You must choose: do you serve stale data (AP) or reject the request (CP)?
Practical Mapping
| System | Type | Reasoning |
|---|---|---|
| Cassandra | AP | Available, eventually consistent |
| HBase | CP | Consistent, may reject requests |
| Zookeeper | CP | Consistency for coordination |
| DynamoDB | AP (tunable) | Configurable consistency levels |
PACELC Extension
CAP only models failure states. PACELC extends it: even without partitions, there's a latency vs consistency trade-off.
Related Concepts
A distributed hashing scheme that minimizes key remapping when nodes are added or removed.
Horizontal partitioning of a database across multiple machines to distribute load beyond a single server's capacity.
The problem of getting distributed nodes to agree on a single value despite network failures and partial outages. The theoretical foundation behind etcd, ZooKeeper, and Kafka leader election.