Databases don’t all agree on what "consistent" means, and the model most distributed stores default to, eventual consistency, is also the weakest of the three covered here.
Let’s see this in action. Imagine a simple key-value store.
    // Initial state
    {
      "user:123": {
        "name": "Alice",
        "email": "alice@example.com"
      }
    }
Now, two clients try to update this record simultaneously.
Client A: Updates email
    // Client A's operation
    PUT /users/123
    {
      "name": "Alice",
      "email": "alice.new@example.com"
    }
Client B: Updates name
    // Client B's operation
    PUT /users/123
    {
      "name": "Alice Wonderland",
      "email": "alice@example.com"
    }
What happens? This is where consistency models diverge.
Strong Consistency
In a strongly consistent system, once a write operation completes, every subsequent read sees the result of that write or of a later one. It’s like the database has a single, authoritative timeline. If Client A’s write finishes first, any read after that will see the new email. If Client B’s write finishes first, any read after that will see the new name. Importantly, each PUT here replaces the whole record, so the write that is ordered last overwrites the other: once both have completed, every read returns that last record, and no read ever sees a half-applied update. Getting both the new name and the new email would require the second client to read the first client’s result before writing.
- System: A traditional relational database like PostgreSQL or MySQL, often configured for synchronous replication.
- How it works: Writes are typically coordinated across replicas. A write is only considered "successful" once it has been acknowledged by a quorum of replicas, ensuring that all subsequent reads will see it.
- Levers:
- Replication Factor: Higher replication means more nodes need to agree, which increases durability but can add latency.
- Write Concern/Quorum: Defines how many replicas must acknowledge a write before it’s committed. For strong consistency, this is often majority.
- Read Preference: For reads, you’d typically set this to primary or quorum to ensure you’re reading from a state that reflects committed writes.
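The quorum lever above rests on a simple counting argument: with N replicas, a write acknowledged by W of them and a read that consults R of them must share at least one replica whenever W + R > N. The sketch below checks that rule by brute force. The function names are hypothetical and not taken from any database, and overlap is only one ingredient of strong consistency; a real system also needs a way to order versions and handle failures.

```python
from itertools import combinations


def majority(n: int) -> int:
    """Smallest number of replicas that forms a majority of n."""
    return n // 2 + 1


def quorums_always_overlap(n: int, w: int, r: int) -> bool:
    """Brute-force check that every write set of size w shares at least
    one replica with every read set of size r, out of n replicas."""
    replicas = range(n)
    return all(
        set(write_set) & set(read_set)  # empty intersection is falsy
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


if __name__ == "__main__":
    n = 3
    # Majority writes and majority reads: W=2, R=2, so W + R > N.
    print(quorums_always_overlap(n, majority(n), majority(n)))  # True
    # Single-replica writes and reads: W=1, R=1, so W + R <= N.
    print(quorums_always_overlap(n, 1, 1))  # False
```

With three replicas, majority writes and majority reads always intersect, so a read is guaranteed to reach at least one replica that holds the latest acknowledged write. With W=1 and R=1 a read can miss it entirely.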
Eventual Consistency
This is the more common model in distributed systems. In an eventually consistent system, if no new updates are made to a given data item, eventually all reads to that item will return the last updated value. The key is "eventually" – there’s a period where different replicas might have different versions of the data.
In our example, after both clients have sent their updates, a read might return:
- The state from Client A (new email, old name).
- The state from Client B (old email, new name).
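- A merged state (new email, new name), but only if the system resolves conflicts per field, and this merge might happen later. With whole-record "last write wins", one of the two updates is discarded instead.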
The system will eventually reconcile these differences, but there’s no guarantee when.
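- System: Many NoSQL databases like Cassandra and DynamoDB (whose reads are eventually consistent by default). Amazon S3 was a classic example until it moved to strong read-after-write consistency in December 2020.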
- How it works: Writes are often applied to a subset of nodes (or even just one) and then propagated asynchronously to others. Reads might hit any available replica. Conflict resolution mechanisms (like "last write wins" based on timestamps) are used to reconcile divergent states.
- Levers:
- Consistency Level (Writes): In Cassandra, ONE means a write is acknowledged by just one node (fastest, least consistent), QUORUM means a majority (slower, more consistent), ALL (slowest, most consistent, but can block if nodes are down).
- Consistency Level (Reads): Similar to writes, ONE reads from one node, QUORUM reads from a majority and checks for conflicts.
- Read Repair: A background process that detects and corrects inconsistent replicas during read operations.
- Anti-Entropy/Hinted Handoff: Mechanisms for ensuring data is propagated even if nodes are temporarily unavailable.
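To make the divergence and later convergence concrete, here is a minimal in-memory sketch of three replicas that resolve conflicts with "last write wins". Everything in it is an assumption for illustration: the `Replica` class, the integer timestamps, and the `anti_entropy` helper are hypothetical and do not mirror how Cassandra or DynamoDB implement these mechanisms.

```python
from dataclasses import dataclass


@dataclass
class Versioned:
    value: dict
    timestamp: int  # hypothetical logical write time


class Replica:
    def __init__(self, name: str):
        self.name = name
        self.store = {}

    def write(self, key: str, value: dict, timestamp: int) -> None:
        current = self.store.get(key)
        # Last write wins: keep the version with the higher timestamp.
        if current is None or timestamp > current.timestamp:
            self.store[key] = Versioned(value, timestamp)

    def read(self, key: str):
        return self.store.get(key)


def anti_entropy(replicas, key: str) -> None:
    """Copy the newest known version of key to every replica."""
    versions = [v for v in (r.read(key) for r in replicas) if v is not None]
    if not versions:
        return
    newest = max(versions, key=lambda v: v.timestamp)
    for r in replicas:
        r.write(key, newest.value, newest.timestamp)


def demo():
    initial = {"name": "Alice", "email": "alice@example.com"}
    replicas = [Replica("r1"), Replica("r2"), Replica("r3")]
    for r in replicas:
        r.write("user:123", initial, timestamp=1)

    # Client A's write reaches only r1; Client B's reaches only r2.
    replicas[0].write(
        "user:123", {"name": "Alice", "email": "alice.new@example.com"}, 2)
    replicas[1].write(
        "user:123", {"name": "Alice Wonderland", "email": "alice@example.com"}, 3)

    before = [r.read("user:123").value for r in replicas]
    anti_entropy(replicas, "user:123")
    after = [r.read("user:123").value for r in replicas]
    return before, after


if __name__ == "__main__":
    before, after = demo()
    print("before:", before)  # three different answers, one per replica
    print("after: ", after)   # every replica holds Client B's record
```

Before reconciliation, a read returns a different record depending on which replica it hits. Afterwards all three agree, but on Client B's whole record: under this resolution strategy Client A's email change is silently dropped, which is the kind of inconsistency application code may need to handle.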
Causal Consistency
This model sits between strong and eventual. It guarantees that if one operation causally precedes another, then all processes that see the second operation will also see the first. Causality here means "this operation happened before that one."
In our example:
- If Client A’s write happened before Client B’s write (meaning Client B might have read the state after Client A’s update and then decided to update the name), then any system that sees Client B’s name update must also see Client A’s email update.
- However, if the writes were independent (e.g., both clients read the initial state at the same time and decided to write), they are concurrent, and different replicas or readers may observe them in different orders.
Causal consistency preserves the order of causally related events, but not necessarily the order of concurrent events.
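- System: MongoDB offers causal consistency through causally consistent client sessions, and research systems such as COPS target it directly. Google Cloud Spanner and FoundationDB are sometimes listed here, but they provide stronger guarantees (external consistency and strict serializability, respectively), which imply causal ordering. Kafka preserves order only within a partition.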
- How it works: Systems often use vector clocks or similar mechanisms to track the history of operations. A write is assigned a vector clock, and a read must see all operations that causally precede the data it reads. When propagating writes, the system ensures that a write is only applied if all preceding writes (as indicated by its vector clock) have already been applied.
- Levers:
- Vector Clocks: The core mechanism for tracking causality. Each node maintains a list of counters, one per node in the system, including itself.
- Write Timestamping/Ordering: Ensuring that writes are assigned an order that reflects their causal dependencies.
- Read Dependencies: Reads are often blocked or delayed until all causally preceding writes have been observed.
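The vector clock mechanics described above can be sketched in a few lines. This is an illustrative sketch only: it treats the two clients as the "nodes" of the clock, represents a clock as a plain dictionary, and uses the hypothetical helper names `compare` and `can_apply`. The delivery check follows the textbook causal broadcast rule, not any specific database's implementation.

```python
from typing import Dict

VectorClock = Dict[str, int]  # node name -> counter; missing means 0


def compare(a: VectorClock, b: VectorClock) -> str:
    """Return 'before', 'after', 'equal', or 'concurrent' for a relative to b."""
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


def can_apply(local: VectorClock, write: VectorClock, sender: str) -> bool:
    """A replica may apply a write only if it is the sender's next write
    and every other write it depends on has already been applied."""
    if write.get(sender, 0) != local.get(sender, 0) + 1:
        return False
    return all(
        count <= local.get(node, 0)
        for node, count in write.items()
        if node != sender
    )


if __name__ == "__main__":
    email_write = {"A": 1}                   # Client A updates the email
    dependent_name_write = {"A": 1, "B": 1}  # Client B wrote after seeing A
    independent_name_write = {"B": 1}        # Client B wrote without seeing A

    print(compare(email_write, dependent_name_write))    # before
    print(compare(email_write, independent_name_write))  # concurrent

    # A replica that has seen nothing must hold back the dependent write...
    print(can_apply({}, dependent_name_write, "B"))        # False
    # ...until Client A's write has been applied.
    print(can_apply({"A": 1}, dependent_name_write, "B"))  # True
```

The first comparison corresponds to the causally ordered case in the example, where any replica showing the new name must already show the new email. The second corresponds to the independent case, where the clocks give no order and the system is free to expose the writes in either order.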
The surprising truth is that strong consistency, while intuitive, often comes at the cost of performance and availability in distributed systems. Many modern, high-scale systems opt for weaker models and build application-level logic to handle potential inconsistencies.
The next frontier is understanding how to manage distributed transactions across these different consistency models.