Consistency Models

By Pritesh Yadav 22 min read

Imagine you save data in one place and it instantly appears everywhere. That is the dream. Reality is messier. In a distributed system your data lives on many machines at once (we call each copy a replica). When you change the data on one replica, the others do not learn about the change for free — the new value has to travel across the network, and the network is slow and sometimes broken. So at any instant the replicas can disagree about what the "current" value is.

This raises a hard question: when I read the data, which value am I allowed to get back? The newest one? An old one? Some value in between? A consistency model is the answer to that question. It is a contract (a written promise) between the storage system and you, the programmer, about what reads are allowed to return given the writes that happened. It tells you exactly which answers are legal and which are bugs.

Key takeaway: "Consistency" is not one thing. It is a spectrum of contracts, from very strict (every read sees the latest write) to very loose (reads may see old data for a while). A consistency model is the precise rule that says which read results are correct. Pick the model on purpose — never by accident.

6.1 Why "consistency" has so many meanings

The word "consistency" is overloaded. The "C" in the ACID properties of databases (a set of guarantees for transactions) means "the database never breaks its own rules" (like "account balance can't go negative"). The "C" in the CAP theorem (covered in the previous section) means linearizability — every read sees the latest write. The consistency models in this section are about a third, more general idea: across many replicas, what ordering of reads and writes is each observer allowed to see?

Different systems pick different contracts because the contract you choose decides three things at once:

  • Correctness — how fresh and ordered the data your code sees will be.
  • Latency — how long a read or write takes (strict contracts force machines to talk to each other before answering, which is slow).
  • Availability — whether the system can still answer when the network is partly broken.

Stronger contract = easier to reason about, but slower and less available. Weaker contract = faster and more available, but your code must tolerate surprises (like reading old data). This is the whole game.

6.2 The consistency ladder (strongest at the top)

Here is the map for the whole section. We will climb down it rung by rung. Higher = more guarantees, more coordination, more latency. Lower = fewer guarantees, less coordination, faster and more available.

        STRONGER  (more coordination, slower, less available)
        ▲
   ┌────┴─────────────────────────────────────────────┐
   │  LINEARIZABILITY    "one single up-to-date copy"  │  reads ALWAYS see latest write,
   │  (a.k.a. strong /                                  │  respects real-time order
   │   atomic / external)                               │  e.g. etcd, ZooKeeper, Spanner
   ├───────────────────────────────────────────────────┤
   │  SEQUENTIAL CONSISTENCY   "one agreed order,       │  everyone agrees on ONE order,
   │                            but not real-time"      │  but it may lag real time
   ├───────────────────────────────────────────────────┤
   │  CAUSAL CONSISTENCY   "cause comes before effect"  │  related writes seen in order;
   │  (ties back to vector clocks)                      │  unrelated writes may differ
   ├───────────────────────────────────────────────────┤
   │  CLIENT-CENTRIC GUARANTEES  "sane within YOUR own  │  read-your-writes, monotonic
   │  (session guarantees)        session"             │  reads/writes, writes-follow-reads
   ├───────────────────────────────────────────────────┤
   │  EVENTUAL CONSISTENCY   "agree LATER, if writes    │  reads may be stale; replicas
   │                          ever stop"               │  converge eventually
   └───────────────────────────────────────────────────┘
        ▼
        WEAKER  (less coordination, faster, more available)

6.3 Linearizability (strong / atomic / external consistency)

What it is. Linearizability is the strongest practical consistency model. The promise is simple to state: the whole distributed system behaves as if there is only one single copy of the data, and every operation happens instantly at one moment in time. Even though there are really many replicas behind the scenes, you can never tell. The term was defined precisely by Maurice Herlihy and Jeannette Wing in their 1990 paper "Linearizability: A Correctness Condition for Concurrent Objects" (ACM TOPLAS, vol. 12, no. 3).

The problem it solves. Without it, two people reading "the same" value at the same time can get different answers, and a read that happens after a write finishes might still return the old value. That makes programming nearly impossible for things like bank balances and locks. Linearizability removes all that confusion: once a write finishes, every later read sees it. Full stop.

How it works, step by step. An operation (a read or a write) is not instantaneous in real life — it has a start (when you send the request, the invocation) and an end (when you get the response). Linearizability requires that each operation appears to take effect at one single instant somewhere between its start and its end. Three rules must hold (as the Jepsen project states them):

  1. Single order — there is one total order of all operations (everyone agrees on it).
  2. Real-time order — if operation A finishes before operation B starts in actual wall-clock time, then A must come before B in that order. You cannot reorder things that did not overlap.
  3. Correct values — that order must make sense for the data type (a read returns the value of the most recent write before it).

The real-time rule is the heart of it. It is what makes the system feel like one fresh copy.

Real time ──────────────────────────────────────────────►

Client A:   [ write x = 5 ]
                          └─ finished here
Client B:                      [ read x ] ──► MUST return 5
                                              (B started AFTER A finished,
                                               so B cannot see the old value)

Overlapping case (allowed either way):
Client A:   [ write x = 5 .............. ]
Client B:        [ read x ] ──► may return old OR 5
                  (they overlap in time, so order is the system's choice,
                   but ALL observers must agree on whichever it picks)
Analogy: Linearizability is a single shared whiteboard in one room. Anyone who walks up writes on it, and the instant they step back, the next person to look sees exactly what was written. There is only one board, so there is no way to see stale information. The fact that the company secretly has copies of the board in other cities does not matter — the system guarantees they are kept perfectly in sync, so it looks like one board.
Example: etcd and ZooKeeper are coordination stores used to elect leaders and hold distributed locks. They give linearizable reads and writes (etcd does this with the Raft consensus algorithm). If two servers both try to grab the same lock, linearizability guarantees exactly one wins and everyone agrees who — which is the entire point of a lock. Google Spanner goes even further with a property called external consistency (linearizability extended to multi-operation transactions). It uses TrueTime, an API backed by GPS receivers and atomic clocks, to put tightly-bounded real timestamps on transactions, then deliberately waits out the clock uncertainty (the "commit wait", typically a few milliseconds) before committing. That wait is the price of strong consistency made visible.

The cost. To honor "every read sees the latest write," replicas must coordinate — usually a read or write must reach a majority of nodes (a quorum) or go through one leader. That means network round-trips on every operation. A nearby stale read might take ~1 ms; a linearizable read that must round-trip to a leader in another region can take 50–200 ms. And per the CAP theorem, linearizability cannot stay available during a network partition: when the network splits, some nodes must stop answering rather than risk returning stale data.

6.4 Sequential consistency

What it is. Sequential consistency keeps one part of linearizability and drops the other. It still requires that everyone agrees on one single global order of all operations, and that each process's own operations appear in that order in the sequence they were issued (its program order). But it drops the real-time rule. The agreed-upon order does not have to match wall-clock time. The classic definition comes from Leslie Lamport (1979): the result is the same as if all operations were executed in some sequential order, with each processor's operations appearing in the order its program issued them.

Why it exists / what it relaxes. Real-time ordering is the expensive part of linearizability, because honoring it forces tight coordination on every operation. If you only need everyone to agree on one order — but you don't care that the order exactly tracks the clock — you can sometimes go faster. The catch: a read may return a value that is technically "in the past" relative to wall time, as long as every observer sees that same consistent story.

Analogy: Think of a movie everyone watches, but on a slight broadcast delay. Everybody in the world sees the exact same scenes in the exact same order — there is one timeline and no one disagrees about what happened next. But what they see is a few seconds behind real life. The order is shared and consistent; it just isn't "live." Linearizability is the live broadcast; sequential consistency is the synchronized-but-delayed broadcast.
Example: Suppose Alice posts "A" then "B"; Bob posts "X" then "Y". Under sequential consistency every reader might see the order A, X, B, Y — Alice's A-before-B and Bob's X-before-Y are both preserved, and everyone sees that same interleaving. What is forbidden is one reader seeing A, B while another sees B, A (that breaks the single agreed order). Older multiprocessor CPU memory systems were often designed to be sequentially consistent.
Common mistake: Thinking sequential consistency means "reads are fresh." It does not. It only means there is one agreed order. A reader can sit reading a stale-but-consistent view forever, as long as that view is part of the one global order everyone agrees on. Freshness (real-time) is what linearizability adds on top.

6.5 Causal consistency

What it is. Causal consistency is weaker than sequential, but it captures the ordering that humans actually care about: cause before effect. The rule: if one operation causally precedes another, then every node must see them in that order. Operations that are concurrent (neither caused the other) may be seen in different orders on different nodes — and that is allowed.

What "causally precedes" means. Operation A causally precedes operation B if any of these hold: they were done by the same client and A came first; or B read the value that A wrote; or there is a chain of such links connecting them. This is exactly the "happens-before" relationship that vector clocks track (from the logical-clocks section). Vector clocks let each node detect, for any two operations, whether one happened-before the other or whether they are concurrent. That is precisely the information causal consistency needs to enforce its rule.

Causal chain that MUST be preserved everywhere:

  Alice: post  "Lost my keys 😫"        (operation P)
                     │  (Bob READS P, so his reply is caused by it)
                     ▼
  Bob:   comment "Found them by the door!" (operation C, C depends on P)

  Every node must show P before C.  Showing C ("Found them!") with no
  visible P makes Bob look insane. Causal consistency forbids that.

Concurrent writes (no cause link) — order may differ per node:

  Carol: post "Happy Friday!"   ─┐  neither read the other,
  Dave:  post "Good morning!"   ─┘  so node1 may show Carol-then-Dave,
                                    node2 Dave-then-Carol. Both OK.
Analogy: A group chat. A question and its answer must always appear in that order — answer-before-question is nonsense, because the answer was caused by reading the question. But two unrelated people each saying "good morning" at the same moment? It doesn't matter who appears first; nobody is confused if different phones momentarily show them in different orders. Causal consistency enforces the first rule and relaxes the second.
Example: The research system COPS ("Clusters of Order-Preserving Servers," from the 2011 paper "Don't Settle for Eventual") provides scalable causal+ consistency across data centers. It attaches explicit dependency metadata to each write so a replica can wait until everything that write depends on has arrived before exposing it. This is why a comment never shows up before the post it replies to, even across continents.

Why it matters. Causal consistency is a sweet spot: it stays available during network partitions (unlike linearizability) yet still rules out the most jarring anomalies (effects appearing before causes). It is often called the strongest consistency that a system can offer while remaining always-available.

6.6 Eventual consistency

What it is. The weakest commonly-used model. The only promise: if writes stop coming in, then after enough time all replicas will converge to the same value. That's it. While writes are still flowing, reads on different replicas may return different, stale values, and there is no promise about ordering. "Eventually" has no guaranteed deadline — usually milliseconds, occasionally much longer.

Why it exists. It buys you maximum speed and maximum availability. A read can be answered by the nearest replica with zero coordination (~1 ms), and the system keeps serving reads and writes even when nodes are cut off from each other. You trade freshness for performance and uptime — exactly the AP corner of the CAP theorem.

How it works. Replicas accept writes locally and gossip changes to each other in the background. When two replicas discover they disagree (a conflict), they resolve it with a rule — common ones are last-write-wins (the write with the highest timestamp wins) or merging with vector clocks / CRDTs (data types mathematically designed to merge without conflict).

Analogy: Email or postal mail. If you and your friend each mail a letter, you don't see each other's letter instantly. They arrive after a delay, possibly out of order. But if you both stop sending, eventually you'll both have received every letter and you'll agree on the full picture. The mail system is highly available (you can always drop a letter in the box) but never "live."
Example: Amazon DynamoDB defaults to eventually consistent reads because most reads tolerate slightly stale data, and a strongly consistent read costs twice the read capacity. Apache Cassandra is eventually consistent at its core but tunable: you choose how many replicas must acknowledge each read (R) and write (W). If you pick W + R > replication factor (e.g. QUORUM reads and writes with RF=3, so W=R=2), you regain strong consistency for that operation at the cost of more coordination. The amazon Dynamo paper (2007) and these systems are the reason eventual consistency became mainstream.
Common mistake: Believing "eventual" means "a few milliseconds, guaranteed." It is not a time bound. Under load or partition, "eventually" can stretch into seconds or longer. Never write code that assumes a write you just made is already visible on another replica.

6.7 Client-centric (session) guarantees

Plain eventual consistency is often too surprising — for example, you save your profile, reload the page, and your change is gone (because the reload hit a different, stale replica). Client-centric guarantees (also called session guarantees) fix the worst surprises for a single client's own session, without forcing global coordination. The four classic guarantees come from the Bayou project at Xerox PARC and the 1994 paper "Session Guarantees for Weakly Consistent Replicated Data" by Terry et al. A "session" is just one client's sequence of operations over time.

GuaranteePlain-English promiseExample of the bug it prevents
Read-your-writesAfter you write something, your own later reads will see it (or something newer).You change your password, then immediately log in — and the old password still works because your read hit a stale replica.
Monotonic readsOnce you've seen a value, you never see an older one later. Time only moves forward for you.You refresh a thread, see 10 comments, refresh again and see only 8 — the count went backwards.
Monotonic writesYour writes are applied in the order you made them.You set name = "Bob" then name = "Robert", but a replica applies them backwards and you end up as "Bob".
Writes-follow-readsA write you make after reading something is ordered after what you read. (Preserves cause→effect across the system.)You read a post, then reply to it, but your reply propagates to a replica that doesn't have the post yet — a reply to nothing.
Analogy: These four are the "don't gaslight one user" rules. The world (other people's data) can be fuzzy and lag behind, but your own experience should be self-consistent: what you typed is still there, your view never rewinds, your edits land in order, and your replies never precede what they replied to. It's like a notebook only you write in — even if the shared library is chaotic, your personal notebook must always make sense to you.

Why it matters. Session guarantees are how real apps make eventual-consistency stores feel sane to users. They are cheap (often implemented by "sticking" a client to one replica, or by the client carrying a small version vector of what it has already seen) and they eliminate the most infuriating, support-ticket-generating glitches without paying for full linearizability.

6.8 Putting it together: the comparison table

ModelGuarantee (what a read can return)Cost / latencyAvailable under partition?Example systemsWhen to choose it
Linearizability (strong) Always the latest write; respects real-time order; acts like one copy. Highest — quorum/leader round-trip per op (tens to hundreds of ms cross-region). No (CP — must stop to stay correct). etcd, ZooKeeper, Google Spanner, single-leader RDBMS Locks, leader election, bank balances, inventory counts, anything where stale = wrong.
Sequential One global order everyone agrees on; program order kept; not real-time. High, but no real-time clock constraint. No (still needs total-order agreement). Older CPU memory models; some replicated logs When all observers must agree on one order but exact freshness isn't required.
Causal Cause-before-effect preserved everywhere; concurrent writes may differ per node. Moderate — track dependencies (vector clocks), no global lock. Yes (strongest model that stays available). COPS, MongoDB causal sessions, AntidoteDB Social feeds, comment threads, collaborative apps — kill "effect before cause" anomalies cheaply.
Client-centric (session) Sane within your own session (read-your-writes, monotonic, etc.); world may still be fuzzy. Low — sticky replica or small client-side version vector. Yes. Bayou; bolt-on layer over Cassandra/DynamoDB Layer on top of eventual stores to remove the most jarring per-user glitches.
Eventual Maybe stale, any order; converges only after writes stop. Lowest — answer from nearest replica, no coordination (~1 ms). Yes (AP — always answers). DynamoDB (default), Cassandra (default), Riak, DNS Shopping carts, view counts, caches, like counts — high scale, staleness is harmless.

6.9 Tie-back to CAP and PACELC

The ladder is really the CAP/PACELC tradeoff viewed from the read side. CAP says when the network partitions, you must choose Consistency or Availability. PACELC (Daniel Abadi, 2010) extends it: if Partitioned, choose Availability or Consistency; Else (normal operation), choose Latency or Consistency. In other words, the consistency/latency tradeoff is paid even when nothing is broken.

  • Linearizability = more coordination. Every operation talks to a majority, so it is slower (worse latency, the "L" in PACELC) and must stop during partitions (gives up "A" in CAP). It is the PC/EC corner.
  • Eventual = less coordination. Answer locally, gossip later. Fast and always-available (the PA/EL corner), but reads can be stale.
  • Causal sits in the available band: it is the strongest contract you can keep while still answering during a partition.
Key takeaway: Moving up the ladder buys you simpler reasoning and fresher data, paid for in coordination, latency, and lost availability. Moving down buys speed and uptime, paid for with staleness and weird orderings your code must handle. There is no free lunch — only a deliberate choice per piece of data.

6.10 Real-world: choose the model per use case

The professional move is not "pick one model for the whole system." It is "pick the weakest model that is still correct for this specific data." Strong where it must be, weak where you can get away with it.

  • Shopping cart → eventual is fine. If two devices add items and a replica is briefly stale, no harm — you merge the carts later (this is exactly what Amazon's Dynamo did, even merging deleted items back in rather than losing an add). Customers tolerate "your cart updated a second late." They do not tolerate "the store was down."
  • Bank balance / product inventory → need linearizability. If a read can return a stale balance, you allow double-spends; if two buyers both read "1 left" from stale replicas, you oversell the last unit. Here staleness is a real money bug, so you pay for strong consistency (single leader or quorum, e.g. Spanner or a primary RDBMS).
  • Social feed / comments → causal is the right fit. You don't need the absolute latest global state, but a reply must never appear before the post it replies to. Causal consistency enforces exactly that ordering while staying fast and available at huge scale.
  • "My profile changes vanished on reload" → add read-your-writes (a session guarantee) on top of an otherwise-eventual store. Cheap fix, huge UX win.

Common mistakes & misconceptions

Common mistake: Treating "consistency" as a single binary (consistent vs. not). It is a ladder of named contracts. Saying "we need a consistent database" is meaningless until you say which model — linearizable, causal, eventual, etc.
Common mistake: Assuming sequential consistency or "strong-sounding" models guarantee fresh reads. Only linearizability adds the real-time rule. Sequential consistency guarantees one agreed order, which can still lag reality.
Common mistake: Believing "eventual consistency" comes with a time guarantee. "Eventually" has no deadline. Under partition it can be long. Never code as if a just-made write is already globally visible.
Common mistake: Confusing the "C" in ACID with the "C" in CAP. ACID-C means "the data obeys your integrity rules within a transaction"; CAP-C means linearizability across replicas. They are unrelated guarantees that share a letter.
Common mistake: Thinking strong consistency is "the safe default, just always use it." It costs latency on every operation (PACELC's "else" branch) and sacrifices availability during partitions. Defaulting everything to linearizable can make a system needlessly slow and fragile.
Common mistake: Forgetting that concurrent (non-causal) writes legitimately appear in different orders on different nodes under causal/eventual models. That is not a bug — it is the contract. Your conflict-resolution logic must expect it.

Best practices

Best practice: Choose consistency per data type, not per system. Use linearizability for money, inventory, locks, and identity; use causal or eventual for feeds, carts, counters, and caches. Most real systems mix several models.
Best practice: Default to the weakest model that is still correct for the use case, then strengthen only where a real bug appears. Weak-by-default keeps you fast and available; you pay for strength only where staleness causes harm.
Best practice: On eventually-consistent stores, add client-centric (session) guarantees — especially read-your-writes and monotonic reads — to eliminate the "my change disappeared / the list went backwards" glitches that generate support tickets.
Best practice: Use tunable consistency where the database offers it (e.g. Cassandra's per-query R/W levels, DynamoDB's optional strong reads). Pay the strong-read price only on the specific operations that need it.
Best practice: Make the chosen model explicit in code and docs ("inventory reads are linearizable; feed reads are causal"). An accidental, undocumented consistency model is a future outage. Test it — Jepsen-style fault-injection tests verify your system actually honors the contract it claims.
Best practice: When you must be strong and global, budget for the latency it forces (quorum round-trips, Spanner's commit-wait). Place leaders/quorums near the clients that need fresh reads to keep that cost bounded.

Section summary

  • A consistency model is a contract stating which values a read may legally return given the writes that occurred — it is a spectrum, not a yes/no.
  • Linearizability (strong): behaves like one up-to-date copy; reads always see the latest write and respect real-time order. Strongest, slowest, unavailable under partition (CP). Used by etcd, ZooKeeper, Spanner.
  • Sequential consistency: one global order everyone agrees on, preserving each client's program order — but not tied to real time, so reads can be consistent yet stale.
  • Causal consistency: cause-before-effect is preserved everywhere (tracked via vector clocks); concurrent writes may differ per node. The strongest model that stays available under partition. Great for feeds and comments (COPS).
  • Eventual consistency: only promises replicas converge after writes stop; reads may be stale and unordered. Fastest and most available (AP). Used by DynamoDB and Cassandra by default.
  • Client-centric (session) guarantees — read-your-writes, monotonic reads, monotonic writes, writes-follow-reads — make a weak store feel sane for one user's own session, cheaply (from the Bayou project).
  • CAP/PACELC tie-in: up the ladder = more coordination = slower and less available; down the ladder = faster and more available but stale. The latency tradeoff is paid even when the network is healthy (PACELC's "else").
  • In practice: shopping carts → eventual; bank balance/inventory → linearizable; social feeds/comments → causal; "my edit vanished" → add read-your-writes. Pick the weakest model that is still correct for each piece of data.

Sources:

Continue reading