Concurrent Access Control: Beyond Locks

Postgres doesn’t block readers while a writer is active; each reader sees a consistent snapshot of committed data, unaffected by writes that are still in progress.

Let’s watch this happen with a simple scenario. We’ll use psql for this.

First, open two terminals and start an interactive session in each with psql -d mydatabase. (Don’t use psql -c here: it exits as soon as its command finishes, which ends the transaction.) At the prompt in the first session, start a transaction and select some data, but don’t commit yet:

BEGIN; SELECT * FROM users WHERE id = 1;

Now, in the second session, read that same id = 1 row while the first transaction is still open:

SELECT * FROM users WHERE id = 1;

Notice that the second query returns immediately, even though the first transaction is still open. The result would be the same if the first transaction had already updated the row without committing: the second session would simply see the last committed version. This is Postgres’s Multiversion Concurrency Control (MVCC) at work.

MVCC is Postgres’s core mechanism for handling concurrent access. Instead of a single, constantly updated version of a row, Postgres maintains multiple versions of each row. When you update a row, Postgres doesn’t overwrite the old version; it writes a new version and stamps the old one with the updating transaction’s ID. The old version stays visible to any snapshot taken before the update committed, and it becomes "dead" only once no running transaction can still see it. Deletes work the same way: the row is stamped as deleted rather than removed immediately.

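This design removes most reader-writer contention. Every query reads from a snapshot: under the default READ COMMITTED level a fresh snapshot is taken at the start of each statement, while under REPEATABLE READ or SERIALIZABLE one snapshot is taken at the transaction’s first statement and reused until it ends. Writers create new versions instead of changing what readers are looking at. As a result, readers never block writers, and writers never block readers.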
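The versioning idea can be sketched as a toy in-memory model. This is only an illustration under simplifying assumptions: ToyTable, Version, and their methods are hypothetical names, the snapshot is just the set of transaction IDs committed when begin() is called (closer to REPEATABLE READ than to the per-statement snapshots of READ COMMITTED), and rollbacks are ignored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Version:
    value: str
    xmin: int                   # transaction that created this version
    xmax: Optional[int] = None  # transaction that replaced or deleted it


class ToyTable:
    """Hypothetical in-memory model; real heap storage is far more involved."""

    def __init__(self):
        self.versions = {}      # key -> list of Version, oldest first
        self.committed = set()  # IDs of committed transactions
        self.next_xid = 1

    def begin(self):
        """Start a transaction; return (xid, snapshot of committed IDs)."""
        xid = self.next_xid
        self.next_xid += 1
        return xid, frozenset(self.committed)

    def commit(self, xid):
        self.committed.add(xid)

    def write(self, xid, key, value):
        """Append a new version instead of overwriting the old one."""
        chain = self.versions.setdefault(key, [])
        if chain and chain[-1].xmax is None:
            chain[-1].xmax = xid
        chain.append(Version(value, xmin=xid))

    def read(self, xid, snapshot, key):
        """Return the newest version visible to this snapshot, or None."""
        for v in reversed(self.versions.get(key, [])):
            created = v.xmin == xid or v.xmin in snapshot
            removed = v.xmax is not None and (v.xmax == xid or v.xmax in snapshot)
            if created and not removed:
                return v.value
        return None


table = ToyTable()
setup, _ = table.begin()
table.write(setup, "users:1", "alice")
table.commit(setup)

reader, reader_snap = table.begin()
writer, writer_snap = table.begin()
table.write(writer, "users:1", "alicia")  # new version, not committed yet

before_commit = table.read(reader, reader_snap, "users:1")
table.commit(writer)
after_commit = table.read(reader, reader_snap, "users:1")

late, late_snap = table.begin()
latest = table.read(late, late_snap, "users:1")
print(before_commit, after_commit, latest)  # prints: alice alice alicia
```

The reader never waits on the writer: it keeps resolving the key to the version its snapshot can see, while a transaction that begins after the commit picks up the new version.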

Here’s a breakdown of the key components:

Transactions: Every operation in Postgres happens within a transaction. Transactions are ACID compliant (Atomicity, Consistency, Isolation, Durability). The ISOLATION LEVEL of a transaction dictates how it interacts with other concurrent transactions. The default is READ COMMITTED.
Row Versions: Each row version (tuple) carries a header that includes the system columns xmin and xmax. xmin is the ID of the transaction that inserted that version. xmax is the ID of the transaction that deleted or updated it, or 0 if none has (xmax can also record a transaction that has locked the row). When a transaction reads a row, Postgres compares xmin and xmax against the reader’s snapshot to determine which version is visible to it.
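Visibility Rules: A row version is visible to a transaction if:
- The inserting transaction (xmin) committed before the reader’s snapshot was taken, or is the reading transaction itself.
- The row has not been deleted or updated as far as the snapshot is concerned: xmax is unset, the xmax transaction rolled back, or it had not yet committed when the snapshot was taken.
- In both checks, "committed" is judged as of the snapshot: a transaction that was still in progress when the snapshot was taken counts as uncommitted, even if it has committed since.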
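Locks: MVCC removes read/write conflicts, but locks are still needed. Two transactions writing the same row are serialized by a row-level lock, and every statement takes a table-level lock whose mode determines which other operations may run alongside it. Three of the table-level modes:
- ROW EXCLUSIVE (taken by INSERT, UPDATE, DELETE): Does not conflict with itself, so many writers can work on the same table at once, and it allows readers. It conflicts with SHARE, SHARE ROW EXCLUSIVE, EXCLUSIVE, and ACCESS EXCLUSIVE. Conflicts over an individual row are handled by row-level locks, not by this mode.
- SHARE UPDATE EXCLUSIVE (taken by VACUUM without FULL, ANALYZE, and CREATE INDEX CONCURRENTLY): Conflicts with itself and with SHARE, SHARE ROW EXCLUSIVE, EXCLUSIVE, and ACCESS EXCLUSIVE, but allows readers and ROW EXCLUSIVE writers.
- EXCLUSIVE: Allows only plain reads (ACCESS SHARE) alongside it. The mode that blocks everything, including a plain SELECT, is ACCESS EXCLUSIVE, taken by commands such as DROP TABLE, TRUNCATE, and VACUUM FULL.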
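For reference, the way the table-level lock modes conflict with each other can be written down as a small lookup table. The sketch below transcribes the conflict table from the PostgreSQL documentation on explicit locking into Python; the CONFLICTS name and the conflicts helper are illustrative only, not a Postgres API.

```python
# Each table-level lock mode mapped to the modes it conflicts with.
ALL_MODES = [
    "ACCESS SHARE", "ROW SHARE", "ROW EXCLUSIVE", "SHARE UPDATE EXCLUSIVE",
    "SHARE", "SHARE ROW EXCLUSIVE", "EXCLUSIVE", "ACCESS EXCLUSIVE",
]

CONFLICTS = {
    "ACCESS SHARE": {"ACCESS EXCLUSIVE"},
    "ROW SHARE": {"EXCLUSIVE", "ACCESS EXCLUSIVE"},
    "ROW EXCLUSIVE": {"SHARE", "SHARE ROW EXCLUSIVE", "EXCLUSIVE",
                      "ACCESS EXCLUSIVE"},
    "SHARE UPDATE EXCLUSIVE": {"SHARE UPDATE EXCLUSIVE", "SHARE",
                               "SHARE ROW EXCLUSIVE", "EXCLUSIVE",
                               "ACCESS EXCLUSIVE"},
    "SHARE": {"ROW EXCLUSIVE", "SHARE UPDATE EXCLUSIVE",
              "SHARE ROW EXCLUSIVE", "EXCLUSIVE", "ACCESS EXCLUSIVE"},
    "SHARE ROW EXCLUSIVE": {"ROW EXCLUSIVE", "SHARE UPDATE EXCLUSIVE", "SHARE",
                            "SHARE ROW EXCLUSIVE", "EXCLUSIVE",
                            "ACCESS EXCLUSIVE"},
    "EXCLUSIVE": {"ROW SHARE", "ROW EXCLUSIVE", "SHARE UPDATE EXCLUSIVE",
                  "SHARE", "SHARE ROW EXCLUSIVE", "EXCLUSIVE",
                  "ACCESS EXCLUSIVE"},
    "ACCESS EXCLUSIVE": set(ALL_MODES),
}


def conflicts(held, requested):
    """True if a request for `requested` must wait while `held` is held."""
    return requested in CONFLICTS[held]


# Two writers (ROW EXCLUSIVE) can share a table.
writers_conflict = conflicts("ROW EXCLUSIVE", "ROW EXCLUSIVE")
# A plain SELECT (ACCESS SHARE) is not blocked by EXCLUSIVE.
select_vs_exclusive = conflicts("EXCLUSIVE", "ACCESS SHARE")
# A plain SELECT is blocked by ACCESS EXCLUSIVE.
select_vs_access_exclusive = conflicts("ACCESS EXCLUSIVE", "ACCESS SHARE")
```

A dictionary of sets keeps the table easy to compare against the documentation row by row. Row-level locks such as FOR UPDATE follow a separate, smaller conflict table and are not modeled here.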

Let’s see a lock in action. Run these statements at the prompt of an interactive psql session; with psql -c the session exits as soon as the command finishes, and the lock is released with it. In the first terminal:

BEGIN; SELECT * FROM users WHERE id = 1 FOR UPDATE;

This acquires a row-level FOR UPDATE lock on the row (plus a ROW SHARE lock on the table). Now, in the second terminal, try to lock that same row with FOR UPDATE:

BEGIN; SELECT * FROM users WHERE id = 1 FOR UPDATE;

This second command blocks until the first transaction commits or rolls back, because two FOR UPDATE locks on the same row conflict. A plain SELECT without FOR UPDATE would still return immediately.
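The blocking behavior can be modeled with a toy lock manager. This is a deliberately simplified sketch: RowLocks, select_for_update, and finish are hypothetical names, each row has a single exclusive holder with a first-come-first-served queue, and none of the weaker row lock modes are represented.

```python
from collections import deque


class RowLocks:
    """Hypothetical toy lock manager: one exclusive holder per row."""

    def __init__(self):
        self.holder = {}   # row id -> transaction holding the row lock
        self.waiters = {}  # row id -> queue of transactions waiting for it

    def select_for_update(self, txn, row):
        """Grant the lock if the row is free or already ours; else queue."""
        if self.holder.get(row) in (None, txn):
            self.holder[row] = txn
            return "granted"
        self.waiters.setdefault(row, deque()).append(txn)
        return "waiting"

    def finish(self, txn):
        """COMMIT or ROLLBACK: release txn's locks, wake one waiter per row."""
        woken = []
        for row, owner in list(self.holder.items()):
            if owner != txn:
                continue
            queue = self.waiters.get(row)
            if queue:
                next_txn = queue.popleft()
                self.holder[row] = next_txn
                woken.append(next_txn)
            else:
                del self.holder[row]
        return woken


locks = RowLocks()
first = locks.select_for_update("T1", 1)   # first terminal
second = locks.select_for_update("T2", 1)  # second terminal, same row
other = locks.select_for_update("T2", 2)   # a different row is unaffected
woken = locks.finish("T1")                 # T1 commits or rolls back
print(first, second, other, woken)  # prints: granted waiting granted ['T2']
```

Only the second request for the same row waits, and it is granted as soon as the first transaction finishes, whether by COMMIT or by ROLLBACK.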

The VACUUM process is critical for MVCC. Over time, dead row versions accumulate. VACUUM marks the space they occupy as reusable for new row versions (a plain VACUUM usually does not return it to the operating system), updates the visibility map, and freezes old transaction IDs to guard against wraparound. If VACUUM (or autovacuum) doesn’t run often enough, tables and indexes bloat, and performance can degrade significantly because queries must read more pages, many of them full of dead versions, to find the visible rows.
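The rule for when a version counts as dead can be sketched in a few lines. This is a simplified model under stated assumptions: the vacuum function and the horizon parameter are hypothetical, horizon stands in for the oldest transaction ID that any running snapshot might still treat as in progress, and versions left behind by rolled-back inserts are not handled.

```python
def vacuum(versions, committed, horizon):
    """Return the versions that must be kept.

    A version is treated as dead when the transaction in xmax committed
    and is older than `horizon`, so no running snapshot can still see it.
    """
    kept = []
    for v in versions:
        xmax = v["xmax"]
        dead = xmax is not None and xmax in committed and xmax < horizon
        if not dead:
            kept.append(v)
    return kept


chain = [
    {"value": "alice", "xmin": 1, "xmax": 3},
    {"value": "alicia", "xmin": 3, "xmax": 7},
    {"value": "ali", "xmin": 7, "xmax": None},
]
committed = {1, 3, 7}

# An old snapshot is still running, so "alicia" may still be needed.
with_old_reader = [v["value"] for v in vacuum(chain, committed, horizon=5)]
# No old snapshots remain, so only the live version survives.
without_readers = [v["value"] for v in vacuum(chain, committed, horizon=8)]
print(with_old_reader, without_readers)  # prints: ['alicia', 'ali'] ['ali']
```

The first call also shows why a long-running transaction contributes to bloat: as long as its snapshot is alive, versions it might still need cannot be cleaned up.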

The most surprising aspect of MVCC is how it shifts the concurrency model from "locking everything" to "allowing concurrent reads by default." Queries see a consistent view of the data as of their snapshot instead of waiting for locks on the current, changing state, which is what lets Postgres sustain high transactional throughput without the traditional reader-writer bottleneck. This isolation is managed through transaction IDs and the visibility rules applied to row versions.

The next hurdle you’ll likely encounter is understanding transaction isolation levels beyond READ COMMITTED, particularly REPEATABLE READ and SERIALIZABLE, and how they interact with MVCC and potential anomalies like phantom reads.