UUIDs as primary keys are often touted as the "modern" choice, but they’re actually a step backward in many common PostgreSQL scenarios, primarily due to how they interact with B-tree indexes.
Let’s see a UUID primary key in action. Imagine a simple users table:
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
username VARCHAR(50) UNIQUE NOT NULL,
created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);
INSERT INTO users (username) VALUES ('alice');
INSERT INTO users (username) VALUES ('bob');
INSERT INTO users (username) VALUES ('charlie');
SELECT * FROM users WHERE id = 'a1b2c3d4-e5f6-7890-1234-567890abcdef';
When you run that SELECT query, PostgreSQL has to traverse the B-tree index on the id column. Unlike sequential BIGINTs, UUIDs are random. This means that each new UUID inserted into the table is likely to be placed in a completely different leaf node of the B-tree.
The core problem is index fragmentation and bloat. When you insert a UUID, it’s usually a random value. PostgreSQL’s B-tree index tries to keep things sorted. With BIGINTs, new values are appended to the end of the index, leading to a compact, well-ordered structure. With UUIDs, each new insertion might go into a different part of the index, scattering data. This leads to:
- Increased Index Size: Random writes cause more page splits in the B-tree, making the index larger than it needs to be for the same number of rows compared to a sequential
BIGINT. - Cache Inefficiency: When you query for a specific UUID, PostgreSQL might have to read more index pages from disk into memory because they are not clustered together logically. This is especially true for the default UUID v4, which is entirely random.
- Slower Inserts: B-tree page splits are computationally more expensive than simply appending a new record.
This doesn’t mean UUIDs are never good. They shine when you need to generate unique IDs outside of the database, in distributed systems where you can’t rely on a central sequence generator. For example, if your application generates an ID before inserting it into the database, a UUID is essential.
-- Application generates this UUID
INSERT INTO users (id, username) VALUES ('f0e1d2c3-b4a5-6789-0123-456789abcdef', 'dave');
The gen_random_uuid() function in PostgreSQL is a good choice for generating UUIDs if you must use them as primary keys, especially UUID v7 (if available via an extension) which aims to provide some sequentiality. However, for most single-database, OLTP workloads, BIGINT is still the king of primary keys.
The most surprising reason people often choose UUIDs is for perceived "scalability" or "future-proofing" against needing to merge databases, but they often underestimate the performance hit on typical operations, which can be a much larger bottleneck than the theoretical need to merge.
The next problem you’ll likely encounter is understanding how to efficiently query and index JSONB data in PostgreSQL.