The most surprising thing about Postgres’s jsonb is that it’s often faster than structured data for certain types of queries, even though it looks like a free-for-all.
Let’s see it in action. Imagine a table of user profiles, where some profile data is unstructured and needs to go into jsonb.
CREATE TABLE users (
id serial PRIMARY KEY,
name varchar(100),
profile jsonb
);
INSERT INTO users (name, profile) VALUES
('Alice', '{"email": "alice@example.com", "settings": {"theme": "dark", "notifications": true}}'),
('Bob', '{"email": "bob@example.com", "preferences": {"language": "en", "timezone": "UTC"}, "last_login": "2023-10-27T10:00:00Z"}'),
('Charlie', '{"email": "charlie@example.com", "settings": {"theme": "light", "notifications": false}, "admin": true}');
Now, let’s say we want to find all users with the "dark" theme. A naive query might look like this:
SELECT id, name
FROM users
WHERE profile -> 'settings' ->> 'theme' = 'dark';
Without an index, Postgres has to scan every row, parse the jsonb for each row, and check the value. This is where jsonb indexing shines.
Postgres offers GIN (Generalized Inverted Index) indexes for jsonb. A GIN index on the entire jsonb column is the most common and versatile:
CREATE INDEX idx_users_profile_gin ON users USING GIN (profile);
This index stores every key-value pair within the jsonb document. When you query for a specific key-value, Postgres can quickly look up the relevant entries in the GIN index without scanning the whole table or parsing every jsonb document. This is mechanically similar to how a full-text search index works, but for arbitrary JSON structures.
For queries that target specific paths within the jsonb document, you can create more specialized expression indexes. For example, to speed up queries specifically on the theme within settings:
CREATE INDEX idx_users_profile_theme ON users ((profile -> 'settings' ->> 'theme'));
This index only stores the values extracted at that specific path (profile -> 'settings' ->> 'theme'). It’s smaller and potentially faster for queries that only target this path, but less flexible than a full GIN index.
The real power comes from understanding how Postgres’s query planner uses these indexes. When you run EXPLAIN ANALYZE on a query involving jsonb, you’ll see if it’s using your GIN index (Bitmap Heap Scan with GIN Index Scan) or an expression index.
The most common mistake is assuming jsonb is inherently slow and avoiding it. In reality, for ad-hoc querying of semi-structured data, a well-indexed jsonb column can outperform traditional relational structures, especially when the query patterns are not known in advance or when dealing with deeply nested or varied JSON structures. The overhead of parsing and indexing is amortized over many queries, and the index lookup is extremely efficient.
If you’re frequently querying for the presence of a key, or for specific values at known paths, GIN indexes are your go-to. If you’re doing range queries or sorting on values within the jsonb, you might be better served by a different data model or by extracting those specific values into separate, indexed columns.
When querying nested boolean values, like profile -> 'settings' ->> 'notifications', Postgres will treat the extracted value as text. To perform a boolean comparison, you’ll need to cast it:
SELECT id, name
FROM users
WHERE (profile -> 'settings' ->> 'notifications')::boolean IS TRUE;
This might seem like a minor detail, but it’s critical for correct query logic and performance, as Postgres can’t efficiently use an index designed for text strings to look up boolean values unless you explicitly tell it how to convert.
The next hurdle people often encounter is understanding how to perform efficient updates and deletions on jsonb data, especially when only a small part of the document needs modification.