The most surprising thing about query execution plans is that they’re not just a report of what did happen, but a prediction of what will happen, and a set of instructions for how it should happen.
Let’s see this in action. Imagine a users table and an orders table, and we want to find all users who placed an order in the last 30 days.
SELECT u.name, u.email
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE o.order_date >= NOW() - INTERVAL '30 days';
Running EXPLAIN ANALYZE on this query gives us a detailed breakdown. Here’s a snippet (simplified for clarity):
"Hash Join (cost=7.00..100.50 rows=1000 width=50) (actual time=0.100..5.200 rows=500 loops=1)"
" Hash Cond: (o.user_id = u.id)"
" Buffers: shared hit=5000 read=1000"
" -> Seq Scan on orders o (cost=0.00..50.25 rows=2000 width=20) (actual time=0.050..1.500 rows=1500 loops=1)"
" Filter: (order_date >= NOW() - INTERVAL '30 days')"
" Buffers: shared hit=3000 read=500"
" -> Hash (cost=5.00..5.00 rows=100 width=30) (actual time=0.040..0.040 rows=80 loops=1)"
" Buckets: 1024 Batches: 1 Memory Usage: 25kB"
" Buffers: shared hit=1000 read=500"
" -> Seq Scan on users u (cost=0.00..5.00 rows=100 width=30) (actual time=0.020..0.030 rows=80 loops=1)"
" Buffers: shared hit=1000 read=500"
"Planning Time: 0.500 ms"
"Execution Time: 5.800 ms"
This plan tells us the database chose a Hash Join. It first scanned the orders table (a Seq Scan), filtering for recent orders. Then, it built a hash table in memory from the users table. Finally, it joined these two together. The cost is an estimated cost, while actual time and rows show what actually happened. Buffers indicate disk I/O.
The problem this solves is obvious: efficiently retrieving related data from multiple tables. The magic is how the database figures out the best way to do it. It considers many different join strategies (Hash Join, Merge Join, Nested Loop Join), access methods (Sequential Scan, Index Scan), and orderings. It estimates the cost of each, picks the cheapest, and then executes it.
The levers you control are primarily through:
- Indexing: Adding indexes to columns used in
JOINconditions (orders.user_id,users.id) andWHEREclauses (orders.order_date) can drastically change the plan from a slowSeq Scanto a fastIndex Scan. - Statistics: The query planner relies on statistics about your data (how many distinct values are in a column, their distribution). Outdated statistics can lead to terrible plan choices. Commands like
ANALYZE users;andANALYZE orders;(orVACUUM ANALYZE) update these. - Configuration: Parameters like
work_mem(how much memory a sort or hash operation can use) can influence whether a join is done in memory or spills to disk, impacting performance and thus the plan chosen. - Query Rewriting: Sometimes, the way you write the query itself can influence the plan. Explicitly choosing join types or simplifying complex subqueries can guide the planner.
The EXPLAIN ANALYZE output shows Buffers: shared hit=5000 read=1000. This means 5000 blocks of data were found in the database’s shared buffer cache (already in RAM), and 1000 blocks had to be read from disk. If read is high, it’s a strong indicator of I/O bottlenecks, and often points to missing indexes or inefficient table scans.
What most people miss is that the planner’s cost model is an estimate. It’s not perfect. Sometimes, a plan that looks more expensive on paper is faster in reality because the estimates for row counts or data distribution were off. This is where EXPLAIN ANALYZE becomes invaluable – it shows the actual performance, not just the predicted.
The next concept to explore is how to interpret different join types (Nested Loop, Merge, Hash) and when each is most appropriate.