Upgrading PostgreSQL major versions is more about data migration than in-place modification, and the pg_upgrade tool is your orchestrator.
Let’s see pg_upgrade in action. Imagine you’re moving from PostgreSQL 14 to 15. You’ve installed PostgreSQL 15, but your old data directory is still with the 14 installation. You need to bring that data along.
First, ensure both old and new PostgreSQL clusters are stopped.
sudo systemctl stop postgresql@14-main
sudo systemctl stop postgresql@15-main
Now, identify your data directories. For a typical Debian/Ubuntu installation, these are usually /var/lib/postgresql/14/main and /var/lib/postgresql/15/main.
The core command is pg_upgrade. You’ll need to specify the binary paths for both old and new versions, and the data directory paths.
sudo -u postgres pg_upgrade \
--old-bindir=/usr/lib/postgresql/14/bin \
--new-bindir=/usr/lib/postgresql/15/bin \
--old-datadir=/var/lib/postgresql/14/main \
--new-datadir=/var/lib/postgresql/15/main \
--check
Running with --check first is crucial. It simulates the upgrade and reports potential issues without actually touching your data. If this comes back clean, you can proceed with the actual upgrade.
sudo -u postgres pg_upgrade \
--old-bindir=/usr/lib/postgresql/14/bin \
--new-bindir=/usr/lib/postgresql/15/bin \
--old-datadir=/var/lib/postgresql/14/main \
--new-datadir=/var/lib/postgresql/15/main
pg_upgrade works by creating new cluster files in the new data directory that point to the existing data files of the old cluster. It doesn’t copy the data itself. It essentially rewrites the control files and catalog entries to be compatible with the new major version. This is why it’s so much faster than a dump/restore.
After a successful upgrade, your old data directory is left untouched, which is a great safety net. The new data directory contains the upgraded cluster. You’ll then start the new PostgreSQL 15 service.
sudo systemctl start postgresql@15-main
And verify your databases are accessible.
sudo -u postgres psql -c "SELECT version();"
The upgrade process requires that the wal_level in your old cluster was at least minimal. If it was archive or hot_standby, pg_upgrade will still work, but it might be slower as it needs to process WAL files. The wal_level for the new cluster will be set appropriately by pg_upgrade.
One common pitfall is when extensions are not installed in the new PostgreSQL version. pg_upgrade will warn you about this. You need to manually install any extensions that were present in your old cluster into your new one before running pg_upgrade. This ensures that the catalog entries for those extensions are present in the new cluster.
Another point to consider is the data_checksums setting. If your old cluster had data checksums enabled, the new cluster will also have them enabled. If they were disabled, they will remain disabled. This is a setting that is preserved.
The most surprising thing about pg_upgrade is how it achieves its speed: it doesn’t copy your data. Instead, it performs a "hard link" or "copy on write" operation on the data files and then modifies the control files and system catalogs to reflect the new PostgreSQL version. This means the actual data blocks on disk remain in their original location until they are modified by the new PostgreSQL version.
After the upgrade, you’ll want to run ANALYZE on all your databases to update statistics for the query planner.
sudo -u postgres vacuumdb --all --analyze-in-stages
The next thing you’ll likely encounter is the need to update your application’s connection strings if you’re running PostgreSQL on a different port or hostname, or if you were relying on specific old behavior that has been deprecated.