GPG Encryption: Beyond Basic Symmetric Keys

Postgres data encryption using GPG isn’t about encrypting data within the database itself, but rather encrypting the files that Postgres uses for its data storage.

Let’s see how this looks in practice. Imagine you have a Postgres data directory, say /var/lib/postgresql/14/main/. You want to encrypt the contents of this directory so that even if someone gains filesystem access, they can’t read the actual data files (like base/12345/12346).

First, you’d need GPG installed and a key pair generated.

# Generate a new GPG key
gpg --full-generate-key

Once you have your key, you can encrypt a file.

# Encrypt a single data file
echo "some data" > testfile.txt
gpg --encrypt --recipient "Your Name" --output testfile.txt.gpg testfile.txt
cat testfile.txt.gpg # See the encrypted binary output
rm testfile.txt # Remove the original unencrypted file

To decrypt it:

gpg --decrypt --output testfile.txt testfile.txt.gpg
cat testfile.txt # See the original data again

The problem this solves is protecting sensitive data at rest when the underlying storage might be compromised. If your server’s hard drive is stolen or an attacker gains root access and can read the filesystem, encrypted files remain unreadable without the GPG private key.

Internally, GPG uses symmetric encryption (like AES) with a randomly generated session key, and then encrypts that session key with your public key. This is efficient for large files.

The primary lever you control is the GPG key you use for encryption. You’ll need to ensure the private key corresponding to the public key used for encryption is available on the server where Postgres is running, but only when Postgres needs to access the data. This typically means decrypting the files on-the-fly or during a backup/restore process.

The way Postgres interacts with these encrypted files is indirect. Postgres itself doesn’t know or care that the files are encrypted. It just reads and writes blocks of data from the filesystem. The encryption and decryption happen before Postgres tries to read a block or after it writes one, managed by a separate process or script.

For example, you might mount an encrypted filesystem (using dm-crypt or ecryptfs) where your Postgres data directory resides. Or, you could use a backup strategy where backups are encrypted with GPG. A more advanced setup might involve a FUSE (Filesystem in Userspace) driver that encrypts/decrypts on the fly, but this adds significant complexity and potential performance overhead.

A common misunderstanding is that you can simply encrypt the entire data directory with gpg -r <key> -o data.tar.gz.gpg data.tar.gz. While this works for backups, it’s not a practical way to keep a live database running because Postgres needs to read and write individual files constantly. The encryption/decryption would have to happen at a block level, which is what dedicated encrypted filesystem solutions provide.

The most surprising true thing about this approach is that Postgres itself remains completely unaware of the encryption. It’s like wrapping your house in a Faraday cage; the house’s internal plumbing and electrical systems don’t change, they just can’t be affected by external electromagnetic fields.

The next hurdle is managing the lifecycle of the GPG private key, especially in relation to database availability and disaster recovery.