The most surprising thing about Pi-hole high availability is that it often looks exactly like a single Pi-hole instance to the clients, masking a complex dance of synchronization and failover behind the scenes.
Let’s see it in action. Imagine this setup: two identical Raspberry Pis, each running Pi-hole. They’re connected to the same network, and both of their IP addresses are handed out as DNS servers in your router’s DHCP settings.
# On Pi-hole 1 (e.g., 192.168.1.10)
sudo pihole status
# On Pi-hole 2 (e.g., 192.168.1.11)
sudo pihole status
You’d expect clients to pick one or the other randomly, or based on which one responds first. But the magic happens when you introduce a Virtual IP (VIP) address. This VIP is the single IP address your clients query. Only one Pi-hole instance "owns" this VIP at any given time.
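To see which node currently owns the VIP, one option is to look for the address on the interface. The following is a minimal sketch, assuming the eth0 interface and the 192.168.1.50 VIP used in this example. The helper name has_vip is hypothetical, and it only inspects text in the format printed by ip -o -4 addr show.

```shell
#!/bin/sh
# has_vip VIP: succeed if the `ip -o -4 addr show` output on stdin lists VIP.
# (Hypothetical helper; it only parses text, so it also works on captured output.)
has_vip() {
    # The trailing slash keeps 192.168.1.5 from matching 192.168.1.50.
    grep -qF "inet ${1}/"
}

# Typical use on either node (interface and VIP are this article's examples):
#   ip -o -4 addr show dev eth0 | has_vip 192.168.1.50 && echo "this node holds the VIP"
```

Running the commented command on both nodes should report the VIP on exactly one of them at any time.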
Here’s a simplified keepalived.conf on both nodes. This is the heart of the failover mechanism.
On Pi-hole 1 (/etc/keepalived/keepalived.conf):
vrrp_script chk_pihole {
# Fail (exit 1) when the status output contains "not enabled", succeed otherwise.
script "/usr/local/bin/pihole status | grep -q 'not enabled' && exit 1 || exit 0"
interval 2
weight 20
fall 2
rise 2
}
vrrp_instance VI_1 {
state BACKUP
interface eth0
virtual_router_id 51
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass mysecretpassword
}
virtual_ipaddress {
192.168.1.50/24 dev eth0
}
track_script {
chk_pihole
}
}
On Pi-hole 2 (/etc/keepalived/keepalived.conf):
vrrp_script chk_pihole {
# Fail (exit 1) when the status output contains "not enabled", succeed otherwise.
script "/usr/local/bin/pihole status | grep -q 'not enabled' && exit 1 || exit 0"
interval 2
weight 20
fall 2
rise 2
}
vrrp_instance VI_1 {
state BACKUP
interface eth0
virtual_router_id 51
priority 90
advert_int 1
authentication {
auth_type PASS
auth_pass mysecretpassword
}
virtual_ipaddress {
192.168.1.50/24 dev eth0
}
track_script {
chk_pihole
}
}
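Notice the priority values: Pi-hole 1 has 100, Pi-hole 2 has 90. Both nodes start in state BACKUP and hold a VRRP election, and the node advertising the higher priority claims the VIP (192.168.1.50). If Pi-hole 1 goes down entirely, keepalived on Pi-hole 2 stops receiving its VRRP advertisements and takes over the VIP. If only the health-check script on Pi-hole 1 fails, Pi-hole 1 advertises a lower priority and Pi-hole 2 wins the next election.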
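How long the backup waits before declaring the master dead follows from the VRRP timers. As a rough sketch, assuming keepalived's default VRRP version 2 and the formula from RFC 3768 (the function name master_down_interval is hypothetical):

```shell
#!/bin/sh
# master_down_interval ADVERT_INT PRIORITY
# VRRPv2 (RFC 3768): Master_Down_Interval = 3 * Advertisement_Interval + Skew_Time,
# where Skew_Time = (256 - Priority) / 256 seconds.
master_down_interval() {
    awk -v adv="$1" -v prio="$2" \
        'BEGIN { printf "%.3f\n", 3 * adv + (256 - prio) / 256 }'
}

master_down_interval 1 90    # Pi-hole 2 at its configured priority: prints 3.648
master_down_interval 1 110   # Pi-hole 2 with the +20 track_script bonus: prints 3.570
```

With advert_int 1, a hard failure of Pi-hole 1 should therefore be noticed after roughly three and a half seconds.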
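The chk_pihole script is crucial. It checks that Pi-hole is enabled (pihole status should not output "not enabled"), and it must exit 0 when healthy and non-zero otherwise. The weight parameter in vrrp_script adds to the instance’s priority while the script succeeds, so with weight 20 a healthy Pi-hole 1 advertises 120 and a healthy Pi-hole 2 advertises 110. If the script fails on the active node (fall 2 means two consecutive failures), that node loses the bonus and drops back to 100, which is below 110, so it relinquishes the VIP to Pi-hole 2.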
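The priority arithmetic can be modelled in a few lines. This is only an illustrative sketch of the documented weight behaviour, not keepalived's own code, and the function name effective_priority is hypothetical:

```shell
#!/bin/sh
# effective_priority BASE WEIGHT SCRIPT_OK
# Models how a tracked script's weight shifts a node's priority:
#   positive weight: added while the script succeeds (SCRIPT_OK=1)
#   negative weight: added (lowering the priority) while the script fails (SCRIPT_OK=0)
effective_priority() {
    base=$1
    weight=$2
    ok=$3
    if [ "$weight" -gt 0 ] && [ "$ok" -eq 1 ]; then
        echo $((base + weight))
    elif [ "$weight" -lt 0 ] && [ "$ok" -eq 0 ]; then
        echo $((base + weight))
    else
        echo "$base"
    fi
}

effective_priority 100 20 1   # Pi-hole 1, check passing: 120
effective_priority 90 20 1    # Pi-hole 2, check passing: 110
effective_priority 100 20 0   # Pi-hole 1, check failing: 100
```

The design point is that the weight must exceed the gap between the two base priorities (here 20 > 10). Otherwise a failing check on the primary would never drop it below the secondary.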
This setup removes the single point of failure. If your primary DNS server (your Pi-hole) dies, clients keep resolving names, because within a few seconds the redundant Pi-hole takes over the IP address they are configured to use.
But it’s not just about failover. To truly be redundant, the data needs to be synchronized. Pi-hole’s gravity database (lists of domains to block) and its query logs should ideally be consistent. You can achieve this using rsync or a shared filesystem like NFS, though SQLite databases such as gravity.db are known to be fragile on network filesystems. A common approach is to have one Pi-hole as the primary for updates and use rsync to copy gravity and logs to the secondary.
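# Example rsync commands, run periodically on the primary Pi-hole (e.g. from cron).
# The remote user needs write access to the target paths.
rsync -avz /etc/pihole/gravity.db user@<secondary_pihole_ip>:/etc/pihole/gravity.db
# The log path varies by Pi-hole version (newer releases may use /var/log/pihole/pihole.log).
rsync -avz /var/log/pihole.log user@<secondary_pihole_ip>:/var/log/pihole.log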
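This means that when the secondary Pi-hole takes over, its blocklists are as fresh as the last sync. Because gravity.db is a SQLite database, copy it while no gravity update is running, and have the secondary reload its lists after each copy so that the new database is actually in use.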
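A somewhat more careful variant takes a consistent snapshot before copying. The sketch below makes several assumptions: the sqlite3 CLI is installed on the primary, the secondary is reachable over SSH as user@192.168.1.11, that user may write to /etc/pihole, and the secondary runs Pi-hole v5, where pihole restartdns reload-lists reloads the lists (other versions may use a different command). The names run, sync_gravity, and the snapshot path are hypothetical.

```shell
#!/bin/sh
# Hypothetical sync helper. Host, paths, and the reload command are assumptions.
SECONDARY="${SECONDARY:-user@192.168.1.11}"
GRAVITY_DB="${GRAVITY_DB:-/etc/pihole/gravity.db}"
SNAPSHOT="${SNAPSHOT:-/tmp/gravity.snapshot.db}"

# run: execute a command, or only print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

sync_gravity() {
    # 1. Snapshot the database with SQLite's backup command so the copy is
    #    consistent even if something writes to gravity.db meanwhile.
    run sqlite3 "$GRAVITY_DB" ".backup '$SNAPSHOT'" || return 1
    # 2. Copy the snapshot over the secondary's gravity.db.
    run rsync -az "$SNAPSHOT" "$SECONDARY:$GRAVITY_DB" || return 1
    # 3. Ask the secondary to reload its lists (Pi-hole v5 syntax, an assumption).
    run ssh "$SECONDARY" "pihole restartdns reload-lists" || return 1
}
```

Calling sync_gravity with DRY_RUN=1 set prints the three commands without executing them, which is a cheap way to check the paths before putting the function into cron.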
The one thing that often trips people up is the state of the query logs. While gravity synchronization is common, keeping query logs perfectly in sync in real-time is more complex and often not strictly necessary for basic redundancy. If the primary fails, the secondary might have slightly older log data until the next sync. For pure data consistency, you’d look into more advanced database replication or a shared log storage.
The next hurdle you’ll likely face is managing configuration changes across both nodes. How do you update Pi-hole itself, or change settings like upstream DNS servers, without causing inconsistencies or requiring manual intervention on both machines?