The Redis master is down and a failover is in progress: Sentinel detected that the master instance was no longer reachable and started promoting a replica to become the new master.

Cause 1: Master Instance Actually Crashed or Became Unresponsive

Diagnosis: Check the Redis master’s process status. On the master server, run:

ps aux | grep '[r]edis-server'

If the process is not running, it has crashed. Check Redis logs for crash reasons:

tail -n 100 /var/log/redis/redis-server.log

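Look for OOM (Out Of Memory) errors, segmentation faults, or disk full messages. If the kernel OOM killer terminated the process, the Redis log may contain nothing useful; check the kernel log with dmesg -T | grep -i 'killed process' instead.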
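The log check above can be sketched as a small helper. The function name scan_redis_log is hypothetical, and the patterns are assumptions based on messages Redis and the OS commonly emit; adjust them to what your logs actually contain.

```shell
# Hypothetical helper: print (with line numbers) the log lines that hint at
# why redis-server died. The pattern list is an assumption, not exhaustive.
scan_redis_log() {
  grep -Ein 'out of memory|crashed by signal|segmentation fault|no space left on device' "$1"
}
```

For example, scan_redis_log /var/log/redis/redis-server.log prints the matching lines and returns a non-zero status when nothing matches.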

Fix: If the instance crashed due to OOM, you need to increase Redis’s memory limit or reduce its memory usage.

  1. Increase maxmemory (if set): Edit redis.conf and set a higher maxmemory value. For example:
    maxmemory 8gb
    
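    Then restart Redis: run redis-cli shutdown if the old process is still up, then start it again with redis-server /etc/redis/redis.conf (or through your service manager, if Redis runs as a service).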
  2. Increase System Memory: If maxmemory was not set or is already high, the system might be out of RAM. Increase the server’s RAM or tune OS-level OOM killer settings (though this is a last resort).
  3. Address Disk Full: If logs indicate disk full, free up space or expand the disk.

Why it works: Redis needs memory to operate. If it runs out, it can crash or become unresponsive, triggering Sentinel. By providing more memory or freeing up existing space, Redis can function correctly.
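To judge how close an instance is to its limit before it fails again, the used_memory and maxmemory fields of INFO memory can be compared. This is a minimal sketch; the helper name mem_usage_pct is hypothetical, and it only reads text on stdin, so it never touches the server itself.

```shell
# Hypothetical helper: read `redis-cli INFO memory` output on stdin and print
# used memory as a whole-number percentage of maxmemory, or "unlimited" when
# maxmemory is 0 (no limit configured).
mem_usage_pct() {
  tr -d '\r' | awk -F: '
    $1 == "used_memory" { used = $2 }
    $1 == "maxmemory"   { max = $2 }
    END {
      if (max + 0 == 0) print "unlimited"
      else printf "%d\n", used * 100 / max
    }'
}
```

A typical call would be redis-cli -h <redis_master_ip> -p 6379 info memory | mem_usage_pct. The tr step strips the carriage returns that INFO output carries at the end of each line.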

Cause 2: Network Partition Between Sentinel and Master

Diagnosis: From a Sentinel machine, try to ping the master’s IP address directly.

ping <redis_master_ip>

If ping fails, or if redis-cli -h <redis_master_ip> -p 6379 ping returns Could not connect to Redis, there’s a network issue. Check firewall rules on both the master and Sentinel servers, and any network devices in between. On the master server, check its network interface status:

ip addr show

And check routes:

ip route show

Fix:

  1. Firewall Rules: Ensure that port 6379 (or your Redis port) is open for TCP traffic from the Sentinel(s) to the master, and that the Sentinel port (26379 by default) is open between the Sentinel hosts. On ufw (Ubuntu/Debian):
    sudo ufw allow from <sentinel_ip> to any port 6379 proto tcp   # Run on the master
    sudo ufw allow from <sentinel_ip> to any port 26379 proto tcp  # Run on each other Sentinel host
    
    On firewalld (CentOS/RHEL):
    sudo firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="<sentinel_ip>/32" port port="6379" protocol="tcp" accept'
    sudo firewall-cmd --reload
    
  2. Routing/Network Configuration: Correct any misconfigured network interfaces, routes, or DNS issues that prevent communication.

Why it works: Sentinels monitor the master by sending PING commands over the network. If network connectivity is lost, Sentinels incorrectly assume the master is down. Restoring network path or opening firewall ports allows Sentinels to communicate with the master.
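Because ICMP can be blocked while TCP is allowed (and the reverse), a ping result alone is not conclusive. The sketch below tests the TCP path directly without needing redis-cli on the machine. The helper name check_tcp and the 2-second limit are assumptions; it relies on bash's /dev/tcp pseudo-device and the coreutils timeout command being available.

```shell
# Hypothetical helper: succeed if a TCP connection to host ($1) and port ($2)
# can be opened within 2 seconds. Host and port are passed to the inner bash
# as positional arguments ($0 and $1) rather than interpolated into the script.
check_tcp() {
  timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}
```

Run from a Sentinel host, check_tcp <redis_master_ip> 6379 && echo reachable || echo blocked separates a closed or filtered port from a host that is down entirely when combined with the ping result.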

Cause 3: Master Instance is Overloaded and Not Responding to PINGs

Diagnosis: If the master is still running but unresponsive, check its CPU and network load. On the master server:

top -b -n 1 -c -i

Look for a redis-server process consuming close to 100% of one CPU core (Redis executes commands on a single thread). Also, check network traffic:

sar -n DEV 1 5

If Redis is consistently maxing out CPU or network bandwidth, it might not be able to respond to Sentinel’s PING commands in time.

Fix:

  1. Optimize Redis Workload: Identify slow commands or excessive traffic patterns. Use redis-cli -h <redis_master_ip> -p 6379 --latency to check latency, and redis-cli -h <redis_master_ip> -p 6379 slowlog get 10 to list the slowest recent commands.
  2. Scale Up/Out: Increase the master’s resources (CPU, RAM, network) or offload read traffic to replicas if applicable.
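  3. Adjust Sentinel Timeout: Increase down-after-milliseconds in the Sentinel configuration if the overload is temporary and acceptable. In sentinel.conf:
    sentinel down-after-milliseconds mymaster 10000
    
    This tells Sentinel to consider the master down only after it hasn’t responded for 10 seconds. The built-in default is 30000 ms, so this example only helps if your configuration currently sets a lower value. Sentinel does not re-read its configuration file while running: either restart the Sentinel process after editing, or apply the change at runtime on every Sentinel with redis-cli -p 26379 sentinel set mymaster down-after-milliseconds 10000.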

Why it works: High load can prevent Redis from answering Sentinel’s PING requests within the configured timeout. Reducing the load, increasing the master’s capacity, or giving Sentinel more time to wait makes the master responsive again from Sentinel’s perspective.
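To confirm which timeout a Sentinel is actually using, the reply of SENTINEL MASTER can be inspected. When its output is piped, redis-cli prints that reply as alternating field and value lines; the hypothetical helper below (sentinel_field is not part of Redis) picks one value out of that stream.

```shell
# Hypothetical helper: read the flat field/value reply of
# `redis-cli SENTINEL MASTER <name>` on stdin (odd lines are field names,
# even lines are values) and print the value of the field named in $1.
sentinel_field() {
  awk -v want="$1" '
    NR % 2 == 1 { key = $0; next }
    key == want { print; exit }'
}
```

For example, redis-cli -h <sentinel_ip> -p 26379 sentinel master mymaster | sentinel_field down-after-milliseconds prints the value in effect on that Sentinel. Each Sentinel keeps its own setting, so the check is worth repeating on all of them.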

Cause 4: Sentinel Configuration Errors (Incorrect Master IP/Port, Wrong Quorum)

Diagnosis: Examine the Sentinel configuration file (sentinel.conf on each Sentinel instance). Look for the sentinel monitor <master-name> <ip> <port> <quorum> directive. Ensure the <ip> and <port> correctly point to the current master’s address. Check the <quorum> value. If it’s too high for the number of active Sentinels, they might not be able to agree on a failover.

Fix:

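  1. Correct Master Details: Update sentinel.conf with the correct IP and port for the master.
    sentinel monitor mymaster 192.168.1.100 6379 2
    
    Sentinel rewrites sentinel.conf while it runs and does not re-read it, so stop the Sentinel before editing the file and start it again afterwards. Alternatively, re-register the master at runtime on each Sentinel:
    redis-cli -h <sentinel_ip> -p 26379 sentinel remove mymaster
    redis-cli -h <sentinel_ip> -p 26379 sentinel monitor mymaster 192.168.1.100 6379 2
    
  2. Adjust Quorum: Ensure quorum is less than or equal to the number of Sentinels you have running. If you have 3 Sentinels, a quorum of 2 is usually appropriate. The quorum only governs failure detection; the failover itself must still be authorized by a majority of all known Sentinels.
    sentinel monitor mymaster 192.168.1.100 6379 2
    
    Apply the new quorum at runtime on each Sentinel with redis-cli -h <sentinel_ip> -p 26379 sentinel set mymaster quorum 2, or restart Sentinel after editing the file.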

Why it works: Sentinels need accurate information to monitor the correct master and agree on its state. Incorrect IP/port means they’re looking at the wrong target, and an incorrect quorum can prevent them from reaching consensus on a failover, even if the master is truly down.
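The quorum arithmetic can be sketched as follows: the quorum is the number of Sentinels that must agree the master is down, while the failover itself needs a majority of all Sentinels. The helper quorum_check is hypothetical and only does the arithmetic; on a live deployment, redis-cli -p 26379 sentinel ckquorum mymaster asks Sentinel to verify this against the Sentinels it can actually reach.

```shell
# Hypothetical helper: sanity-check a quorum ($2) against the number of
# Sentinels ($1). The body runs in a subshell so its variables stay local.
quorum_check() (
  sentinels="$1"
  quorum="$2"
  majority=$(( sentinels / 2 + 1 ))   # Sentinels needed to authorize a failover
  if [ "$quorum" -gt "$sentinels" ]; then
    echo "unreachable: quorum $quorum > $sentinels sentinels"
  else
    echo "ok: quorum $quorum, failover majority $majority of $sentinels"
  fi
)
```

With 3 Sentinels and a quorum of 2, both detection and failover need 2 Sentinels; with 5 Sentinels and a quorum of 2, detection needs 2 but the failover still needs 3.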

Cause 5: Master Redis Version Incompatibility with Sentinel Version

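Diagnosis: Check the Redis and Sentinel versions. Note that redis-cli --version reports only the version of the client binary, so query the running servers instead. On the master:

redis-cli -h <redis_master_ip> -p 6379 info server | grep redis_version

On the Sentinel:

redis-cli -h <sentinel_ip> -p 26379 info server | grep redis_version

While Redis Sentinel is generally robust across versions, very old versions of Sentinel might have issues communicating with very new Redis masters, or vice-versa, especially if there were significant protocol changes.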

Fix: Upgrade both Redis master and Sentinel instances to the same, recent stable version.

  1. Upgrade Redis Master: Follow standard Redis upgrade procedures, typically involving downloading the new version, stopping the old instance, configuring the new one, and starting it.
  2. Upgrade Sentinel: Similarly, update the Sentinel binary and configuration, then restart the Sentinel process.

Why it works: Ensures a consistent communication protocol between Redis instances and their monitors, preventing subtle bugs or misunderstandings that could lead to false positives or failed failovers.
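To compare versions across several hosts, the redis_version field of INFO server can be extracted on its own. A minimal sketch; the helper name redis_version is hypothetical, and it only parses text on stdin.

```shell
# Hypothetical helper: read `redis-cli INFO server` output on stdin and print
# only the server version. INFO lines end in CRLF, hence the tr step.
redis_version() {
  tr -d '\r' | awk -F: '$1 == "redis_version" { print $2; exit }'
}
```

Running redis-cli -h <redis_master_ip> -p 6379 info server | redis_version and redis-cli -h <sentinel_ip> -p 26379 info server | redis_version side by side makes a mismatch easy to spot.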

Cause 6: DNS Resolution Issues for Master Hostname

Diagnosis: If your Sentinel configuration uses a hostname for the master instead of an IP address (hostname support is optional and, from Redis 6.2, is enabled with sentinel resolve-hostnames yes):

sentinel monitor mymaster redis.example.com 6379 2

On the Sentinel machine, try resolving the hostname:

nslookup redis.example.com
dig redis.example.com

If these fail, or if they return an incorrect IP address, DNS is the problem.

Fix:

  1. Correct DNS Records: Update the DNS record for the master hostname to point to the correct IP address.
  2. Check Sentinel’s DNS Server: Ensure the Sentinel machine is configured to use a reliable DNS server. Check /etc/resolv.conf on the Sentinel machine.
  3. Use IP Address: As a workaround, change sentinel.conf to use the master’s IP address directly, then restart Sentinel so it picks up the change.

Why it works: Sentinels rely on DNS to find the master. If the hostname doesn’t resolve to the correct IP, the Sentinel cannot reach the master it’s supposed to be monitoring.
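nslookup and dig query DNS servers directly, whereas a process such as Sentinel resolves names through the system resolver, which also consults /etc/hosts. The sketch below checks resolution through that path. Both helper names are hypothetical, and getent ahostsv4 is an assumption that holds on glibc-based Linux systems.

```shell
# Hypothetical helper: succeed if the IP in $1 appears as a whole line in the
# list of addresses on stdin (-x and -F prevent partial or regex matches).
ip_in_list() {
  grep -qxF "$1"
}

# Hypothetical helper: succeed if hostname $1 resolves to IPv4 address $2
# through the system resolver.
resolves_to() {
  getent ahostsv4 "$1" | awk '{ print $1 }' | sort -u | ip_in_list "$2"
}
```

For example, resolves_to redis.example.com 192.168.1.100 || echo "DNS mismatch" flags a hostname that no longer points at the expected master address.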

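Once the underlying cause is fixed and the instance is reachable again, the Sentinel log should record -sdown master <master-name> <ip> <port>. If a failover completed in the meantime, look for +switch-master <master-name> <old-ip> <old-port> <new-ip> <new-port>; the old master is then reconfigured as a replica of the new one when it comes back.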
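To follow these state changes, the Sentinel log can be filtered for the relevant events. A minimal sketch; the helper name sentinel_events is hypothetical, and the log path depends on the logfile setting in your sentinel.conf.

```shell
# Hypothetical helper: print Sentinel log lines for down-state changes
# (+sdown, -sdown, +odown, -odown), failover attempts, and master switches.
sentinel_events() {
  grep -E '[+-](sdown|odown) |\+try-failover |\+switch-master ' "$1"
}
```

Pass the path of the Sentinel log file as the only argument. A +sdown entry with no matching -sdown means that Sentinel still considers the instance unreachable.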
