A Redis Pub/Sub "queue full" condition means that messages for a channel are being published faster than its subscribers consume them. Redis does not keep a queue per channel: Pub/Sub delivery is fire-and-forget, and the only buffering happens in each subscriber connection’s output buffer on the server.
This typically happens when a subscriber hangs, becomes unresponsive, or simply can’t keep up with the message volume, so the Redis server buffers a growing backlog for that subscriber’s connection. The backlog grows until it hits the `client-output-buffer-limit` configured for the `pubsub` client class, at which point Redis closes the subscriber’s connection and the buffered messages are lost, or until the server comes under memory pressure.
Here are the common causes and how to address them:
- Subscriber Unresponsiveness/Crash:
  - Diagnosis: Check your subscriber application logs for errors, exceptions, or signs of a deadlock. Use `redis-cli` to check the number of active subscribers for the channel with `redis-cli PUBSUB NUMSUB <channel>`, and list the channels that currently have subscribers with `redis-cli PUBSUB CHANNELS`. (`PSUBSCRIBE` only subscribes to a pattern; it does not report subscriber counts.) Look for your specific channel and see if the subscriber count is zero or lower than expected. Pub/Sub has no acknowledgments, so to spot a subscriber that is connected but not reading, check the `omem` (output buffer memory) field in `redis-cli CLIENT LIST`. For a general overview, `redis-cli INFO stats` reports `pubsub_channels` and `pubsub_patterns`.
  - Fix: Restart the subscriber application or fix the bug that stalls it. If it’s a performance issue, optimize the subscriber’s message processing logic.
  - Why it works: A healthy subscriber continuously reads messages off its connection, so its output buffer on the server stays small. When a subscriber unsubscribes or its connection closes, Redis stops buffering messages for it and frees that buffer.
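One way to spot a subscriber that is connected but not reading is the `omem` field that `CLIENT LIST` reports for each connection, alongside the `sub` and `psub` subscription counts. The following is a minimal sketch, not a definitive tool: the function names, the threshold, and the sample text are assumptions made for illustration, and the sample lines are shortened (real output carries more fields per client).

```python
def parse_client_list(raw: str) -> list[dict[str, str]]:
    """Parse CLIENT LIST text: one client per line, space-separated key=value fields."""
    clients = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        fields = {}
        for token in line.split(" "):
            key, sep, value = token.partition("=")
            if sep:
                fields[key] = value
        clients.append(fields)
    return clients


def lagging_subscribers(raw: str, omem_threshold: int) -> list[dict[str, str]]:
    """Return clients that hold subscriptions and buffer at least omem_threshold bytes."""
    result = []
    for client in parse_client_list(raw):
        subscriptions = int(client.get("sub", "0")) + int(client.get("psub", "0"))
        if subscriptions > 0 and int(client.get("omem", "0")) >= omem_threshold:
            result.append(client)
    return result


# Made-up sample in the CLIENT LIST format (hypothetical addresses and names).
SAMPLE_CLIENT_LIST = (
    "id=7 addr=10.0.0.5:51234 name=worker-a sub=1 psub=0 omem=0 cmd=subscribe\n"
    "id=9 addr=10.0.0.6:40022 name=worker-b sub=2 psub=0 omem=7340032 cmd=subscribe\n"
    "id=11 addr=10.0.0.7:33800 name=api sub=0 psub=0 omem=0 cmd=get\n"
)

for client in lagging_subscribers(SAMPLE_CLIENT_LIST, omem_threshold=1_048_576):
    print(client["id"], client["addr"], client["omem"])
```

In practice the input would be the text returned by `redis-cli CLIENT LIST`, and the 1 MiB threshold would be tuned to your traffic.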
- Subscriber Logic Overload/Inefficiency:
  - Diagnosis: Even if the subscriber is running, it might be too slow to process messages. Monitor the subscriber’s CPU and memory usage. If its message processing loop is complex or performs blocking I/O without proper concurrency, it can fall behind.
  - Fix: Optimize the subscriber’s message handling code. Implement batching if possible, use asynchronous processing (e.g., with `asyncio` in Python or goroutines in Go), or scale out your subscriber instances. Note that every subscriber of a channel receives every message, so scaling out only helps if the work is partitioned, for example by giving each instance its own subset of channels.
  - Why it works: By processing messages faster or in parallel, the subscriber drains its buffer on the Redis server more quickly, so the backlog stops growing.
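The asynchronous-processing fix can be sketched with `asyncio`: the receive loop only hands messages to a bounded local queue, and a pool of worker tasks does the slow work concurrently. This is a sketch under stated assumptions, not a drop-in subscriber: `handle`, `worker`, and `consume` are hypothetical names, the sleep stands in for real I/O, and a plain list of strings stands in for the Pub/Sub connection so the example runs without a Redis server.

```python
import asyncio


async def handle(message: str, results: list[str]) -> None:
    """Hypothetical handler: stands in for slow, I/O-bound processing."""
    await asyncio.sleep(0.01)  # simulated I/O wait
    results.append(message.upper())


async def worker(queue: asyncio.Queue, results: list[str]) -> None:
    """Pull messages off the local queue and process them until cancelled."""
    while True:
        message = await queue.get()
        try:
            await handle(message, results)
        finally:
            queue.task_done()


async def consume(messages: list[str], concurrency: int = 8) -> list[str]:
    """Feed messages into a bounded queue drained by `concurrency` workers."""
    queue: asyncio.Queue = asyncio.Queue(maxsize=100)
    results: list[str] = []
    workers = [asyncio.create_task(worker(queue, results)) for _ in range(concurrency)]

    # In a real subscriber this loop would read from the Pub/Sub connection.
    for message in messages:
        await queue.put(message)  # waits when the local queue is full

    await queue.join()  # wait until every queued message is processed
    for task in workers:
        task.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return results


if __name__ == "__main__":
    processed = asyncio.run(consume([f"msg-{i}" for i in range(50)]))
    print(len(processed))
```

The bounded queue is a deliberate choice: when workers fall behind, `queue.put` waits, which in a real subscriber means it stops reading from the socket and the backlog moves back to the Redis server. Results are not guaranteed to arrive in publish order, so this pattern suits handlers that are order-independent.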
- Redis Memory Exhaustion:
  - Diagnosis: The Redis server itself might be running out of memory, with Pub/Sub buffers consuming a significant portion of it. Check `redis-cli INFO memory` for `used_memory` and `maxmemory`. Also, look at the `mem_fragmentation_ratio`.
  - Fix: Increase `maxmemory` in `redis.conf` if you have available RAM, or free up memory by removing unnecessary keys or optimizing data structures. If memory is genuinely exhausted, you might need to upgrade your Redis instance or server.
  - Why it works: Redis needs memory to operate, including buffering Pub/Sub messages. If `maxmemory` is reached, Redis starts evicting keys based on its `maxmemory-policy`. If Pub/Sub buffers are part of the memory pressure, increasing `maxmemory` gives Redis more room.
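The memory check can be scripted by parsing the text that `redis-cli INFO memory` prints, which consists of `key:value` lines under `# Section` headers. The sketch below assumes that format; the function names and the sample values are made up for illustration. A `maxmemory` of `0` means no limit is configured, so no ratio is reported in that case.

```python
def parse_info(raw: str) -> dict[str, str]:
    """Parse INFO text: 'key:value' lines, skipping blank and '# Section' lines."""
    info = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            info[key] = value
    return info


def memory_usage_ratio(raw: str):
    """Return used_memory / maxmemory, or None when maxmemory is 0 (no limit)."""
    info = parse_info(raw)
    used = int(info["used_memory"])
    limit = int(info["maxmemory"])
    return None if limit == 0 else used / limit


# Made-up sample: 768 MiB used out of a 1 GiB limit.
SAMPLE_INFO = (
    "# Memory\r\n"
    "used_memory:805306368\r\n"
    "maxmemory:1073741824\r\n"
    "mem_fragmentation_ratio:1.12\r\n"
)

ratio = memory_usage_ratio(SAMPLE_INFO)
print(f"{ratio:.0%} of maxmemory in use")
```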
- `maxmemory-policy` Configuration:
  - Diagnosis: If `maxmemory` is set, Redis uses a `maxmemory-policy` to decide what to evict. If the policy is not suitable (e.g., `noeviction` when memory is tight), Redis starts rejecting write commands once the limit is reached, and Pub/Sub buffers compete with your data for the remaining memory. Check the `maxmemory-policy` setting in `redis.conf`, or query the running value with `redis-cli CONFIG GET maxmemory-policy`.
  - Fix: Consider changing `maxmemory-policy` to something like `allkeys-lru` or `volatile-lru` if you want Redis to proactively evict less-used keys to make room for new data. However, be cautious, as this can lead to data loss. A better fix is usually to address the root cause of memory pressure.
  - Why it works: A more aggressive eviction policy allows Redis to reclaim memory from less critical data, leaving more headroom for Pub/Sub buffers, albeit at the risk of losing other data.
- Pub/Sub Buffer Limit Configuration (`client-output-buffer-limit`):
  - Diagnosis: Open-source Redis does not document a `pubsub_max_messages_per_channel` directive; if your deployment exposes a setting with that name, it comes from a fork, proxy, or client library, so consult its documentation. In Redis itself, the limit that governs Pub/Sub buffering is `client-output-buffer-limit pubsub <hard> <soft> <soft-seconds>`, which caps the output buffer of each subscriber connection (the default is `32mb 8mb 60`). If this limit is too low for your traffic bursts, slow subscribers are disconnected, which can surface as a "queue full" style error in your application. Check the running value with `redis-cli CONFIG GET client-output-buffer-limit`.
  - Fix: Raise the `pubsub` limits in `redis.conf` (or with `CONFIG SET`) to fit your burst size. Avoid disabling the limit by setting it to `0 0 0`, since a stuck subscriber could then consume memory without bound.
  - Why it works: This increases the amount of data Redis is allowed to buffer per subscriber connection, giving subscribers more time to catch up before they are disconnected.
- Network Latency Between Publisher and Subscriber:
  - Diagnosis: High network latency can make a subscriber appear unresponsive, even if its processing logic is sound. Use `ping` or `traceroute` between the Redis server and the subscriber instance, and between the subscriber and any external services it interacts with during message processing. `redis-cli --latency -h <host>` measures round-trip time to the Redis server itself.
  - Fix: Improve network connectivity, reduce hops, or deploy subscribers closer to the Redis instances.
  - Why it works: Lower latency means messages reach the subscriber quickly and calls to external services finish sooner, so the subscriber maintains a healthy processing rate and its buffer on the server stays clear. (Pub/Sub itself has no acknowledgments to wait for.)
- Large Message Payloads:
  - Diagnosis: If individual messages are very large, a subscriber might take longer to deserialize and process them, even if the processing logic itself is simple. Analyze the size of messages being published.
  - Fix: Compress message payloads before publishing, or redesign the message format to be more compact. Alternatively, if large data is being sent, consider sending a reference (e.g., a URL to an object store) instead of the data itself within the message.
  - Why it works: Smaller messages are processed faster by the subscriber, reducing the time it takes to drain the channel buffer and preventing it from becoming full.
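The payload-compression fix might look like the following sketch, which serializes a message to compact JSON and compresses it with `zlib` before publishing. The helper names and the sample event are assumptions for illustration; the resulting bytes would be passed to whatever publish call your client library provides, and the subscriber would apply `decode_payload` on receipt.

```python
import json
import zlib


def encode_payload(data: dict) -> bytes:
    """Serialize to compact JSON and compress with zlib."""
    raw = json.dumps(data, separators=(",", ":")).encode("utf-8")
    return zlib.compress(raw, level=6)


def decode_payload(blob: bytes) -> dict:
    """Reverse of encode_payload: decompress, then parse the JSON."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


# Made-up, repetitive event used only to illustrate the size difference.
event = {"type": "order.updated", "items": [{"sku": "A-100", "qty": 1}] * 200}
blob = encode_payload(event)
plain_size = len(json.dumps(event).encode("utf-8"))
print(f"plain={plain_size} bytes, compressed={len(blob)} bytes")
assert decode_payload(blob) == event
```

Compression trades subscriber CPU for smaller buffers, and the gain depends on how repetitive the data is, so it is worth measuring on real messages before adopting it.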
After resolving the immediate issue, you might encounter a "Redis connection refused" error if the Redis server was restarted during troubleshooting or if network issues were the primary culprit.