RabbitMQ raises its disk alarm when free space on the broker’s data partition drops below the configured disk_free_limit. While the alarm is active, the broker blocks publishing connections, so no new messages are accepted.

Common Causes and Fixes

  1. Unacknowledged/Uncommitted Messages:

    • Diagnosis: Check the number of unacknowledged messages per queue. If these are significant, messages are being held in memory or disk by RabbitMQ waiting for client acknowledgments.
      rabbitmqctl list_queues name messages_unacknowledged messages_ready messages
      
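    • Fix: Ensure your consumers are acknowledging messages promptly. If you’re using publisher confirms or transactions, ensure those are being processed. As a last resort, you can reset the node. This is not a restart: rabbitmqctl reset returns the node to a blank state, deleting all queues, messages, users, vhosts, and policies, and removing the node from any cluster it belongs to. Use it only when losing everything on the node is acceptable.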
      rabbitmqctl stop_app
      rabbitmqctl reset
      rabbitmqctl start_app
      
    • Why it works: Unacknowledged messages consume resources. Releasing them (by acknowledging them or resetting) frees up space.
  2. Large Message Payloads:

    • Diagnosis: Examine the size of messages in your queues. Large payloads can quickly fill disk space, especially if many messages are queued.
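      rabbitmqctl list_queues name messages_ready message_bytes memory
      
      (Note: message_bytes is the total size of the message bodies in the queue, and memory is the RAM used by the queue process. Neither is an exact measure of disk usage, so combine them with queue depth and message rate to judge which queues are responsible.)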
    • Fix: Optimize message payloads by reducing their size or serializing them more efficiently. Consider offloading large data to external storage and only sending references/URLs in RabbitMQ messages.
    • Why it works: Smaller messages consume less disk space.
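To put rough numbers on payload size, the per-queue average can be derived from the messages and message_bytes columns of list_queues. The sketch below assumes tab-separated rows of name, messages, and message_bytes on standard input; the function name avg_message_size is made up for this example, and the result ignores message properties and on-disk overhead.

```shell
#!/bin/sh
# avg_message_size (hypothetical helper, not part of RabbitMQ)
# Input rows on stdin:  name<TAB>messages<TAB>message_bytes
# Output rows:          name<TAB>average body size in bytes, largest first
# Queues with zero messages are skipped to avoid dividing by zero.
avg_message_size() {
  awk -F '\t' '$2 > 0 { printf "%s\t%d\n", $1, $3 / $2 }' |
    sort -t "$(printf '\t')" -k2,2nr
}
```

One way to feed it, assuming a rabbitmqctl version that supports --quiet and --no-table-headers:

      rabbitmqctl list_queues --quiet --no-table-headers name messages message_bytes | avg_message_size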
  3. Persistent Messages Not Being Purged:

    • Diagnosis: Persistent messages are written to disk. If consumers die or are slow, these can accumulate indefinitely.
      rabbitmqctl list_queues name messages_ready messages_unacknowledged --vhost=<your_vhost>
      
      Then, investigate queues with high messages_ready and low messages_unacknowledged.
    • Fix: Implement Dead Letter Exchanges (DLX) and TTLs (Time-To-Live) to automatically remove or redirect messages that are not processed within a certain timeframe or after failing processing.
      # Example policy: expire messages after 60 seconds (message-ttl is in milliseconds) and dead-letter them to my_dlx
      rabbitmqctl set_policy dlx_ttl "^my_queue_prefix" \
      '{"message-ttl": 60000, "dead-letter-exchange": "my_dlx"}' \
      --apply-to queues
      
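    • Why it works: A TTL expires messages that sit in a queue too long, so they stop occupying disk there. A DLX routes expired or rejected messages to another exchange. Queues bound to that exchange still need a consumer or a TTL of their own, otherwise the messages simply accumulate somewhere else.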
  4. Unused Queues and Exchanges:

    • Diagnosis: Over time, applications might stop using certain queues or exchanges, leaving them to accumulate messages or just exist as metadata on disk.
      rabbitmqctl list_queues --vhost=<your_vhost>
      rabbitmqctl list_exchanges --vhost=<your_vhost>
      
      Look for queues/exchanges that haven’t had activity for a long time.
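    • Fix: Identify and delete unused queues and exchanges. rabbitmqctl has no command for deleting exchanges, so use the management UI or the classic rabbitmqadmin tool from the management plugin for those.
      rabbitmqctl delete_queue <queue_name> --vhost=<your_vhost>
      rabbitmqadmin --vhost=<your_vhost> delete exchange name=<exchange_name>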
      
    • Why it works: Removing unused objects directly reduces the amount of data RabbitMQ needs to manage.
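Since list_queues does not report when a queue was last used, one rough proxy for "unused" is a queue that holds messages but has no consumers attached. The sketch below assumes tab-separated rows of name, messages, and consumers on standard input; orphan_candidates is a hypothetical name, and its output is a list to review by hand, not a list that is safe to delete.

```shell
#!/bin/sh
# orphan_candidates (hypothetical helper, not part of RabbitMQ)
# Input rows on stdin:  name<TAB>messages<TAB>consumers
# Output rows:          name<TAB>messages, for queues that have a backlog
#                       and zero consumers at the moment of the listing.
orphan_candidates() {
  awk -F '\t' '$3 == 0 && $2 > 0 { print $1 "\t" $2 }'
}
```

One way to feed it, assuming a rabbitmqctl version that supports --quiet and --no-table-headers:

      rabbitmqctl list_queues --quiet --no-table-headers name messages consumers | orphan_candidates

A queue whose consumer is only temporarily down will also show up here, so confirm with the owning team before deleting anything.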
  5. Internal RabbitMQ Logs/Database Files:

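    • Diagnosis: RabbitMQ’s own files can grow large, especially under heavy load or during restarts. These include the node’s data directory (the internal Mnesia database and the message stores) and the broker’s log files, which often live in a separate directory such as /var/log/rabbitmq. Start with the data directory.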
      # Find the data directory (often /var/lib/rabbitmq/mnesia/rabbit@<hostname>/)
      rabbitmq-diagnostics status | grep -i "data directory"
      # Check disk usage in that directory
      du -sh /var/lib/rabbitmq/mnesia/rabbit@<hostname>/
      
    • Fix: If these files are excessively large and you’re confident the broker is healthy, you might need to clear old log files or, in extreme cases, rebuild the node’s Mnesia data (an advanced operation that may require cluster coordination). A restart can sometimes release space held by transient files, but it will not shrink a genuine backlog of persistent messages.
      # Restart RabbitMQ to clear some transient logs/files
      systemctl restart rabbitmq-server
      
    • Why it works: Certain internal files are transient or can be safely pruned after a clean shutdown/restart.
  6. Underlying Disk Full:

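    • Diagnosis: This is the most direct cause. RabbitMQ only compares the free space on its data partition with disk_free_limit, so anything else writing to the same filesystem can trigger the alarm even when the broker itself holds little data.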
      df -h
      
      Check the filesystem where /var/lib/rabbitmq (or your configured data directory) resides.
    • Fix: Free up space on the underlying filesystem. This could involve deleting old system logs, application logs, temporary files, or increasing the disk size.
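      # Example: remove packages that are no longer needed, then clear the apt package cache
      sudo apt autoremove
      sudo apt clean
      # Example: list log files not modified in the last 30 days and review the list
      sudo find /var/log -type f -mtime +30
      # Once the list looks safe, delete those files
      sudo find /var/log -type f -mtime +30 -delete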
      
    • Why it works: Provides the necessary physical space for RabbitMQ and the OS to operate.
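To see how much headroom remains on the data partition, the free space reported by df can be compared with the configured limit. This is a minimal sketch: free_kb and below_limit are hypothetical helper names, and the path and limit in the usage line are assumptions to replace with your own data directory and disk_free_limit (converted to kilobytes).

```shell
#!/bin/sh
# free_kb PATH
# Prints the free space, in kilobytes, of the filesystem that holds PATH.
# -P forces one line per filesystem so the awk field positions are stable.
free_kb() {
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# below_limit FREE_KB LIMIT_KB
# Succeeds (exit status 0) when FREE_KB is strictly below LIMIT_KB.
below_limit() {
  [ "$1" -lt "$2" ]
}
```

For example, with an assumed limit of about 1 GB:

      below_limit "$(free_kb /var/lib/rabbitmq)" 1000000 && echo "free space is below the limit"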

After free space rises back above disk_free_limit, RabbitMQ clears the alarm automatically and unblocks publishers; no restart is needed. If publishing was blocked for a long time, the next errors you are likely to see are publisher timeouts or connection errors from clients that gave up waiting.
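Rather than checking by hand, a cleanup script can poll until the alarm is gone. The sketch below is a generic retry loop: wait_until_ok is a hypothetical helper that re-runs any command until it succeeds or the attempts run out. The usage line assumes a RabbitMQ release that ships the rabbitmq-diagnostics check_alarms health check, which exits with a non-zero status while any alarm is in effect.

```shell
#!/bin/sh
# wait_until_ok MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND up to MAX_TRIES times, sleeping DELAY_SECONDS between
# attempts. Returns 0 as soon as COMMAND succeeds, 1 if it never does.
wait_until_ok() {
  tries=$1
  delay=$2
  shift 2
  i=0
  while [ "$i" -lt "$tries" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    # Skip the sleep after the final failed attempt.
    if [ "$i" -lt "$tries" ]; then
      sleep "$delay"
    fi
  done
  return 1
}
```

For example, to check every 10 seconds for up to 5 minutes:

      wait_until_ok 30 10 rabbitmq-diagnostics -q check_alarms && echo "alarm cleared"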
