Classic queue mirroring in RabbitMQ is a surprisingly brittle high-availability feature, and migrating away from it often reveals hidden complexities in your cluster’s network and configuration.

Let’s see what this looks like in practice. Imagine you have an application that publishes messages to a mirrored queue my_mirrored_queue across two nodes, rabbit_node_1 and rabbit_node_2.

On rabbit_node_1:

rabbitmqctl cluster_status

You’d see rabbit_node_1 and rabbit_node_2 listed as running.

And rabbit_node_1 would show:

rabbitmqctl list_queues name messages_ready messages_unacknowledged consumers state | grep my_mirrored_queue

Output might look like: my_mirrored_queue 100 0 2 running

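On rabbit_node_2, list_queues returns the same row, because queue listings are cluster-wide. That alone does not prove the node holds a mirror; the slave_pids and synchronised_slave_pids columns show where the mirrors actually live.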
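To check where the mirrors actually are, you can ask list_queues for the mirror-related columns. This is a minimal sketch, assuming a RabbitMQ 3.x node where classic mirroring is still available. The run helper is a hypothetical dry-run wrapper, not part of RabbitMQ: it only prints the command unless you set DRY_RUN=0.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper: prints the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# pid is the master's process, slave_pids lists every mirror, and
# synchronised_slave_pids lists only the mirrors that are fully caught up.
run rabbitmqctl list_queues name policy pid slave_pids synchronised_slave_pids
```

A mirror that appears in slave_pids but not in synchronised_slave_pids is still missing messages and would lose them if promoted.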

Now, if you wanted to migrate this to a non-mirrored queue, you’d typically declare a new queue my_new_queue on one node, publish to it, and then switch consumers. The expectation is a smooth transition.
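As a rough sketch of that first step, the commands below declare the new queue and then check that no mirroring policy matches it. This assumes the management plugin is enabled and that the Python rabbitmqadmin tool shipped with RabbitMQ 3.x is on your PATH. The run helper is a hypothetical dry-run wrapper that only prints commands unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper: prints the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Declare the replacement queue as durable so it survives a broker restart.
run rabbitmqadmin declare queue name=my_new_queue durable=true

# An empty policy column for my_new_queue means no HA policy applies to it.
run rabbitmqctl list_queues name policy
```

If the policy column is not empty for my_new_queue, the pattern of an existing HA policy is broad enough to match the new name, and the queue will be mirrored after all.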

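The problem mirroring solves is the single point of failure in a non-mirrored queue, whose contents live on one node only. Replication like this looks like a job for a consensus protocol, but classic mirroring does not use one: the master replicates to its mirrors directly over the inter-node network connections, which is a large part of why it copes poorly with network partitions.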

Internally, each mirrored queue has one master and one or more mirrors, which talk to each other over the cluster’s inter-node connections. Every operation on the queue is applied on the master first and then replicated to the mirrors. Messages are written to disk only when they are persistent and the queue is durable. Clients can connect to any node, but their operations are routed to the master, and a publisher confirm is sent only after all mirrors have accepted the message.

Here are the levers you control:

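  • Policy Keys: ha-mode (all, exactly, nodes), ha-params (a replica count for exactly, a list of node names for nodes), ha-sync-mode (automatic, manual). Mirroring is configured through policies, not through queue declaration arguments.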
  • Node Network Connectivity: Crucial for replication. If nodes can’t reach each other, mirroring breaks.
  • Disk Space: Mirroring writes to disk on all mirror nodes.
  • Message TTL/Queue Length Limits: These also need to be maintained across mirrors.
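The policy lever can be sketched as a single set_policy call. The policy name ha-two and the queue pattern below are illustrative assumptions, and the run helper is a hypothetical dry-run wrapper that only prints commands unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper: prints the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Illustrative values: the policy name and the pattern are assumptions.
POLICY_NAME="ha-two"
QUEUE_PATTERN='^my_mirrored_queue$'
# Keep two replicas in total (master plus one mirror) and
# synchronise new mirrors automatically.
POLICY_DEFINITION='{"ha-mode":"exactly","ha-params":2,"ha-sync-mode":"automatic"}'

run rabbitmqctl set_policy --apply-to queues "$POLICY_NAME" "$QUEUE_PATTERN" "$POLICY_DEFINITION"

# Confirm the policy is registered.
run rabbitmqctl list_policies
```

Anchoring the pattern with ^ and $ keeps the policy from matching other queues whose names merely contain my_mirrored_queue.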

The one thing most people don’t realize is that the ha-sync-mode: automatic setting, while convenient, can lead to significant delays during initial sync or recovery. A mirror that rejoins after an outage discards what it had and must receive a fresh copy of the queue’s current contents from the master. While that synchronisation runs, the queue is unresponsive, so publishes and deliveries stall. With a large backlog, this effectively makes the "highly available" queue unavailable.
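If you would rather choose when that pause happens, one option is to set ha-sync-mode to manual and trigger synchronisation yourself during a quiet period. A minimal sketch, assuming the queue from the example above; the run helper is a hypothetical dry-run wrapper that only prints commands unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper: prints the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Mirrors listed in slave_pids but missing from synchronised_slave_pids
# still need to be synchronised.
run rabbitmqctl list_queues name slave_pids synchronised_slave_pids

# Start synchronisation explicitly. The queue is blocked while this runs,
# so pick a window with low traffic.
run rabbitmqctl sync_queue my_mirrored_queue

# From another terminal, an in-progress sync can be aborted with:
#   rabbitmqctl cancel_sync_queue my_mirrored_queue
```

The trade-off is that an unsynchronised mirror offers no protection for the messages it is missing, so manual mode shifts the risk from availability to durability until you run the sync.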

The next concept you’ll grapple with is managing the lifecycle of these mirrored queues, particularly during rolling upgrades or when decommissioning nodes.
