RabbitMQ’s credential validation failed because the node could not complete an Erlang distribution connection to another node, usually during clustering or when a CLI or management tool tried to connect. Erlang nodes locate each other through EPMD (the Erlang Port Mapper Daemon, TCP port 4369) and then talk over a separate distribution port, so both the lookup and the connection must succeed before nodes can form a cluster or tools can interact with them.

Here are the common culprits and how to fix them:

  1. Firewall Blocking EPMD Port (4369):

    • Diagnosis: On the node receiving the connection attempt, check if port 4369 is open.
      sudo ufw status verbose
      # or
      sudo iptables -L -n | grep 4369
      
    • Fix: Open port 4369 on the firewall of the node that RabbitMQ is trying to connect to.
      sudo ufw allow 4369/tcp
      # or
      sudo iptables -A INPUT -p tcp --dport 4369 -j ACCEPT
      
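    • Why it works: EPMD (Erlang Port Mapper Daemon) listens on TCP port 4369 and tells a connecting node which distribution port the target node is using (25672 by default for RabbitMQ). If port 4369 is blocked, nodes cannot discover each other. If the distribution port is blocked, discovery succeeds but the connection itself still fails, so open that port the same way.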
  2. Incorrect NODENAME in rabbitmq-env.conf:

    • Diagnosis: Check the NODENAME setting in /etc/rabbitmq/rabbitmq-env.conf on all nodes in the cluster. Ensure it’s unique and resolvable.
      cat /etc/rabbitmq/rabbitmq-env.conf
      # Example output: NODENAME=rabbit@my-server.domain.com
      
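      Then, from the node that’s failing, ping the host part of the other node’s NODENAME (the text after the @, e.g. my-server.domain.com). If ping fails or resolves to the wrong IP, that’s the issue.
    • Fix: Set the host part of NODENAME to a fully qualified domain name (FQDN) or an IP address that is resolvable from, and unique across, all nodes. Erlang uses short hostnames by default, so when the host part contains dots also set USE_LONGNAME=true in the same file.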
      # In /etc/rabbitmq/rabbitmq-env.conf
      NODENAME=rabbit@<unique_hostname_or_ip>
      
      Restart RabbitMQ: sudo systemctl restart rabbitmq-server.
    • Why it works: The NODENAME is how Erlang nodes identify themselves. If its host part is not resolvable via DNS or /etc/hosts to the correct IP address, a node cannot find or connect to another node’s EPMD.
  3. Network Unreachability / DNS Resolution Issues:

    • Diagnosis: From the RabbitMQ node that is failing to connect, ping the host part of the target node’s NODENAME (the text after the @) or its IP address. ping and DNS tools take a hostname, not a full node name such as rabbit@other-node.domain.com.
      ping other-node.domain.com
      # or
      ping 192.168.1.10
      
      Also, check DNS resolution directly:
      dig other-node.domain.com +short
      # or
      nslookup other-node.domain.com
      
    • Fix: Ensure that all RabbitMQ nodes can resolve each other’s hostnames or IP addresses. This might involve configuring /etc/hosts files on all nodes or fixing DNS records.
      # Example entry in /etc/hosts on node A
      192.168.1.11  nodeB.domain.com nodeB
      
      Restart RabbitMQ after changes.
    • Why it works: Erlang distribution relies on the underlying network and DNS to locate nodes. If a node cannot be reached or its name doesn’t resolve to the correct IP, the distribution handshake will fail.
  4. Erlang Cookie Mismatch:

    • Diagnosis: The Erlang cookie is a shared secret that Erlang nodes use for authentication. Check the cookie file (/var/lib/rabbitmq/.erlang.cookie by default) on all nodes.
      sudo cat /var/lib/rabbitmq/.erlang.cookie
      
      The contents must be identical on all nodes.
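    • Fix: Ensure the .erlang.cookie file has the same content on all nodes. If they differ, copy the content from one node to all others. Make sure the file permissions are 0600 and the file is owned by the rabbitmq user. CLI tools run as another user read that user’s own $HOME/.erlang.cookie, which must match as well.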
      # On each node, after copying the correct cookie:
      sudo chown rabbitmq:rabbitmq /var/lib/rabbitmq/.erlang.cookie
      sudo chmod 0600 /var/lib/rabbitmq/.erlang.cookie
      
      Restart RabbitMQ on all nodes.
    • Why it works: The Erlang cookie acts as a shared password for the Erlang distribution protocol. Nodes with different cookies are considered untrusted and cannot communicate.
  5. RabbitMQ Not Running or Failed to Start:

    • Diagnosis: Check the status of the RabbitMQ service on both nodes.
      sudo systemctl status rabbitmq-server
      
      Look for any error messages in the journal:
      sudo journalctl -u rabbitmq-server -n 100 --no-pager
      
    • Fix: If RabbitMQ is not running, start it. If it failed to start, investigate the journal logs for specific errors (e.g., disk space, permissions, configuration syntax).
      sudo systemctl start rabbitmq-server
      
    • Why it works: The Erlang distribution protocol requires the RabbitMQ node (an Erlang VM) to be running on each host. Starting the node also launches EPMD, a separate daemon, if one is not already running. If the node is down, no connections can be established.
  6. EPMD Not Listening on the Correct Interface:

    • Diagnosis: On the node that is supposed to be receiving connections, check which network interfaces EPMD is listening on.
      sudo netstat -tulnp | grep 4369
      # or
      sudo ss -tulnp | grep 4369
      
      Look for 0.0.0.0:4369 (listening on all interfaces) or a specific IP. If it’s 127.0.0.1:4369, it’s only listening locally.
    • Fix: By default, EPMD listens on all interfaces. If it has been restricted to 127.0.0.1 (for example via the ERL_EPMD_ADDRESS environment variable or an epmd -address option in the service configuration), remove the restriction or add the address of the interface that other nodes connect to, then restart EPMD and RabbitMQ. Separately, make sure the host part of NODENAME resolves to an IP address that other nodes can reach.
    • Why it works: EPMD needs to be accessible on the network interface that the connecting node is trying to reach. If it’s bound only to localhost, remote nodes cannot establish a connection.
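
The reachability and cookie checks from items 1, 3, and 4 can be combined into a small helper. This is a minimal sketch, not an official tool: the function names (node_host, port_open, check_node, cookie_digest) are made up for illustration, and it assumes bash built with /dev/tcp support, coreutils timeout and sha256sum, and the default RabbitMQ distribution port 25672.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for checking Erlang distribution prerequisites.

# Strip the "rabbit@" prefix: ping, dig, and TCP checks need the host part only.
node_host() {
  printf '%s\n' "${1#*@}"
}

# Return 0 if a TCP connection to host:port succeeds within 3 seconds.
# Assumes bash supports /dev/tcp redirections (true for most distro builds).
port_open() {
  local host=$1 port=$2
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Report whether EPMD (4369) and the distribution port are reachable.
# The second argument overrides the assumed default of 25672.
check_node() {
  local host port
  host=$(node_host "$1")
  for port in 4369 "${2:-25672}"; do
    if port_open "$host" "$port"; then
      echo "$host:$port reachable"
    else
      echo "$host:$port NOT reachable"
    fi
  done
}

# Print a SHA-256 digest of a cookie file, so cookies can be compared
# across nodes without pasting the secret itself into a terminal or ticket.
cookie_digest() {
  sha256sum < "$1" | cut -d' ' -f1
}
```

From the failing node, source the file and run check_node rabbit@nodeB.domain.com. Then, as root or the rabbitmq user on each node, run cookie_digest /var/lib/rabbitmq/.erlang.cookie and compare the digests; any difference means the cookies do not match.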

After resolving these, you’ll likely hit the next common issue: "Node down" errors in the management UI or logs because the nodes are still trying to establish full RabbitMQ cluster membership. Inter-node traffic needs both port 4369 (EPMD) and the Erlang distribution port (25672 by default, the AMQP port plus 20000) open between nodes. Ports 5672 (AMQP) and 15672 (Management UI) serve clients and browsers, not the cluster links themselves.
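
Rather than opening the inter-node ports to every source, a firewall rule can admit only the other cluster members. Inter-node traffic uses EPMD on 4369 plus the distribution port, which is 25672 by default. The sketch below only prints ufw commands for review and does not change the firewall; peer_rules is a hypothetical helper name, and the peer IP is an example value.

```shell
# Print (do not run) ufw rules that admit one peer to the inter-node ports.
# $1: peer IP address; $2: distribution port (assumed default 25672).
peer_rules() {
  local peer_ip=$1 dist_port=${2:-25672} port
  for port in 4369 "$dist_port"; do
    echo "ufw allow from $peer_ip to any port $port proto tcp"
  done
}
```

For example, peer_rules 192.168.1.11 prints one rule for port 4369 and one for port 25672. Review the output, then run each printed command with sudo on the node that should accept the connection.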
