The Pulsar client library is reporting a "Connection Already Closed" error because it’s attempting to use a connection that the broker has already terminated.
Common Causes and Fixes
- Idle Connection Timeout on Broker:
  - Diagnosis: Check the Pulsar broker configuration for `connectionTimeoutMs`. The default is often 60000 ms (60 seconds). If your client application is idle for longer than this, the broker closes the connection. Property names and defaults differ between Pulsar versions, so confirm them against the `broker.conf` shipped with your release; idle-connection checks are also governed by the broker’s `keepAliveIntervalSeconds` setting.
  - Fix: Increase `connectionTimeoutMs` in the broker’s `conf/broker.conf` file. For example, `connectionTimeoutMs=300000` sets it to 5 minutes. Restart the Pulsar broker for the change to take effect. This prevents the broker from aggressively closing idle client connections.
- Idle Connection Timeout on Client Network:
  - Diagnosis: Network infrastructure between the client and broker (e.g., load balancers, firewalls) can also enforce idle timeouts. If these are shorter than the broker’s `connectionTimeoutMs` or the idle periods in your application, they can terminate the TCP connection without either the client or the broker being aware.
  - Fix: Configure your network devices (load balancers, firewalls) with idle timeouts that are longer than your expected maximum idle period, or at least longer than the broker’s `connectionTimeoutMs`. The exact configuration depends on the specific network hardware. This ensures the intermediate network path doesn’t prematurely terminate the TCP session.
- Client Application Reusing Producer/Consumer Objects:
  - Diagnosis: If your application code creates a `Producer` or `Consumer` instance, closes it, and later tries to use that same instance again, you’ll hit this error. A closed producer or consumer cannot be reopened.
  - Fix: Create a new `Producer` or `Consumer` object whenever the previous one was explicitly closed. Producers and consumers belong to the `PulsarClient` instance that created them, so if that client is closed, all of them become invalid. In that case, create a new `PulsarClient` first and then build new producers and consumers from it.
- Client Application Closing the `PulsarClient` Instance:
  - Diagnosis: The `PulsarClient` object is the primary handle for all connections to the Pulsar cluster. If your application code calls `pulsarClient.close()` and then attempts to use any `Producer` or `Consumer` objects that were created by that client, they will fail with this error.
  - Fix: Do not call `pulsarClient.close()` until your application is completely finished with all Pulsar operations. If you need to temporarily stop operations, manage the lifecycle of your producers and consumers instead of closing the entire client. For example, you might close a specific producer that is no longer needed but keep the `PulsarClient` alive.
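The lifecycle advice in the two items above (never reuse a closed object, keep the long-lived owner open) can be modelled without any Pulsar types. The following is a minimal sketch using only the JDK: `ReopeningHandle` is a hypothetical helper, not part of the Pulsar client, and its `Supplier` stands in for whatever call your application uses to build a producer or consumer from a still-open client.

```java
import java.util.function.Supplier;

/** Hands out a live resource and builds a fresh one after the previous one was closed. */
final class ReopeningHandle<T extends AutoCloseable> {
    private final Supplier<T> factory; // hypothetical stand-in for a producer/consumer builder call
    private T current;                 // null means "no live instance"

    ReopeningHandle(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Returns the live instance, creating one only if none exists. */
    synchronized T get() {
        if (current == null) {
            current = factory.get();
        }
        return current;
    }

    /** Closes the live instance (if any) and forgets it, so the next get() builds a new one. */
    synchronized void close() {
        T toClose = current;
        current = null;
        if (toClose != null) {
            try {
                toClose.close();
            } catch (Exception e) {
                throw new IllegalStateException("close failed", e);
            }
        }
    }
}
```

Callers always go through `get()` rather than holding on to a reference, so a closed instance is never handed out again.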
- Network Instability / Frequent Disconnections:
  - Diagnosis: Intermittent network drops can cause the underlying TCP connection to be reset. The Pulsar client library might not detect this closure until it attempts to send data on the broken socket.
  - Fix: Implement robust error handling and retry logic in your client application, so that a failed operation is retried after the client has had a chance to re-establish the connection. For Java, `PulsarClient.builder().connectionsPerBroker(n)` controls how many connections the client keeps to each broker (the default is 1); using more than one can reduce the blast radius of a single connection failure. For other languages, consult the respective client library documentation for connection pooling and retry strategies.
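The retry logic mentioned above can be sketched in plain Java. This is an illustrative helper under stated assumptions, not part of the Pulsar client: `RetryWithBackoff` and its parameters are hypothetical names, and the `Callable` stands in for any send or receive call that may fail while the connection is being re-established.

```java
import java.util.concurrent.Callable;

final class RetryWithBackoff {
    /**
     * Runs op, retrying on any exception up to maxAttempts times in total.
     * The pause doubles after each failed attempt, starting at initialDelayMs.
     */
    static <T> T call(Callable<T> op, int maxAttempts, long initialDelayMs) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delayMs = initialDelayMs;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                last = e; // remember the failure and fall through to the backoff
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMs);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    throw new IllegalStateException("interrupted while backing off", ie);
                }
                delayMs *= 2;
            }
        }
        throw new IllegalStateException("gave up after " + maxAttempts + " attempts", last);
    }
}
```

In real code you would likely retry only on exceptions that indicate a transient connection problem and cap the maximum delay; this sketch retries on everything to stay short.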
- Broker Restart/Failover:
  - Diagnosis: If a Pulsar broker that your client is connected to restarts or undergoes a failover, its connections are terminated. The client then receives the "Connection Already Closed" error on its next operation.
  - Fix: Ensure your client application is configured with a list of multiple Pulsar service URLs (e.g., `pulsar://broker1:6650,broker2:6650,broker3:6650`). The Pulsar client will automatically attempt to connect to other available brokers in the list if the current one becomes unavailable. This provides high availability by allowing the client to switch to a healthy broker.
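A quick way to confirm that a configured service URL really lists more than one broker is to split it and count the entries. The helper below is a hypothetical sketch using only the JDK; it mirrors the comma-separated format shown above and does not reproduce the client’s own URL parsing.

```java
import java.util.ArrayList;
import java.util.List;

final class ServiceUrlHosts {
    /** Splits a multi-host URL such as pulsar://a:6650,b:6650 into "host:port" entries. */
    static List<String> hosts(String serviceUrl) {
        int schemeEnd = serviceUrl.indexOf("://");
        if (schemeEnd < 0) {
            throw new IllegalArgumentException("missing scheme: " + serviceUrl);
        }
        String authority = serviceUrl.substring(schemeEnd + 3);
        List<String> result = new ArrayList<>();
        for (String part : authority.split(",")) {
            String host = part.trim();
            if (!host.isEmpty()) {
                result.add(host);
            }
        }
        return result;
    }

    /** True when the URL names at least two brokers, so failover has somewhere to go. */
    static boolean hasFailoverTargets(String serviceUrl) {
        return hosts(serviceUrl).size() >= 2;
    }
}
```

A startup check such as `ServiceUrlHosts.hasFailoverTargets(url)` can log a warning when only a single broker is configured.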
- `keepAliveInterval` Configuration:
  - Diagnosis: The `keepAliveInterval` client configuration (or its equivalent in other languages) dictates how often the client sends a keep-alive probe to the broker. In Pulsar this is a protocol-level ping rather than a TCP keep-alive packet. If the interval is set too high, or if network devices aggressively filter such traffic, the connection might appear idle to intermediate devices, leading to premature closure.
  - Fix: Set a reasonable `keepAliveInterval` on the client, e.g., `PulsarClient.builder().serviceUrl(serviceUrl).keepAliveInterval(60, TimeUnit.SECONDS).build()`. This ensures that the connection periodically carries traffic, signaling to intermediate network devices that it is still active and thus preventing idle timeouts. Choose an interval shorter than the shortest idle timeout on the path.
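The relationship between the keep-alive interval and the idle timeouts discussed in this list comes down to one comparison: the interval must be shorter than the shortest idle timeout anywhere on the path. The sketch below checks that with JDK types only; `KeepAliveCheck` is a hypothetical helper, and the timeout values you pass in are whatever you have read from your own broker and network configuration.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.List;

final class KeepAliveCheck {
    /** True when the keep-alive interval is shorter than every idle timeout on the path. */
    static boolean isSafe(Duration keepAlive, List<Duration> idleTimeouts) {
        if (idleTimeouts.isEmpty()) {
            return true; // nothing on the path enforces an idle timeout
        }
        Duration shortest = Collections.min(idleTimeouts);
        return keepAlive.compareTo(shortest) < 0;
    }
}
```

For example, a 60-second keep-alive is safe against a 5-minute broker timeout but not against a load balancer that drops connections after 30 idle seconds.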
After fixing these issues, you might encounter "Topic is not found" errors if the metadata for the topic hasn’t fully propagated after a broker restart or if the topic was accidentally deleted.