Pulsar’s dead letter topic (often called a dead letter queue, or DLQ) isn’t just a place to dump bad messages; it’s an ordinary topic you can consume from to inspect failed messages and replay them.
Let’s watch a message get red-flagged and sent to the DLQ.
Imagine a Pulsar topic persistent://public/default/my-topic with a subscription named my-consumer-group.
# Produce a message that will cause a processing error
pulsar-client produce my-topic -m "This message will fail" -n 1
# Start a consumer that simulates an error
# (This is a conceptual example; you'd have actual consumer code)
# In your consumer code, when you receive "This message will fail":
#   try { process(msg); consumer.acknowledge(msg); }               // process() is your own handler
#   catch (Exception e) { consumer.negativeAcknowledge(msg); }     // request redelivery
# Without the negative ack, the message is redelivered only after the ack timeout expires (if one is configured).
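After the consumer fails to acknowledge the message (through a negative acknowledgment or an ack timeout), Pulsar redelivers it. Once the message has used up its configured redeliveries, a consumer configured with a dead letter policy publishes it to the DLQ and acknowledges the original.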
How it Works Internally:
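When a Pulsar consumer receives a message, it processes it and then acknowledges it back to the broker. If the consumer negatively acknowledges the message, or does not acknowledge it within a configured ackTimeout (e.g., 1 minute), the message is redelivered and its redelivery count goes up. If the consumer has a dead letter policy and the redelivery count reaches the policy’s maxRedeliverCount, the client library (not the broker) publishes the message to a dead letter topic, by default persistent://public/default/my-topic-my-consumer-group-DLQ, and acknowledges the original so it is not delivered again. The dead letter topic is created on first use, provided topic auto-creation is allowed. The payload and properties are carried over, and the Java client also adds properties that identify the original topic and message ID.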
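The hand-off to the DLQ is easier to follow as a trace. The sketch below is a toy in-memory model, not Pulsar client code: `DeadLetterFlowSketch` and `trace` are hypothetical names, and the exact count at which a message is dead-lettered is an assumption stated in the comments rather than a guarantee about any particular client version.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy in-memory model of redelivery and dead-lettering; it does not use the Pulsar client. */
final class DeadLetterFlowSketch {

    /**
     * Traces a message whose processing always fails.
     * Assumption: the message may be redelivered up to maxRedeliverCount times
     * after its first delivery, and is dead-lettered on the next failure.
     */
    static List<String> trace(int maxRedeliverCount) {
        List<String> events = new ArrayList<>();
        int redeliveryCount = 0; // 0 on the first delivery
        while (true) {
            events.add("deliver (redeliveryCount=" + redeliveryCount + ")");
            // Processing fails here, so the message is never acknowledged.
            if (redeliveryCount >= maxRedeliverCount) {
                events.add("publish to DLQ, then acknowledge the original");
                return events;
            }
            redeliveryCount++; // an ack timeout or a negative ack triggers a redelivery
        }
    }

    public static void main(String[] args) {
        trace(2).forEach(System.out::println);
    }
}
```

Under this assumption, a limit of 2 yields three deliveries to the consumer followed by a single hand-off to the DLQ.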
The Problem This Solves:
Without a DLQ, a message that can never be processed is redelivered to the same subscription again and again, creating a retry loop that wastes consumer capacity and keeps the backlog from draining. A DLQ provides a structured way to isolate these problematic messages. It allows you to:
- Inspect: See exactly which messages are failing and why.
- Retry: Manually or automatically re-process messages from the DLQ after fixing the underlying issue.
- Discard: If a message is truly unrecoverable, it can be safely discarded from the DLQ.
Configuration and Control:
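DLQs are configured at the subscription level, on the consumer side. You don’t configure the topic or the broker for DLQ; you attach a DeadLetterPolicy to the consumer when you subscribe, and the client library does the routing.
// Java consumer with a dead letter policy (classes from org.apache.pulsar.client.api)
Consumer<byte[]> consumer = pulsarClient.newConsumer()
    .topic("persistent://public/default/my-topic")
    .subscriptionName("my-consumer-group")
    .subscriptionType(SubscriptionType.Shared) // dead letter topics are documented for Shared and Key_Shared
    .deadLetterPolicy(DeadLetterPolicy.builder()
        .maxRedeliverCount(5)
        // Default DLQ topic name format: <topic>-<subscription>-DLQ
        // Optional override: .deadLetterTopic("persistent://public/default/my-custom-dlq")
        .build())
    .subscribe();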
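The ackTimeout matters too. If your consumer takes longer than this to acknowledge a message, the client asks the broker to redeliver it, and that redelivery counts toward maxRedeliverCount, so set the timeout comfortably above your worst-case processing time. You set this client-side when creating your consumer.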
// Java consumer example
// (Consumer is in org.apache.pulsar.client.api; TimeUnit is in java.util.concurrent)
Consumer<byte[]> consumer = pulsarClient.newConsumer()
    .topic("persistent://public/default/my-topic")
    .subscriptionName("my-consumer-group")
    .ackTimeout(60, TimeUnit.SECONDS) // redeliver if not acknowledged within 60 seconds
    .subscribe();
The DLQ is just another Pulsar topic, so once a message lands there you can consume it like any other.
# Consume messages from the DLQ
pulsar-client consume persistent://public/default/my-topic-my-consumer-group-DLQ -s my-dlq-consumer-group -n 0
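You can then re-publish messages from the DLQ to the original topic after you’ve fixed the issue. pulsar-client does not move messages between topics, so from the command line this means producing the payload again by hand; in practice a small consumer and producer pair does this in code.
# Re-publish a payload taken from the DLQ to the original topic
pulsar-client produce persistent://public/default/my-topic -m "This is a retried message" -n 1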
The maxRedeliverCount is another important setting. It is a field of the consumer’s DeadLetterPolicy, not a pulsar-admin option, and it dictates how many times a message will be redelivered before the client sends it to the DLQ. Without a dead letter policy there is no such limit: an unacknowledged message keeps being redelivered to the subscription.
// Set maxRedeliverCount on the consumer's dead letter policy
DeadLetterPolicy policy = DeadLetterPolicy.builder()
    .maxRedeliverCount(5)
    .build();
// Pass it to the consumer builder with .deadLetterPolicy(policy)
A common misconception is that the DLQ is a separate, special broker component. In reality, it’s just a regular Pulsar topic that the client library publishes to once a message has exhausted its redeliveries, based on the consumer’s dead letter policy. The naming convention <topic>-<subscription>-DLQ is the default, but you can override it by setting deadLetterTopic in the policy if you need a centralized DLQ naming scheme.
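The naming rule can be written down in a few lines, which helps when scripts or dashboards need to find a subscription’s DLQ. This is a sketch under the convention stated above; `DlqTopicNames`, `defaultName`, and `resolve` are hypothetical helpers and not part of Pulsar.

```java
import java.util.Optional;

/** Hypothetical helper that mirrors the DLQ naming convention; not part of Pulsar. */
final class DlqTopicNames {

    /** Default name: <topic>-<subscription>-DLQ. */
    static String defaultName(String topic, String subscription) {
        return topic + "-" + subscription + "-DLQ";
    }

    /** Uses the explicit override when one is given, otherwise the default name. */
    static String resolve(String topic, String subscription, Optional<String> override) {
        return override.orElseGet(() -> defaultName(topic, subscription));
    }

    public static void main(String[] args) {
        System.out.println(resolve("persistent://public/default/my-topic",
                "my-consumer-group", Optional.empty()));
    }
}
```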
The next step after successfully processing messages from a DLQ is to implement a strategy for automatically replaying them back to the original topic once the fix is deployed.
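One thing an automatic replay strategy needs is a stop condition, otherwise a message that still fails will bounce between the original topic and the DLQ forever. A minimal sketch of such a guard follows, assuming the replayer tracks attempts in a message property; the `ReplayGuard` class, the `replay-attempt` key, and the limit are all hypothetical choices for illustration, not Pulsar features.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical replay guard; the property name and the limit are assumptions. */
final class ReplayGuard {

    static final String REPLAY_ATTEMPT = "replay-attempt"; // hypothetical property key

    /**
     * Returns the properties to attach when re-publishing a dead-lettered message,
     * or empty when the message has already been replayed maxReplays times.
     */
    static Optional<Map<String, String>> nextProperties(Map<String, String> props, int maxReplays) {
        int attempts = Integer.parseInt(props.getOrDefault(REPLAY_ATTEMPT, "0"));
        if (attempts >= maxReplays) {
            return Optional.empty(); // give up: leave the message in the DLQ for inspection
        }
        Map<String, String> next = new HashMap<>(props); // keep the original properties
        next.put(REPLAY_ATTEMPT, Integer.toString(attempts + 1));
        return Optional.of(next);
    }

    public static void main(String[] args) {
        Map<String, String> original = new HashMap<>();
        original.put("order-id", "42");
        Map<String, String> next = nextProperties(original, 3).get();
        System.out.println(next.get(REPLAY_ATTEMPT));
    }
}
```

A replayer would call `nextProperties` for each message read from the DLQ, re-publish with the returned properties when a value is present, and otherwise leave the message where it is.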