Pulsar’s delayed message scheduling doesn’t actually hold messages in memory; it cleverly uses a separate, dedicated topic to manage the delivery timestamps.
Let’s see this in action. Imagine we want to send a message that should only be delivered 60 seconds from now.
# Assuming 'pulsar-admin' is in your PATH and configured
# Create a topic for delayed messages (this is internal to Pulsar)
pulsar-admin topics create persistent://public/default/delayed-message-schedule
# Producer sends a message with a delivery time 60 seconds in the future
# The key here is the 'x-pulsar-delivery-time' header
pulsar-client produce persistent://public/default/my-topic \
--property x-pulsar-delivery-time=60000 \
--message "This message will be delayed"
When you send that message, Pulsar doesn’t immediately put it into the my-topic log. Instead, it inspects the x-pulsar-delivery-time header. If it’s present and in the future, Pulsar writes the message to the delayed-message-schedule topic. The message written to this internal topic contains the original message payload and the target delivery timestamp.
Internally, Pulsar has a scheduled task that periodically scans this delayed-message-schedule topic. It reads messages from this topic and checks their delivery timestamps against the current time. If a message’s delivery time has arrived (or passed), Pulsar then re-publishes that message to its original destination topic (my-topic in our example).
This mechanism allows Pulsar to handle delayed messages without blocking producers or consumers. Producers send messages as usual, and consumers subscribe to the original topic. The "delay" is an internal routing and scheduling concern. The x-pulsar-delivery-time header is the key. It’s specified in milliseconds since the Unix epoch, or as a relative delay from the current time.
The core components involved are:
- Producer: Sets the
x-pulsar-delivery-timeheader. - Broker: Intercepts messages with the header, writes them to the
delayed-message-scheduletopic. - Scheduler Task (internal to Broker): Periodically scans
delayed-message-schedule, checks timestamps, and re-publishes. - Original Topic: Receives the message once its delivery time is met.
You can configure the frequency of this internal scan. The delayedMessageIntervalInMs setting in your broker’s configuration controls how often the broker checks the delayed-message-schedule topic. A lower value means more frequent checks and potentially more precise delivery times, but at the cost of increased CPU usage. A typical value might be 1000 (1 second) or 5000 (5 seconds).
# In your broker.conf file:
delayedMessageIntervalInMs=1000
The actual mechanics of how Pulsar re-publishes messages from the schedule topic to the destination topic involve creating a new entry in the destination topic’s ledger. This is a standard Pulsar write operation, ensuring durability and atomicity for the re-published message. The original message on the delayed-message-schedule topic is then acknowledged.
A common misconception is that delayed messages consume resources on the producer or client side. In reality, once the x-pulsar-delivery-time header is set and the message is sent, the responsibility shifts entirely to the Pulsar broker. The producer can disconnect, and the message will still be delivered at its scheduled time.
The next hurdle you’ll likely encounter is understanding how to handle message redelivery and idempotency when dealing with scheduled messages that might fail during the re-publishing phase.