Pulsar transactions don’t actually guarantee exactly-once processing in the way most people assume; they guarantee that a transactional producer will either successfully commit all its messages to a topic, or none of them, preventing partial writes.

Let’s see this in action. Imagine we have a transactional producer txProducer and a consumer consumer reading from topicA.

// Producer side
PulsarClient client = PulsarClient.builder()
    .serviceUrl("pulsar://localhost:6650")
    .build();

Producer<String> txProducer = client.newProducer(Schema.STRING)
    .enableTransactional(true)
    .create();

// Start a transaction
Transaction transaction = client.newTransaction()
    .withTransactionTimeout(1, TimeUnit.MINUTES)
    .start();

// Send messages within the transaction
txProducer.newMessage()
    .value("message1")
    .send();
txProducer.newMessage()
    .value("message2")
    .send();

// Commit the transaction
transaction.commit();
txProducer.close();
client.close();

// Consumer side
Consumer<String> consumer = client.newConsumer(Schema.STRING)
    .topic("topicA")
    .subscriptionName("my-sub")
    .subscribe();

Message<String> msg = consumer.receive();
// Process msg
consumer.acknowledge(msg);
consumer.close();
client.close();

The core problem Pulsar transactions solve is idempotent writes within a transactional scope. Without transactions, if a producer sends two messages and crashes before acknowledging both, a consumer might only see one. With transactions, the producer either successfully commits both messages, and the consumer sees both, or the producer fails to commit, and the consumer sees neither. This atomicity is key.

The internal mechanism relies on a transaction coordinator. When you start a transaction, the client requests a transaction ID from the coordinator. Messages sent within that transaction are initially staged and associated with this transaction ID. Only upon a successful commit call does the coordinator mark the transaction as committed, making the messages visible to consumers. A abort call simply discards the staged messages. The consumer’s acknowledge operation is separate from the producer’s transaction commit. The consumer acknowledges messages it has successfully processed, not messages that have been committed by the producer. This distinction is crucial for achieving end-to-end exactly-once semantics when combined with careful consumer logic.

The real power comes when you combine transactions with Pulsar’s schema registry and idempotent producers. If your consumer logic is also idempotent (meaning processing the same message multiple times has the same effect as processing it once), then a transactional producer committing messages, followed by a consumer acknowledging them, achieves exactly-once end-to-end. The Pulsar transaction guarantees the producer’s commit is atomic. The consumer’s idempotent processing ensures that even if it receives a message twice (e.g., due to a consumer restart and redelivery), the end result is as if it were processed only once.

The transactionTimeout is a critical parameter. If a transaction remains open for longer than this timeout without being committed or aborted, the transaction coordinator will automatically abort it. This prevents "zombie" transactions from holding resources indefinitely. Setting this too short can lead to premature aborts for long-running operations, while setting it too long can mask issues and delay cleanup. The default is 1 minute, but you’ll often need to tune this based on your expected processing times.

A common misconception is that consumer.acknowledge() interacts directly with the producer’s transaction. It doesn’t. The consumer acknowledges messages it has processed. If a consumer acknowledges a message, and then the producer’s transaction is aborted, the consumer will eventually see that message again. This is why the consumer must also be designed for idempotency. The producer’s transaction ensures atomicity of the write, not atomicity of the end-to-end flow without consumer-side guarantees.

The next hurdle you’ll encounter is handling transaction failures and retries. What happens if transaction.commit() fails? You need a robust strategy to retry the commit, or eventually abort, while ensuring you don’t violate idempotency on the consumer side.

Want structured learning?

Take the full Pulsar course →