Redpanda’s "exactly-once" transactions don’t actually guarantee exactly-once delivery; they guarantee idempotent producers and effectively-once consumers, which is a more practical and achievable guarantee.

Let’s see this in action. Imagine we have a simple producer that sends a message, and a consumer that processes it.

from kafka import KafkaProducer, KafkaConsumer
import json
import time

# Producer configuration
producer = KafkaProducer(
    bootstrap_servers='localhost:9092',
    value_serializer=lambda v: json.dumps(v).encode('utf-8'),
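    enable_idempotence=True,  # Broker deduplicates retried batches (needs a kafka-python release with idempotent-producer support; 2.0.x rejects this option)
    acks='all',              # Required for idempotence: wait for the replicas to acknowledge the write
    retries=5,               # Retry on transient network issues
    max_in_flight_requests_per_connection=1 # Conservative; idempotent producers keep ordering with up to 5 in flight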
)

# Consumer configuration (for demonstration, we'll manually commit)
consumer = KafkaConsumer(
    'my-topic',
    bootstrap_servers='localhost:9092',
    auto_offset_reset='earliest',
    enable_auto_commit=False, # We'll control commits manually
    group_id='my-group'
)

def send_message(topic, message_data):
    try:
        future = producer.send(topic, value=message_data)
        # Block until the message is sent and acknowledged
        record_metadata = future.get(timeout=10)
        print(f"Sent message: {message_data} to topic {record_metadata.topic} partition {record_metadata.partition} offset {record_metadata.offset}")
    except Exception as e:
        print(f"Error sending message: {e}")

def process_message(message):
    # Simulate processing that might fail
    print(f"Processing message: {message.value}")
    # If processing fails, return False so the caller does *not* commit the offset.
    # Producer idempotence does not protect this step: a redelivered message is
    # processed again, so real processing logic must itself be idempotent.
    time.sleep(1) # Simulate work
    print(f"Finished processing message: {message.value}")
    return True # Indicate successful processing

print("Starting producer...")
for i in range(5):
    send_message('my-topic', {'id': i, 'data': f'message-{i}'})
    time.sleep(0.5)

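print("\nStarting consumer...")
try:
    # This loop blocks forever unless consumer_timeout_ms is set on the consumer.
    for message in consumer:
        if process_message(message):
            consumer.commit() # Commit only after successful processing
            # The committed offset is the next offset to read, not the one just processed.
            print(f"Committed offset {message.offset + 1} for partition {message.partition}")
        else:
            print(f"Processing failed for message: {message.value}. Not committing offset.")
            # The consumer does not rewind on its own; the message is re-read only
            # after a restart, a rebalance, or an explicit seek. Stop here so that a
            # later commit() cannot skip past the failed message.
            break
finally:
    producer.flush()
    producer.close()
    consumer.close()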

The core idea behind Redpanda’s (and Kafka’s) delivery guarantees is to separate the writing of data from the acknowledgment of its consumption. On the producer side, acks='all' means a write is acknowledged only once the required replicas have it, and retries covers transient errors. Retries alone give at-least-once delivery, because a retry after a lost acknowledgment can write the same record twice. With enable_idempotence=True, the broker uses a producer ID and per-partition sequence numbers to recognize and discard those retried duplicates.
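The broker-side deduplication behind enable_idempotence can be pictured with a toy model. This is a minimal sketch, not Redpanda's implementation: PartitionLog and its return strings are hypothetical names, and the real protocol tracks sequence numbers per batch rather than per record.

```python
class PartitionLog:
    """Toy model of one partition's deduplication (hypothetical, for illustration)."""

    def __init__(self):
        self.records = []
        self.last_seq = {}  # producer_id -> highest sequence number appended

    def append(self, producer_id, seq, value):
        """Append a record unless this producer already wrote this sequence number."""
        last = self.last_seq.get(producer_id, -1)
        if seq <= last:
            return "duplicate"      # retry of a written record: acknowledge, do not append
        if seq != last + 1:
            return "out_of_order"   # a gap means an earlier record is missing
        self.records.append(value)
        self.last_seq[producer_id] = seq
        return "appended"


log = PartitionLog()
results = [
    log.append(producer_id=7, seq=0, value="message-0"),
    log.append(producer_id=7, seq=0, value="message-0"),  # retry after a lost ack
    log.append(producer_id=7, seq=1, value="message-1"),
    log.append(producer_id=8, seq=0, value="message-0"),  # different producer: not deduplicated
]
print(results)
print(log.records)
```

The last call shows the limit of the mechanism: a second producer sending the same payload has its own ID and sequence, so its record is appended as a new one.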

The consumer side then takes over. By setting enable_auto_commit=False and committing offsets only after a message has been successfully processed, we get at-least-once consumption: if a consumer crashes after processing a message but before committing its offset, it re-reads that message after it restarts. Producer idempotence does not help here. It only stops the producer’s internal retries from writing duplicate records; it does nothing about a record being read twice, and it does not deduplicate an application that calls send() again with the same payload. The consumer logic itself must therefore handle redelivered messages gracefully, which is where the "effectively-once" part comes in. It’s not about the message arriving once, but about its effect being applied once.

The real magic happens in the interplay between producer idempotence and consumer-managed commits. Suppose the producer sends message-1 and the acknowledgment is lost. The producer retries, and the broker uses the per-partition sequence number to recognize the retry, so message-1 appears in the log only once. Now suppose the consumer processes message-1 but crashes before committing its offset. Upon restart it fetches message-1 again, so the processing logic must be idempotent. If the processing is, for example, an INSERT into a database table with a unique constraint on the message ID, re-processing message-1 simply triggers a unique constraint violation. The consumer can treat that as "already done" and commit the offset, so message-1 takes effect only once from a business logic perspective.
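The unique-constraint approach can be sketched with the standard library's sqlite3 module. This is an illustration under assumptions: the processed table, its columns, and the apply_once helper are hypothetical, and the redelivery is simulated with a plain list instead of a real consumer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE processed (message_id INTEGER PRIMARY KEY, data TEXT)")


def apply_once(conn, message_id, data):
    """Return True if the effect was applied, False if it was already recorded."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO processed (message_id, data) VALUES (?, ?)",
                (message_id, data),
            )
        return True
    except sqlite3.IntegrityError:
        # The primary key already exists: this is a redelivery, safe to skip.
        return False


# Message 1 is delivered twice, as it would be after a crash before the offset commit.
deliveries = [(0, "message-0"), (1, "message-1"), (1, "message-1")]
outcomes = [apply_once(conn, mid, data) for mid, data in deliveries]
row_count = conn.execute("SELECT COUNT(*) FROM processed").fetchone()[0]
print(outcomes, row_count)
```

In a real consumer loop the offset would be committed in both cases, since a False return means the effect is already in the database.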

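A common pitfall is misunderstanding the scope of enable_idempotence. It applies to a single producer session writing to a single partition. Multiple producer instances writing to the same topic/partition don’t share idempotence state, and a producer that restarts without a transactional ID gets a new producer ID and loses its state too. For atomic guarantees that span partitions and survive producer restarts, Redpanda supports the Kafka transactional API. The producer is configured with a transactional ID, a Transaction Coordinator on the broker side tracks each transaction, and the client makes explicit calls: init_transactions once at startup, then begin_transaction, send_offsets_to_transaction, and commit_transaction or abort_transaction for each transaction (these are the Python client spellings; the Java client uses initTransactions, beginTransaction, and so on). This allows atomic writes across multiple partitions and commits consumer offsets atomically with producer writes. Downstream consumers need isolation.level=read_committed to skip records from aborted transactions.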
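A consume-transform-produce step under the transactional API might look like the sketch below. It assumes objects that follow the confluent-kafka Python client's method names (a different client from the kafka-python one used earlier); transform_one, output_topic, and transform are hypothetical names, and the client objects are passed in so the function itself imports nothing.

```python
def transform_one(consumer, producer, output_topic, transform, timeout=1.0):
    """Consume one message and publish its transform in a single transaction.

    Assumes confluent-kafka style objects: `producer` was created with a
    'transactional.id' and has already called init_transactions(); `consumer`
    was created with 'enable.auto.commit': False.
    Returns True if a transaction committed, False if nothing was consumed.
    """
    msg = consumer.poll(timeout)
    if msg is None or msg.error():
        return False

    producer.begin_transaction()
    try:
        producer.produce(output_topic, value=transform(msg.value()))
        # Attach the consumer's current positions to the transaction so the
        # output record and the input offsets commit (or abort) together.
        producer.send_offsets_to_transaction(
            consumer.position(consumer.assignment()),
            consumer.consumer_group_metadata(),
        )
        producer.commit_transaction()
        return True
    except Exception:
        # Discard the output record; the input offsets stay uncommitted.
        # The caller should seek back or restart before polling again.
        producer.abort_transaction()
        raise
```

Because the offsets travel inside the transaction, there is no separate consumer.commit() call: either the output record and the new offsets both become visible, or neither does.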

The most surprising aspect for many is that "exactly-once" in distributed systems is almost always a misnomer for "effectively-once" or "idempotent." The system guarantees that if an operation succeeds and is committed, it will have the same effect as if it happened only once, even if the underlying network or broker experiences failures that cause retries.

The next step in mastering Redpanda’s guarantees is exploring the full transactional API for atomic multi-partition writes and consumer offset commits.
