Pulsar doesn’t actually "rebalance" topic bundles; it distributes them based on a weighted algorithm that aims for even resource utilization across brokers.

Let’s see this in action. Imagine you have a Pulsar cluster with three brokers: broker-1, broker-2, and broker-3. You’ve created a topic named persistent://my-tenant/my-namespace/my-topic.

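When the topic is first created, Pulsar does not assign it to a broker directly. Each namespace’s topic-name hash space is divided into ranges that Pulsar calls "bundles." A topic belongs to the bundle whose range contains the hash of its name, and each bundle is assigned to one of the brokers. The messages themselves are stored in BookKeeper rather than in the bundle, so a bundle is a unit of ownership, not of storage. If you have many topics, their bundles will be distributed across your brokers.

Here’s what the resulting topic ownership might look like right after a few topics are created: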

{
  "my-tenant/my-namespace/my-topic-1": "broker-1",
  "my-tenant/my-namespace/my-topic-2": "broker-2",
  "my-tenant/my-namespace/my-topic-3": "broker-3"
}
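
The topic-to-bundle step behind a listing like this can be sketched in a few lines. This is a minimal illustration, not Pulsar’s code: it assumes the namespace has four equal bundles over a 32-bit hash space, and it uses zlib.crc32 as a stand-in hash. The function names are hypothetical.

```python
import bisect
import zlib


def even_boundaries(num_bundles):
    """Boundaries that cut the 32-bit hash space into equal ranges."""
    step = 0x100000000 // num_bundles
    return [i * step for i in range(num_bundles)] + [0xFFFFFFFF]


def bundle_for_hash(h, boundaries):
    """Name of the range holding hash h; the last range includes its upper bound."""
    idx = min(bisect.bisect_right(boundaries, h) - 1, len(boundaries) - 2)
    return f"0x{boundaries[idx]:08x}_0x{boundaries[idx + 1]:08x}"


def bundle_for_topic(topic, boundaries):
    # Assumption: crc32 stands in for whatever hash the broker really uses.
    return bundle_for_hash(zlib.crc32(topic.encode("utf-8")), boundaries)


boundaries = even_boundaries(4)
for name in ("my-topic-1", "my-topic-2", "my-topic-3"):
    topic = f"persistent://my-tenant/my-namespace/{name}"
    print(name, "->", bundle_for_topic(topic, boundaries))
```

The point of the sketch is that a topic’s bundle is a pure function of its name and the namespace’s current boundaries, so any broker or client can compute it without coordination.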

Now, let’s say my-topic-1 becomes incredibly popular. It starts receiving a massive volume of messages, and its load on broker-1 increases significantly. Meanwhile, broker-2 and broker-3 are relatively idle. Pulsar’s internal load balancer detects this imbalance.

When a bundle becomes too heavily utilized, Pulsar might split it. This split doesn’t happen instantly; the load balancer checks bundles periodically and splits one when a metric such as the number of topics in the bundle, the number of producer and consumer sessions, the message rate, or the bandwidth crosses its configured threshold.
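
As a rough sketch, the trigger can be modeled as a threshold check over per-bundle statistics. The class names, field names, and limit values below are illustrative assumptions rather than Pulsar’s internals; the rule that a bundle needs at least two topics to be worth splitting is also an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class BundleStats:
    topics: int
    sessions: int        # producers + consumers attached to the bundle
    msg_rate: float      # messages per second, in + out
    bandwidth_mb: float  # megabytes per second, in + out


@dataclass
class SplitLimits:
    # Illustrative thresholds; the real ones come from broker.conf.
    max_topics: int = 1000
    max_sessions: int = 1000
    max_msg_rate: float = 30000
    max_bandwidth_mb: float = 100


def split_reasons(stats, limits=None):
    """List every threshold that the bundle currently exceeds."""
    limits = limits or SplitLimits()
    checks = [
        ("topics", stats.topics, limits.max_topics),
        ("sessions", stats.sessions, limits.max_sessions),
        ("msg_rate", stats.msg_rate, limits.max_msg_rate),
        ("bandwidth_mb", stats.bandwidth_mb, limits.max_bandwidth_mb),
    ]
    return [name for name, value, limit in checks if value > limit]


def should_split(stats, limits=None):
    # A bundle holding a single topic cannot shed load by splitting.
    return stats.topics > 1 and bool(split_reasons(stats, limits))


hot = BundleStats(topics=12, sessions=300, msg_rate=45000, bandwidth_mb=60)
print(should_split(hot), split_reasons(hot))
```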

Let’s say the bundle that serves my-topic-1 covers the hash range 0x00000000_0x40000000 and is split at its midpoint into two new bundles: 0x00000000_0x20000000 and 0x20000000_0x40000000. Bundle boundaries are 32-bit hash values, and each topic lands in exactly one range, so a split relieves a broker only when the bundle carries more than one topic. Pulsar will then try to assign these new bundles to brokers that have available capacity.

The distribution might then look like this:

{
  "my-tenant/my-namespace/0x00000000_0x20000000": "broker-1",
  "my-tenant/my-namespace/0x20000000_0x40000000": "broker-2",
  "my-tenant/my-namespace/my-topic-2": "broker-2",
  "my-tenant/my-namespace/my-topic-3": "broker-3"
}

Notice how one of the split bundles moved to broker-2. The topics that hash into that range are now served there, which evens out the load. This automatic redistribution is the core of how Pulsar manages topic distribution.
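
The split-and-reassign step can be sketched as follows, under two assumptions: a bundle is split at the midpoint of its hash range, and each resulting bundle goes to whichever broker is least loaded at that moment. The greedy placement and all load numbers are hypothetical; Pulsar’s actual placement strategy is pluggable and more involved.

```python
def split_range(lo, hi):
    """Split a hash range at its midpoint into two adjacent ranges."""
    mid = lo + (hi - lo) // 2
    return (lo, mid), (mid, hi)


def bundle_name(rng):
    return f"0x{rng[0]:08x}_0x{rng[1]:08x}"


def assign(bundle_loads, broker_loads):
    """Greedy placement: heaviest bundle first, onto the least-loaded broker."""
    loads = dict(broker_loads)  # copy so the caller's view is untouched
    placement = {}
    for bundle, load in sorted(bundle_loads.items(), key=lambda kv: -kv[1]):
        target = min(loads, key=loads.get)
        placement[bundle] = target
        loads[target] += load
    return placement


low, high = split_range(0x00000000, 0x40000000)
# Hypothetical load units: what each half carries, and what each broker
# carries once the original hot bundle has been taken off broker-1.
bundle_loads = {bundle_name(low): 40, bundle_name(high): 40}
broker_loads = {"broker-1": 10, "broker-2": 20, "broker-3": 25}
print(assign(bundle_loads, broker_loads))
```

With these numbers the first half stays on broker-1 and the second half lands on broker-2, matching the listing above.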

The problem this mechanism solves is a single broker becoming a bottleneck. An overwhelmed broker can cause increased latency for producers and consumers, publish timeouts, or even broker instability. By distributing topic bundles and allowing them to split and migrate, Pulsar aims to maintain even load across the cluster, providing consistent performance and high availability.

The key levers you control are primarily configuration. The loadBalancer settings in your Pulsar broker configuration file (broker.conf) are crucial. loadBalancerBrokerOverloadedThresholdPercentage defines when Pulsar considers a broker overloaded and starts unloading bundles from it, loadBalancerBrokerMaxTopics limits how many topics a broker is assigned, and the loadBalancerNamespaceBundleMax* family (MaxTopics, MaxSessions, MaxMsgRate, MaxBandwidthMbytes) defines when a bundle is split. You can also intervene manually by restarting a broker, which releases its bundles, or by using the Pulsar Admin API to split or unload a specific bundle (pulsar-admin namespaces split-bundle and pulsar-admin namespaces unload), though this is generally not recommended for day-to-day operations.
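
For orientation, here is how the load-balancer section of broker.conf might look. Treat it as a sketch: the parameter names are the ones found in recent Pulsar releases as far as I know, and the values are illustrative, so check both against the broker.conf shipped with your version before relying on them.

```properties
# Master switches for load balancing, shedding, and automatic bundle splits
loadBalancerEnabled=true
loadBalancerSheddingEnabled=true
loadBalancerAutoBundleSplitEnabled=true
# Unload freshly split bundles so they can be reassigned to other brokers
loadBalancerAutoUnloadSplitBundlesEnabled=true

# Resource usage (percent) above which a broker counts as overloaded
loadBalancerBrokerOverloadedThresholdPercentage=85

# Per-bundle thresholds that trigger a split
loadBalancerNamespaceBundleMaxTopics=1000
loadBalancerNamespaceBundleMaxSessions=1000
loadBalancerNamespaceBundleMaxMsgRate=30000
loadBalancerNamespaceBundleMaxBandwidthMbytes=100

# Upper bound on how many bundles a namespace may be split into
loadBalancerNamespaceMaximumBundles=128
```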

What most people don’t realize is that Pulsar’s load balancing isn’t just about moving entire topics, and that it moves no stored data at all. Bundles do not hold messages: the data lives in BookKeeper ledgers, and a bundle only records which broker serves a range of topic-name hashes. When a bundle is split, Pulsar divides its hash range in two and, if automatic unloading of split bundles is enabled, unloads the resulting bundles. The topics in them are closed on the old broker, their clients are disconnected, and the clients’ next lookup directs them to whichever broker now owns the bundle. Producers carry on publishing, and consumers resume from their last acknowledged position. Moving ownership instead of data is a critical performance optimization, as it avoids a massive, synchronous data copy operation.
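
A toy model makes the hand-off concrete. It assumes that messages sit in shared storage, that only bundle ownership moves between brokers, and that an unowned bundle is claimed by the least-loaded broker on the next lookup. The Cluster class and its methods are hypothetical and exist only for this illustration.

```python
class Cluster:
    def __init__(self, brokers):
        self.load = {b: 0 for b in brokers}  # bundles owned per broker
        self.owner = {}    # bundle -> broker
        self.ledger = {}   # bundle -> stored messages (shared storage)
        self.cursor = {}   # bundle -> index of next unacknowledged message

    def assign(self, bundle, broker):
        self.owner[bundle] = broker
        self.load[broker] += 1

    def lookup(self, bundle):
        """Return the owner, electing the least-loaded broker if unowned."""
        if bundle not in self.owner:
            self.assign(bundle, min(self.load, key=self.load.get))
        return self.owner[bundle]

    def unload(self, bundle):
        """Release ownership; stored messages and cursors are untouched."""
        self.load[self.owner.pop(bundle)] -= 1


cluster = Cluster(["broker-1", "broker-2"])
kept, moved = "0x00000000_0x20000000", "0x20000000_0x40000000"
cluster.assign(kept, "broker-1")
cluster.assign(moved, "broker-1")

cluster.ledger[moved] = ["m0", "m1", "m2"]
cluster.cursor[moved] = 2            # the consumer has acknowledged m0 and m1

cluster.unload(moved)                # clients are disconnected here
new_owner = cluster.lookup(moved)    # the reconnecting client triggers a lookup
resume_from = cluster.ledger[moved][cluster.cursor[moved]]
print(new_owner, resume_from)
```

Nothing in ledger or cursor changes during the unload; only the owner map does, which is why the move is cheap.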

The next concept you’ll likely encounter is understanding how Pulsar handles durable storage and replication for these distributed bundles, especially when dealing with failures.
