Pulsar exposes per-topic metrics through Prometheus, but the default configuration often hides this granular data, leading people to believe it’s not available.

Let’s see what this looks like in practice. Imagine we have a Pulsar cluster running with two topics: persistent://public/default/topic-a and persistent://public/default/topic-b. Without proper configuration, a Prometheus scrape of the Pulsar broker might only show aggregate metrics like pulsar_broker_topics_total. That metric tells you how many topics the broker is serving, but nothing about which topic is carrying the traffic or how any single topic is performing.

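To get per-topic metrics, two things have to be true: the broker must be told to emit topic-level metrics, and Prometheus must scrape the endpoint where those metrics are served. This walkthrough uses a broker-metrics flag and a JMX exporter endpoint on port 7000. Setting names vary by Pulsar version and distribution. Stock Apache Pulsar, for example, controls topic-level output with exposeTopicLevelMetricsInPrometheus in broker.conf and serves metrics at /metrics on the broker's web service port (8080 by default), so check the configuration reference for your version before copying the keys used here.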

Here’s a typical broker.conf snippet you’d find on a Pulsar broker:

# Default JMX exporter configuration
jmx.exporter.port=7000
jmx.exporter.host=localhost

This configures where the JMX exporter listens, but it does not turn on per-topic metrics. Those have to be enabled explicitly in broker.conf through the broker-metrics setting, which tells the broker to expose more detailed metrics, including those tied to specific topics.

Add or modify these lines in broker.conf:

# Enable per-topic metrics
broker-metrics.enabled=true
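# JMX Exporter configuration (ensure this is present and correct)
jmx.exporter.port=7000
# Bind to an address Prometheus can reach; localhost only accepts scrapes from the broker's own machine
jmx.exporter.host=0.0.0.0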
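
Before restarting anything, it can help to confirm that the file on disk really contains the flag. The following is a minimal Python sketch, assuming broker.conf uses plain key=value lines with # comments as in the snippet above. The helper names parse_properties and topic_metrics_enabled are hypothetical, and the flag name is the one used in this article.

```python
def parse_properties(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def topic_metrics_enabled(settings, flag="broker-metrics.enabled"):
    """Return True only if the flag is present and set to 'true'.

    The default flag name is taken from this article; pass a different
    one if your Pulsar version uses another key.
    """
    return settings.get(flag, "").lower() == "true"


# Illustrative file content; in practice read the broker's real broker.conf.
sample = """\
# Enable per-topic metrics
broker-metrics.enabled=true
jmx.exporter.port=7000
"""

settings = parse_properties(sample)
print(topic_metrics_enabled(settings))  # True
print(settings["jmx.exporter.port"])    # 7000
```

A missing or commented-out flag yields False here, which matches the most common cause of absent per-topic metrics.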

After restarting the Pulsar broker(s) with this configuration, the JMX exporter will start exposing metrics for individual topics. Your Prometheus server, configured to scrape http://<broker_host>:7000/metrics, will now receive data like:

  • pulsar_broker_topic_message_in_total{topic="persistent://public/default/topic-a"}
  • pulsar_broker_topic_message_out_total{topic="persistent://public/default/topic-a"}
  • pulsar_broker_topic_bytes_in_total{topic="persistent://public/default/topic-a"}
  • pulsar_broker_topic_bytes_out_total{topic="persistent://public/default/topic-a"}
  • pulsar_broker_topic_storage_size{topic="persistent://public/default/topic-a"}

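And similarly for topic-b and any other topics in your cluster. Each broker reports only the topics it is currently serving, so the series for a given topic come from one broker at a time and can move to another broker when the topic is reassigned.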
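
To see at a glance which topics a scrape actually covers, the response can be grouped by its topic label. This is a rough Python sketch that handles only the simple "name{labels} value" lines of the Prometheus text format. The metric names are the ones listed above, the sample values are invented for illustration, and group_by_topic is a hypothetical helper.

```python
import re
from collections import defaultdict

# name, optional {label block}, then the sample value
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# label="value" pairs, allowing backslash-escaped characters in the value
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def group_by_topic(exposition_text):
    """Return {topic: {metric_name: value}} for samples that carry a topic label."""
    by_topic = defaultdict(dict)
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if not match:
            continue
        name, label_blob, value = match.groups()
        labels = dict(LABEL_RE.findall(label_blob or ""))
        if "topic" in labels:
            by_topic[labels["topic"]][name] = float(value)
    return dict(by_topic)


# Invented scrape output, for illustration only.
SAMPLE = """\
# TYPE pulsar_broker_topic_message_in_total counter
pulsar_broker_topics_total 2
pulsar_broker_topic_message_in_total{topic="persistent://public/default/topic-a"} 1200
pulsar_broker_topic_message_in_total{topic="persistent://public/default/topic-b"} 35
pulsar_broker_topic_storage_size{topic="persistent://public/default/topic-a"} 52428800
"""

for topic, metrics in sorted(group_by_topic(SAMPLE).items()):
    print(topic, metrics)
```

An empty result from a non-empty scrape means the endpoint is reachable but only aggregate metrics are being exported.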

The Prometheus configuration needs to point to the JMX exporter endpoint on each broker. If you’re using a prometheus.yml file, it would look something like this:

scrape_configs:
  - job_name: 'pulsar-brokers'
    static_configs:
      - targets:
          - 'broker1.example.com:7000'
          - 'broker2.example.com:7000'
          - 'broker3.example.com:7000'
    metrics_path: /metrics
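
Prometheus combines each target with the scheme (http by default) and metrics_path (default /metrics) to form the URL it requests. The Python sketch below reproduces that for the targets above so the same URLs can be tested by hand, and flags targets whose port does not match the exporter port. Both helper names are hypothetical.

```python
def scrape_urls(targets, metrics_path="/metrics", scheme="http"):
    """Build the URL Prometheus would request for each static target."""
    path = metrics_path if metrics_path.startswith("/") else "/" + metrics_path
    return [f"{scheme}://{target}{path}" for target in targets]


def unexpected_ports(targets, expected_port):
    """Return targets whose port is missing or differs from expected_port."""
    mismatched = []
    for target in targets:
        _host, sep, port = target.rpartition(":")
        if not sep or port != str(expected_port):
            mismatched.append(target)
    return mismatched


targets = [
    "broker1.example.com:7000",
    "broker2.example.com:7000",
    "broker3.example.com:7000",
]

for url in scrape_urls(targets):
    print(url)

print(unexpected_ports(targets, 7000))  # []
```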

Common Causes of Missing Per-Topic Metrics:

  1. broker-metrics.enabled is false or missing: This is the most direct cause. The broker simply isn’t instructed to generate granular topic metrics.

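    • Diagnosis: Check broker.conf on every broker for the broker-metrics.enabled line. A line that is missing, commented out, or set to false all produce the same symptom.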
    • Fix: Set broker-metrics.enabled=true and restart the broker.
    • Why it works: This flag is the primary switch for enabling the detailed topic metric generation within the broker’s internal metrics reporters.
  2. JMX Exporter Not Running or Misconfigured: The broker might be running, but the JMX exporter, which is responsible for translating JMX metrics into HTTP endpoints Prometheus can scrape, might be disabled, on the wrong port, or not bound correctly.

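    • Diagnosis: Verify the JMX exporter port (default 7000) is open and accessible from your Prometheus server. Check broker logs for JMX exporter startup errors. Running curl http://<broker_host>:7000/metrics from the Prometheus server should return data; if it only works when run on the broker itself, the exporter is bound to localhost.
    • Fix: Ensure jmx.exporter.port is set correctly in broker.conf, that jmx.exporter.host is an address reachable from Prometheus rather than localhost, and that the port is not blocked by a firewall. Restart the broker.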
    • Why it works: The JMX exporter is the bridge between Pulsar’s internal JMX metrics and Prometheus’s HTTP scraping mechanism. If it’s down, no metrics are exposed.
  3. Prometheus Scrape Configuration Incorrect: Prometheus might be configured to scrape the wrong port, hostname, or path for the JMX exporter.

    • Diagnosis: Review your prometheus.yml file. Check the targets list for the Pulsar job and ensure the ports match the jmx.exporter.port in broker.conf.
    • Fix: Correct the targets in prometheus.yml to point to the correct broker IPs/hostnames and the JMX exporter port (e.g., broker1.example.com:7000). Reload Prometheus configuration.
    • Why it works: Prometheus needs the correct address to know where to fetch metrics from.
  4. Network Issues Between Prometheus and Brokers: Even if configured correctly, network connectivity problems can prevent Prometheus from reaching the JMX exporter endpoints.

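    • Diagnosis: Use ping or traceroute from the Prometheus server to the broker IPs to confirm the hosts are reachable. Those tools do not prove the port is open, so also test the TCP port itself, for example with curl http://<broker_host>:7000/metrics. Check firewall rules on both the Prometheus server and the broker nodes.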
    • Fix: Open necessary ports (e.g., 7000) in firewalls between Prometheus and the brokers. Ensure DNS resolution is working if using hostnames.
    • Why it works: Data cannot flow if the network path is blocked or broken.
  5. Broker Not Fully Started or Metrics Reporter Uninitialized: In rare cases, a broker might start but fail to initialize its metrics reporters correctly, leading to no metrics being exposed via JMX.

    • Diagnosis: Examine the broker logs (server.log) for any errors related to JMX, metrics, or reporter initialization during startup.
    • Fix: Address any errors found in the broker logs. This might involve checking Java version compatibility, JVM arguments, or Pulsar’s internal state. Restart the broker.
    • Why it works: The metrics reporters are internal components; if they fail to start, they cannot collect or expose data.
  6. Resource Constraints on Broker: If a broker is severely resource-constrained (CPU, memory), the JMX exporter process or the metrics reporting threads might be slow to respond or even crash, leading to intermittent or absent metrics.

    • Diagnosis: Monitor CPU and memory usage on the broker nodes. Check for OOM errors in system logs or broker logs.
    • Fix: Increase resources allocated to the broker or optimize its workload.
    • Why it works: Components need adequate resources to function correctly; starvation causes failures.
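
The first few causes can be told apart from a single request to the endpoint: no response points to the exporter, the scrape target, or the network; a response without topic-labelled series points to the broker flag. The Python sketch below is one way to encode that triage, run from the Prometheus host. The function names and returned messages are hypothetical, and port 7000 is the value assumed throughout this article.

```python
import re
import urllib.request

# Matches a topic label inside a label block, e.g. {topic="..."} or ,topic="..."
TOPIC_LABEL_RE = re.compile(r'[{,]\s*topic="')


def fetch_metrics(url, timeout=5.0):
    """Return the response body, or None if the endpoint cannot be reached."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace")
    except OSError:
        # Covers URLError/HTTPError, refused connections, DNS failures, timeouts.
        return None


def diagnose(body):
    """Map a scrape body (or None) to the most likely cause from the list above."""
    if body is None:
        return "unreachable: check exporter, scrape target, and network (causes 2-4)"
    samples = [
        line for line in body.splitlines()
        if line.strip() and not line.startswith("#")
    ]
    if not samples:
        return "empty: endpoint answers but exports nothing (causes 5-6)"
    if not any(TOPIC_LABEL_RE.search(line) for line in samples):
        return "aggregate only: per-topic metrics are not enabled (cause 1)"
    return "ok: per-topic series present"


# Offline examples with invented bodies.
print(diagnose(None))
print(diagnose("pulsar_broker_topics_total 2\n"))
print(diagnose('pulsar_broker_topic_storage_size{topic="persistent://public/default/topic-a"} 1\n'))
```

Against a live broker, the call would be diagnose(fetch_metrics("http://<broker_host>:7000/metrics")) with the real host substituted.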

Once per-topic metrics are flowing, you’ll be able to build dashboards in Grafana (or your preferred visualization tool) that show message rates, storage usage, and other vital statistics for each individual topic, allowing for much finer-grained operational insight and troubleshooting.
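
The message rates on such a dashboard are derived from the _total counters rather than read directly: a rate is the increase between two samples divided by the time between them. In PromQL the rate() function does this for you. The Python sketch below shows the underlying arithmetic in simplified form, with invented sample values and a hypothetical helper name; the real rate() also extrapolates over the query window.

```python
def per_second_rate(earlier, later):
    """Compute a per-second rate from two (unix_seconds, counter_value) samples."""
    (t0, v0), (t1, v1) = earlier, later
    elapsed = t1 - t0
    if elapsed <= 0:
        raise ValueError("later sample must have a newer timestamp")
    # A counter that went down was reset (for example by a broker restart),
    # so count the increase from zero instead of producing a negative rate.
    increase = v1 - v0 if v1 >= v0 else v1
    return increase / elapsed


# Two invented samples of a topic's message-in counter, 60 seconds apart.
print(per_second_rate((0, 1200), (60, 1800)))  # 10.0

# The same interval across a counter reset.
print(per_second_rate((0, 1200), (60, 300)))   # 5.0
```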

The next hurdle you’ll likely encounter is understanding how to use these per-topic metrics to identify and diagnose specific performance bottlenecks, such as a single topic saturating a broker’s resources.
