Prometheus Exemplars let you jump directly from a metric’s time series to the specific trace that generated it, cutting through the noise of aggregated data.

Let’s see it in action. Imagine a spike in your http_requests_total metric for a particular endpoint, say /api/v1/users, accompanied by a rise in latency visible in http_request_duration_seconds_bucket. Normally you’d see only the aggregated count and latency distribution. With exemplars enabled, Grafana overlays individual sample points on the graph, and each point carries the trace ID of one request that contributed to it.

Exemplars are not switched on per scrape job. Prometheus stores them only when it is started with the exemplar-storage feature flag, and the size of the in-memory exemplar buffer is set under storage in the configuration file. Here’s a simplified setup:

# Start Prometheus with exemplar storage enabled:
#   prometheus --config.file=prometheus.yml --enable-feature=exemplar-storage

storage:
  exemplars:
    # Maximum number of exemplars kept in the in-memory circular buffer
    max_exemplars: 100000

scrape_configs:
  # No exemplar-specific scrape settings are needed. The target only has to
  # serve its metrics in the OpenMetrics format, which can carry exemplars.
  - job_name: 'my-app'
    static_configs:
      - targets: ['localhost:8080']

And here’s how your application (instrumented with OpenTelemetry, for example) would expose metrics with trace context:

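import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"go.opentelemetry.io/otel/trace"
)

// requestCounter must be registered and served in the OpenMetrics format,
// e.g. promhttp.HandlerFor(reg, promhttp.HandlerOpts{EnableOpenMetrics: true}).
var requestCounter = prometheus.NewCounter(prometheus.CounterOpts{
	Name: "http_requests_total",
	Help: "Total number of HTTP requests handled.",
})

// Assuming you have an HTTP handler wrapped by tracing middleware
func usersHandler(w http.ResponseWriter, r *http.Request) {
	// Get the current trace context
	spanContext := trace.SpanFromContext(r.Context()).SpanContext()

	// Record the metric with the trace ID as an exemplar. Only sampled
	// requests get one, so an exemplar is likely to point at an exported trace.
	// (If you use the OpenTelemetry metrics SDK instead, its Prometheus
	// exporter may attach exemplars for you.)
	adder, ok := requestCounter.(prometheus.ExemplarAdder)
	if ok && spanContext.IsSampled() {
		adder.AddWithExemplar(1, prometheus.Labels{"trace_id": spanContext.TraceID().String()})
	} else {
		requestCounter.Inc()
	}

	// ... rest of your handler logic ...
}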

The magic happens when Prometheus scrapes these metrics. When the target serves the OpenMetrics format, each exemplar arrives attached to a specific sample and carries the trace ID plus any other labels the instrumentation added (like span_id). Prometheus keeps these exemplars in a fixed-size in-memory buffer alongside the metric data. When you query Prometheus from Grafana with exemplars turned on for the query, they appear as markers on the graph. Hovering over a marker shows the exemplar’s labels, and if the Prometheus data source is configured to link the trace ID label to a tracing data source (like Jaeger or Tempo), you can jump straight to the trace.
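To make the scrape step concrete, here is a minimal sketch of pulling the exemplar out of a single OpenMetrics sample line, where the exemplar follows the sample after " # ". The parseExemplar function and the Exemplar type are hypothetical names for illustration, and the parser is deliberately naive: it assumes label values contain no commas, escaped quotes, or " # " sequences, which a real OpenMetrics parser has to handle.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Exemplar holds the parts of an OpenMetrics exemplar that matter for trace lookup.
type Exemplar struct {
	Labels map[string]string
	Value  float64
}

// parseExemplar extracts the exemplar from one OpenMetrics sample line of the form
//   metric{labels} value # {exemplar_labels} exemplar_value [timestamp]
// It returns false when the line carries no exemplar or cannot be parsed.
func parseExemplar(line string) (Exemplar, bool) {
	// Everything after " # " belongs to the exemplar.
	_, rest, found := strings.Cut(line, " # ")
	if !found || !strings.HasPrefix(rest, "{") {
		return Exemplar{}, false
	}
	// Split the exemplar label set from its value and optional timestamp.
	labelPart, valuePart, found := strings.Cut(rest[1:], "} ")
	if !found {
		return Exemplar{}, false
	}
	ex := Exemplar{Labels: map[string]string{}}
	for _, pair := range strings.Split(labelPart, ",") {
		name, quoted, ok := strings.Cut(pair, "=")
		if !ok {
			continue // empty label set or malformed pair
		}
		ex.Labels[name] = strings.Trim(quoted, `"`)
	}
	fields := strings.Fields(valuePart)
	if len(fields) == 0 {
		return Exemplar{}, false
	}
	value, err := strconv.ParseFloat(fields[0], 64)
	if err != nil {
		return Exemplar{}, false
	}
	ex.Value = value
	return ex, true
}

func main() {
	line := `http_request_duration_seconds_bucket{le="0.5"} 12 # {trace_id="4bf92f3577b34da6a3ce929d0e0e4736"} 0.43 1700000000.123`
	if ex, ok := parseExemplar(line); ok {
		fmt.Println(ex.Labels["trace_id"], ex.Value)
	}
}
```

The sample line is made up, but it shows the key point: the exemplar is just a small label set and a value riding along with one sample, so the trace ID reaches Prometheus as ordinary scraped text.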

The core problem exemplars solve is the "needle in a haystack" scenario when debugging performance issues. Before exemplars, you’d see a blip in a metric and have to manually search through potentially millions of trace IDs to find the one that corresponds to that specific metric observation. Exemplars eliminate this manual correlation. They are like a direct hyperlink from an aggregated metric to the granular, contextual information within a distributed trace.

The most surprising thing about Prometheus exemplars is how little the server has to do. Apart from starting Prometheus with --enable-feature=exemplar-storage, there is nothing exemplar-specific to configure in your scrape jobs. The heavy lifting is done by the application instrumentation and the Prometheus client libraries, which capture the trace context and attach it as an exemplar to the metric sample at the point of observation. Prometheus itself simply acts as a conduit, storing and serving this exemplar data.
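The "serving" side can be sketched as well. Prometheus exposes stored exemplars through its /api/v1/query_exemplars HTTP endpoint, and the following example decodes a response body of that shape and collects the trace IDs. The traceIDs helper, the struct name, and the sample payload are assumptions made for illustration; the label name trace_id depends on what your instrumentation emits.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// exemplarResponse mirrors the fields of a query_exemplars response used here.
type exemplarResponse struct {
	Status string `json:"status"`
	Data   []struct {
		SeriesLabels map[string]string `json:"seriesLabels"`
		Exemplars    []struct {
			Labels    map[string]string `json:"labels"`
			Value     string            `json:"value"`
			Timestamp float64           `json:"timestamp"`
		} `json:"exemplars"`
	} `json:"data"`
}

// traceIDs returns every value of the given exemplar label found in the response body.
func traceIDs(body []byte, label string) ([]string, error) {
	var resp exemplarResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, err
	}
	if resp.Status != "success" {
		return nil, fmt.Errorf("query failed with status %q", resp.Status)
	}
	var ids []string
	for _, series := range resp.Data {
		for _, ex := range series.Exemplars {
			if id, ok := ex.Labels[label]; ok {
				ids = append(ids, id)
			}
		}
	}
	return ids, nil
}

func main() {
	// Hand-written sample payload, not output captured from a real server.
	body := []byte(`{"status":"success","data":[{"seriesLabels":{"__name__":"http_requests_total","job":"my-app"},"exemplars":[{"labels":{"trace_id":"4bf92f3577b34da6a3ce929d0e0e4736"},"value":"1","timestamp":1700000000.123}]}]}`)
	ids, err := traceIDs(body, "trace_id")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(ids)
}
```

In practice Grafana makes this request for you; decoding it by hand is mostly useful for checking that exemplars are actually being stored.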

Prometheus also plays no part in trace context propagation. Headers such as W3C Trace Context (traceparent, tracestate) and B3 (x-b3-traceid, x-b3-spanid, x-b3-sampled, x-b3-flags, x-b3-parentspanid) travel between your services and are read by the tracing instrumentation inside your application, never by Prometheus. By the time a scrape happens, the trace ID has already been copied out of the span context and into an exemplar label, so any propagation format works as long as your tracing library understands it. What you do need to agree on is the exemplar label name (for example trace_id), because Grafana uses it to build the link to the trace.
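For a feel of where the trace ID comes from, here is a minimal sketch of reading it out of a W3C traceparent header, which has the form version-traceid-parentid-flags. In a real service your tracing library does this for you; parseTraceparent is a hypothetical helper, and it is more lenient than the specification (for instance, it does not reject uppercase hex).

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

// parseTraceparent extracts the trace ID and the sampled flag from a W3C
// traceparent header value such as
//   00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
func parseTraceparent(header string) (string, bool, error) {
	parts := strings.Split(header, "-")
	if len(parts) < 4 || len(parts[1]) != 32 || len(parts[3]) != 2 {
		return "", false, errors.New("malformed traceparent header")
	}
	// The trace ID must be 32 hex characters and must not be all zeros.
	if _, err := hex.DecodeString(parts[1]); err != nil {
		return "", false, errors.New("trace ID is not valid hex")
	}
	if parts[1] == strings.Repeat("0", 32) {
		return "", false, errors.New("trace ID is all zeros")
	}
	flags, err := hex.DecodeString(parts[3])
	if err != nil {
		return "", false, errors.New("trace flags are not valid hex")
	}
	// Bit 0 of the flags byte is the sampled flag.
	sampled := flags[0]&0x01 == 0x01
	return parts[1], sampled, nil
}

func main() {
	traceID, sampled, err := parseTraceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(traceID, sampled)
}
```

The sampled flag is worth checking before attaching an exemplar, since an exemplar for an unsampled request may point at a trace your backend never received.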

The next logical step after correlating a metric spike to a trace is understanding why that trace was slow, which often involves looking at service dependencies and network latency between services.
