OpenTelemetry metrics are designed to be a universal language for telemetry. What tends to surprise people is that their fundamental building blocks, the metric instruments, behave more like event streams than like the series stored in a traditional time-series database.

Let’s see this in action. Imagine we’re instrumenting a simple web service.

from opentelemetry import metrics
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import (
    ConsoleMetricExporter,
    PeriodicExportingMetricReader,
)
import time

# Initialize the MeterProvider. Metric readers are passed to the constructor.
reader = PeriodicExportingMetricReader(ConsoleMetricExporter())
provider = MeterProvider(metric_readers=[reader])
metrics.set_meter_provider(provider)

meter = metrics.get_meter(__name__)

# Counter: a monotonic instrument; add() only accepts non-negative increments.
request_counter = meter.create_counter(
    "http.server.requests",
    description="Number of HTTP requests received",
    unit="1"
)

# Gauge: a value that can arbitrarily go up and down.
# Note: the synchronous create_gauge() needs a recent opentelemetry-api release.
active_connections_gauge = meter.create_gauge(
    "http.server.active_connections",
    description="Number of active connections",
    unit="1"
)

# Histogram: Records observations, allowing calculation of distribution statistics.
request_duration_histogram = meter.create_histogram(
    "http.server.request_duration",
    description="Duration of HTTP requests",
    unit="s"
)

# Simulate receiving requests
for i in range(5):
    request_counter.add(1, {"method": "GET", "status_code": "200"})
    active_connections_gauge.set(i + 1, {"endpoint": "/users"})
    # time.time() % 1.0 is just a stand-in for a measured duration in [0, 1) seconds.
    request_duration_histogram.record(time.time() % 1.0, {"method": "GET", "endpoint": "/users"})
    time.sleep(0.5)

# Simulate connections dropping
for i in range(3):
    active_connections_gauge.set(2 - i, {"endpoint": "/users"})
    time.sleep(0.5)

# Explicitly flush metrics to see them immediately in the console exporter
reader.force_flush()

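When you run this, the console exporter prints the collected metrics as JSON. The listing below is a simplified, condensed view of that output: the real exporter nests metrics under resource_metrics and scope_metrics and names some fields differently, and your timestamps and values will differ from run to run.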

{
  "resource": {
    "attributes": []
  },
  "scope": {
    "name": "__main__",
    "version": "unknown"
  },
  "metrics": [
    {
      "name": "http.server.requests",
      "description": "Number of HTTP requests received",
      "unit": "1",
      "data": {
        "data_points": [
          {
            "attributes": {
              "method": "GET",
              "status_code": "200"
            },
            "time": "2023-10-27T10:30:00.123456Z",
            "value": 5
          }
        ],
        "is_monotonic": true,
        "temporality": "CUMULATIVE"
      }
    },
    {
      "name": "http.server.active_connections",
      "description": "Number of active connections",
      "unit": "1",
      "data": {
        "data_points": [
          {
            "attributes": {
              "endpoint": "/users"
            },
            "time": "2023-10-27T10:30:02.678901Z",
            "value": 0
          }
        ]
      }
    },
    {
      "name": "http.server.request_duration",
      "description": "Duration of HTTP requests",
      "unit": "s",
      "data": {
        "data_points": [
          {
            "attributes": {
              "method": "GET",
              "endpoint": "/users"
            },
            "count": 5,
            "sum": 2.345678,
            "min": 0.123456,
            "max": 0.987654,
            "bucket_counts": [0, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
            "explicit_bounds": [0, 5, 10, 25, 50, 75, 100, 250, 500, 750, 1000, 2500, 5000, 7500, 10000],
            "time": "2023-10-27T10:30:02.678901Z"
          }
        ],
        "temporality": "CUMULATIVE"
      }
    }
  ]
}

The core problem OpenTelemetry metrics solve is providing a standardized, vendor-neutral way to collect and export telemetry data. You instrument your code once with a single library and can then send the data to different metrics backends (Prometheus, Datadog, or anything that accepts OTLP) without being locked in to one vendor.

Internally, each metric instrument (Counter, Gauge, Histogram) is created by, and belongs to, a Meter. When you call methods like add(), set(), or record(), you’re not writing to a time-series database. Each call is a measurement event, and the SDK folds it straight into an in-memory aggregation that is kept per instrument and per distinct attribute set (like {"method": "GET", "status_code": "200"}). Nothing leaves the process until a MetricReader collects those aggregations. The PeriodicExportingMetricReader in our example does this on a timer (or when you force a flush) and passes the resulting data points to an exporter (like ConsoleMetricExporter) for printing or sending to a backend.
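To make that flow concrete, here is a toy model in plain Python. It is a sketch of the idea only, not the SDK's implementation: ToyCounter and ToyReader are hypothetical names invented for this illustration. It shows measurements being folded into one running sum per attribute set, and nothing reaching the exporter until the reader collects.

```python
from collections import defaultdict


class ToyCounter:
    """Hypothetical stand-in for an SDK counter (not the real implementation)."""

    def __init__(self, name):
        self.name = name
        # One running sum per distinct attribute set.
        self._sums = defaultdict(int)

    def add(self, amount, attributes=None):
        # Each call is one measurement; it is folded into the sum immediately.
        key = frozenset((attributes or {}).items())
        self._sums[key] += amount

    def collect(self):
        # One data point per attribute set seen so far.
        return [
            {"name": self.name, "attributes": dict(key), "value": total}
            for key, total in self._sums.items()
        ]


class ToyReader:
    """Hypothetical reader: pulls data points and hands them to an exporter."""

    def __init__(self, export):
        self._export = export
        self._instruments = []

    def register(self, instrument):
        self._instruments.append(instrument)

    def collect(self):
        points = [p for inst in self._instruments for p in inst.collect()]
        self._export(points)


exported = []  # plays the role of the exporter's destination
toy_reader = ToyReader(exported.extend)
counter = ToyCounter("http.server.requests")
toy_reader.register(counter)

counter.add(1, {"method": "GET", "status_code": "200"})
counter.add(1, {"method": "GET", "status_code": "200"})
counter.add(1, {"method": "POST", "status_code": "500"})
points_before_collect = len(exported)  # nothing has been exported yet

toy_reader.collect()
for point in exported:
    print(point)
```

Three add() calls produce two data points, because two of the calls share an attribute set. That is also why every new attribute value you introduce creates another series downstream.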

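The temporality field is crucial, and it applies to Counters and Histograms. With DELTA temporality, each exported point covers only the change since the previous export; with CUMULATIVE temporality, each point carries the running total since the instrument started. Which one you get depends on the SDK and the exporter: the Python SDK defaults to CUMULATIVE, and DELTA is something you opt into through the exporter's preferred_temporality setting. Gauges have no temporality at all; the exporter simply receives the last value recorded before the export.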
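The relationship between the two temporalities is plain arithmetic, which a few lines of Python can show. This is an illustrative sketch with made-up numbers; cumulative_to_delta and delta_to_cumulative are hypothetical helper names, not SDK functions.

```python
from itertools import accumulate


def cumulative_to_delta(totals):
    """Turn running totals (one per export) into per-interval changes."""
    deltas = []
    previous = 0
    for total in totals:
        deltas.append(total - previous)
        previous = total
    return deltas


def delta_to_cumulative(deltas):
    """Rebuild running totals from per-interval changes."""
    return list(accumulate(deltas))


# Request counter as seen at four successive exports (cumulative view).
totals_at_each_export = [5, 8, 8, 12]

deltas = cumulative_to_delta(totals_at_each_export)
print(deltas)                       # [5, 3, 0, 4]
print(delta_to_cumulative(deltas))  # [5, 8, 8, 12]
```

Both views contain the same information as long as no export is lost. Dropping one delta point loses those increments for good, while a dropped cumulative point is covered by the next one.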

The data_points carry the actual values. For a Counter, each point holds the aggregated sum for one attribute set. For a Gauge, it holds the last recorded value. For a Histogram, you get count (total observations), sum (total of observed values), min and max, and, importantly, bucket_counts based on explicit_bounds. bucket_counts always has one more entry than explicit_bounds, because the last bucket has no upper limit. The SDK fills these buckets for you. By default you don’t define histogram buckets when you create the instrument; they belong to the aggregation, and you can override the default boundaries with a View.
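Here is a small sketch of how a histogram data point can be built from raw observations. It is a simplified model rather than the SDK's code: aggregate_histogram is a hypothetical helper, the boundaries are the default explicit-bucket boundaries from the OpenTelemetry specification, and each bucket is treated as inclusive of its upper bound.

```python
from bisect import bisect_left

# Default explicit bucket boundaries from the OpenTelemetry specification.
DEFAULT_BOUNDS = (0, 5, 10, 25, 50, 75, 100, 250, 500,
                  750, 1000, 2500, 5000, 7500, 10000)


def aggregate_histogram(values, bounds=DEFAULT_BOUNDS):
    """Fold a non-empty list of observations into one histogram data point."""
    # One bucket per boundary, plus a final bucket with no upper limit.
    bucket_counts = [0] * (len(bounds) + 1)
    for value in values:
        # bisect_left finds the first boundary >= value (upper-inclusive buckets).
        bucket_counts[bisect_left(bounds, value)] += 1
    return {
        "count": len(values),
        "sum": sum(values),
        "min": min(values),
        "max": max(values),
        "bucket_counts": bucket_counts,
        "explicit_bounds": list(bounds),
    }


durations = [0.12, 0.35, 0.41, 0.49, 0.98]  # request durations in seconds

point = aggregate_histogram(durations)
print(point["bucket_counts"])

# Hypothetical finer boundaries, better suited to sub-second durations.
finer = aggregate_histogram(durations, bounds=(0.1, 0.25, 0.5, 1.0))
print(finer["bucket_counts"])
```

With the default boundaries, every sub-second duration lands in the same (0, 5] bucket, so the distribution is invisible. That is the usual reason to choose boundaries that match the unit and range you actually record.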

A subtle but critical aspect of how OpenTelemetry handles metrics is its reliance on aggregation. When you create an instrument, you’re not just creating a name; you’re also implicitly choosing how its measurements will be aggregated. By default, a Counter uses a Sum aggregation, a Histogram uses an explicit-bucket Histogram aggregation (which includes counts, sums, and buckets), and a Gauge uses a LastValue aggregation. You can override these defaults, most commonly by registering Views on the MeterProvider, and readers and exporters can also state a preferred aggregation per instrument type. Either way, aggregation happens inside the SDK, before any data is exported.
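The idea of "a default aggregation per instrument kind, with an optional override" can be sketched in a few lines. This is a toy model, loosely analogous to what a View does; DEFAULT_AGGREGATION, aggregate, and the aggregation functions are hypothetical names and not part of the OpenTelemetry API.

```python
def sum_aggregation(measurements):
    """Default for counters: add everything up."""
    return sum(measurements)


def last_value_aggregation(measurements):
    """Default for gauges: keep only the most recent measurement."""
    return measurements[-1]


# Hypothetical lookup table: instrument kind -> default aggregation.
DEFAULT_AGGREGATION = {
    "counter": sum_aggregation,
    "gauge": last_value_aggregation,
}


def aggregate(kind, measurements, override=None):
    """Apply the override if one is given, else the default for this kind."""
    aggregation = override or DEFAULT_AGGREGATION[kind]
    return aggregation(measurements)


measurements = [3, 1, 4, 1, 5]

print(aggregate("counter", measurements))  # 14
print(aggregate("gauge", measurements))    # 5
# Same measurement stream, different result once the aggregation is overridden.
print(aggregate("counter", measurements, override=last_value_aggregation))  # 5
```

The stream of measurements is identical in all three calls. What gets exported depends entirely on the aggregation attached to the instrument.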

The next concept you’ll likely grapple with is how to manage these metric instruments across different services and how to define consistent naming conventions and attribute cardinalities to avoid overwhelming your telemetry backend.
