The real power of the OpenTelemetry data model isn’t in collecting data; it’s in representing shared context across vastly different telemetry types.
Let’s look at a typical distributed tracing scenario. Imagine a user request hitting your web service. This request isn’t just a single event; it’s a journey.
// Example Trace Span for an incoming HTTP request
{
"traceId": "a1b2c3d4e5f67890a1b2c3d4e5f67890",
"spanId": "f0e9d8c7b6a54321",
"parentSpanId": "1122334455667788", // If this is a child span
"name": "HTTP GET /users/{id}",
"kind": "SERVER", // Indicates this span represents a server-side operation
"startTimeUnixNano": 1678886400123456789,
"endTimeUnixNano": 1678886400987654321,
"attributes": {
"http.method": "GET",
"http.url": "/users/123",
"http.status_code": 200,
"net.peer.ip": "192.168.1.100"
},
"events": [
{
"name": "request_received",
"timeUnixNano": 1678886400123456789,
"attributes": {
"http.request.body.size": 512
}
},
{
"name": "response_sent",
"timeUnixNano": 1678886400987654321,
"attributes": {
"http.response.body.size": 1024
}
}
],
"status": {
"code": 0 // 0 = Unset, 1 = OK, 2 = Error; a successful server span typically leaves the status unset
}
}
Now, consider a log entry generated by that same web service during that request. Instead of being a standalone event, it’s now contextually linked.
// Example Log Record linked to the above Trace Span
{
"timeUnixNano": 1678886400500123456,
"severityText": "INFO",
"severityNumber": 9, // 9 corresponds to INFO (the INFO range is 9-12)
"body": {
"stringValue": "Processing user request"
},
"attributes": {
"user.id": "user-456",
"http.request.id": "a1b2c3d4e5f67890a1b2c3d4e5f67890" // Trace ID
},
"traceId": "a1b2c3d4e5f67890a1b2c3d4e5f67890", // Explicitly linked Trace ID
"spanId": "f0e9d8c7b6a54321" // Explicitly linked Span ID
}
Finally, consider a metric: a histogram of request durations whose data point includes that same request.
// Example Metric (Histogram) for HTTP request duration
{
"name": "http.server.request.duration",
"description": "Measures the duration of HTTP server requests.",
"unit": "s", // Seconds
"data": {
"type": "HISTOGRAM",
"dataPoints": [
{
"startTimeUnixNano": 1678886400000000000,
"timeUnixNano": 1678886400999999999,
"attributes": {
"http.method": "GET",
"http.route": "/users/{id}",
"http.status_code": 200
},
"count": 1, // Total number of measurements recorded in this data point
"sum": 0.864234567, // Sum of all recorded durations, in seconds
"bucketCounts": [
// One more entry than explicitBounds: (-inf, 0.1], (0.1, 0.2], ..., (51.2, +inf). The 0.864 s request lands in (0.8, 1.6]
0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0
],
"explicitBounds": [
0.1, 0.2, 0.4, 0.8, 1.6, 3.2, 6.4, 12.8, 25.6, 51.2
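],
"exemplars": [
{
"timeUnixNano": 1678886400987654321,
"asDouble": 0.864234567, // The sampled measurement
"traceId": "a1b2c3d4e5f67890a1b2c3d4e5f67890", // Trace active when the measurement was recorded
"spanId": "f0e9d8c7b6a54321" // Span active when the measurement was recorded
}
]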
}
]
}
}
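To make the bucket layout concrete: with explicit bounds, a histogram has one more bucket than it has bounds, and each bucket includes its upper bound. The following is a minimal sketch of that bucketing rule in plain Python, not the SDK’s aggregation code. The names bucket_index and record are hypothetical helpers introduced only for illustration.

```python
from bisect import bisect_left

# Upper bounds copied from the metric example; each bucket is (lower, upper].
EXPLICIT_BOUNDS = [0.1, 0.2, 0.4, 0.8, 1.6, 3.2, 6.4, 12.8, 25.6, 51.2]


def bucket_index(value, bounds):
    """Return the index of the bucket whose range (lower, upper] contains value."""
    # bisect_left finds the first bound >= value, i.e. the bucket's upper bound.
    # Values above the last bound land in the overflow bucket at len(bounds).
    return bisect_left(bounds, value)


def record(durations, bounds):
    """Aggregate raw durations into count, sum, and bucketCounts (hypothetical helper)."""
    bucket_counts = [0] * (len(bounds) + 1)  # one extra bucket for (last bound, +inf)
    for duration in durations:
        bucket_counts[bucket_index(duration, bounds)] += 1
    return {
        "count": len(durations),
        "sum": sum(durations),
        "bucketCounts": bucket_counts,
    }


point = record([0.864234567], EXPLICIT_BOUNDS)
print(point["bucketCounts"])
```

A duration of exactly 0.8 falls in (0.4, 0.8], while anything above 51.2 lands in the final overflow bucket.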
The magic is in the traceId, spanId, and attributes. A single trace ID stitches together every span and log record originating from that specific user request, and metrics join in through exemplars that carry the trace and span IDs of sampled measurements. A span ID ties a log record or exemplar to a particular operation within that request’s lifecycle. Attributes provide rich, structured context that can be filtered, aggregated, and correlated across all signal types. This unified representation means you’re not just collecting traces or logs or metrics; you’re collecting observability data that can be viewed and analyzed through any of these lenses, or even combined.
The core problem OpenTelemetry’s data model solves is the fragmentation of telemetry data. Before, traces were one system, logs another, and metrics a third. Each had its own format, its own context propagation, its own collection agents. You’d end up with isolated islands of information. If a request failed, you’d look at the trace to see where it hung, then hope the logs for that specific service and time window contained the error, and then try to correlate metrics for that service during that period. OpenTelemetry’s unified model means all that context is embedded and linked by default, making the traceId the universal key.
Internally, the OpenTelemetry SDKs build these data structures. When an HTTP request comes in, a Span is created with a traceId, either continued from the caller’s propagated context or newly generated. As the service processes the request, it might emit a log. If log instrumentation is configured, the current traceId and spanId are injected into that log record automatically. Metrics can be tied to the active span in a similar way, through exemplars attached to sampled measurements. This association relies on a Context object managed by the SDK, which carries the current trace and span information. You don’t manually pass traceId and spanId to every logging or metric call; the SDK handles it.
The most surprising thing is how much of the "correlation" is simply a matter of consistent attribute naming and the presence of traceId and spanId on all relevant data points. You don’t need a complex, custom correlation engine in your backend; you need to ensure your instrumentation is correctly propagating and attaching these identifiers. For example, an attribute like http.status_code on a span can be directly compared to an http.status_code attribute on a log record originating from the same trace to understand why a particular status code was returned for a specific operation.
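As a rough illustration of how little machinery this takes, the sketch below joins spans and log records on the (traceId, spanId) pair using plain dictionaries. The records are hand-written to mirror the earlier examples rather than real exporter output, and logs_by_span and correlate are hypothetical helper names.

```python
from collections import defaultdict

TRACE_ID = "a1b2c3d4e5f67890a1b2c3d4e5f67890"
SPAN_ID = "f0e9d8c7b6a54321"

# Hand-written records mirroring the span and log examples (illustrative only).
spans = [
    {
        "traceId": TRACE_ID,
        "spanId": SPAN_ID,
        "name": "HTTP GET /users/{id}",
        "attributes": {"http.status_code": 200},
    },
]
logs = [
    {
        "traceId": TRACE_ID,
        "spanId": SPAN_ID,
        "body": "Processing user request",
        "attributes": {"user.id": "user-456"},
    },
    {
        # A log from a different request; it must not be matched to the span above.
        "traceId": "ffffffffffffffffffffffffffffffff",
        "spanId": "0000000000000001",
        "body": "Unrelated request",
        "attributes": {},
    },
]


def logs_by_span(log_records):
    """Index log records by their (traceId, spanId) pair."""
    index = defaultdict(list)
    for rec in log_records:
        index[(rec["traceId"], rec["spanId"])].append(rec)
    return index


def correlate(span_records, log_records):
    """Attach to each span the bodies of the logs emitted while it was active."""
    index = logs_by_span(log_records)
    return [
        {
            "span": s["name"],
            "status": s["attributes"].get("http.status_code"),
            "logs": [rec["body"] for rec in index.get((s["traceId"], s["spanId"]), [])],
        }
        for s in span_records
    ]


result = correlate(spans, logs)
print(result)
```

The join key is the identifier pair already present on both records, so no separate correlation engine is involved; a backend would do the same thing with an index instead of an in-memory dictionary.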
The next frontier is understanding how to effectively query and visualize this unified data, moving beyond simple trace views to complex causal analysis across signals.