OpenTelemetry spans are timed records of work happening within your application, and attributes are the details that make those records useful.
Let’s see this in action. Imagine a simple web request. Here’s what a span might look like with attributes:
{
  "traceId": "a1b2c3d4e5f67890a1b2c3d4e5f67890",
  "spanId": "f67890a1b2c3d4e5",
  "name": "GET /users/{id}",
  "kind": "SERVER",
  "startTimeUnixNano": 1678886400000000000,
  "endTimeUnixNano": 1678886400100000000,
  "attributes": {
    "http.method": "GET",
    "http.url": "/users/123",
    "http.status_code": 200,
    "user.id": "user-123",
    "db.system": "postgresql",
    "db.statement": "SELECT * FROM users WHERE id = $1",
    "db.statement_parameters": "[\"123\"]"
  },
  "status": {
    "code": 0
  }
}
This span, representing a server-side operation, tells us the service handled a GET request to /users/123, returned a 200 status, and acted on behalf of user-123. It also shows the underlying database interaction: a SELECT statement against a postgresql database, with the parameter 123 passed. The start and end timestamps are 100,000,000 ns apart, so the operation took 100 ms. Without these attributes, the span name ("GET /users/{id}") is generic; with them, we know which user was involved, which record was requested, and what database query ran.
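To make that reading concrete, here is a minimal sketch that treats the example span as a plain Python dictionary and pulls out the same facts. The helper names `duration_ms` and `attributes_with_prefix` are made up for illustration and are not part of any OpenTelemetry library.

```python
# The example span from above as a plain dict (status omitted for brevity).
span = {
    "traceId": "a1b2c3d4e5f67890a1b2c3d4e5f67890",
    "spanId": "f67890a1b2c3d4e5",
    "name": "GET /users/{id}",
    "kind": "SERVER",
    "startTimeUnixNano": 1678886400000000000,
    "endTimeUnixNano": 1678886400100000000,
    "attributes": {
        "http.method": "GET",
        "http.url": "/users/123",
        "http.status_code": 200,
        "user.id": "user-123",
        "db.system": "postgresql",
        "db.statement": "SELECT * FROM users WHERE id = $1",
        "db.statement_parameters": '["123"]',
    },
}


def duration_ms(span: dict) -> float:
    """Span duration in milliseconds, from the nanosecond timestamps."""
    return (span["endTimeUnixNano"] - span["startTimeUnixNano"]) / 1_000_000


def attributes_with_prefix(span: dict, prefix: str) -> dict:
    """Select the attributes in one namespace, e.g. everything under 'db.'."""
    return {k: v for k, v in span["attributes"].items() if k.startswith(prefix)}


print(duration_ms(span))                     # 100.0
print(span["attributes"]["user.id"])         # user-123
print(attributes_with_prefix(span, "db."))   # the three db.* attributes
```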
The problem OpenTelemetry spans solve is understanding the flow of a request across multiple services, which is the goal of distributed tracing. Before distributed tracing, you’d have logs scattered across different machines, making it a nightmare to correlate events and pinpoint bottlenecks. Spans, with their attributes, provide a structured way to tie these events together. A span represents a single unit of work (like an HTTP request, a database query, or a function call), and its attributes are key-value pairs that describe that work.
Here’s how it works internally: an OpenTelemetry SDK in your application creates and records these spans. When a service receives a request that carries no incoming trace context, it starts a root span and generates a new traceId; if the request does carry context from an upstream service, the new span joins that existing trace instead. When the service then calls another service or performs a database operation, it creates child spans. Each child shares its parent’s traceId and records the parent’s spanId as its own parent span ID, which is how the hierarchy is reconstructed later. Attributes are added to spans when they are created or as operations complete. Finally, finished spans are exported to a backend (like Jaeger, Zipkin, or a cloud provider’s observability tool), where they are assembled into a complete trace that shows the entire journey of a request.
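The parent/child mechanics can be sketched with a toy tracer built on the standard library only. This is an illustration of the idea, not the OpenTelemetry SDK: `ToySpan`, `ToyTracer`, and `render_trace` are hypothetical names, the "exporter" is just a list, and a real SDK tracks the active span through its context mechanism rather than a simple stack.

```python
import secrets
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToySpan:
    name: str
    trace_id: str
    span_id: str
    parent_span_id: Optional[str] = None
    attributes: dict = field(default_factory=dict)


class ToyTracer:
    """Illustrative stand-in for an SDK tracer; not the OpenTelemetry API."""

    def __init__(self):
        self._active = []    # stack of spans currently in progress
        self.exported = []   # finished spans, standing in for a backend

    def start_span(self, name, attributes=None):
        parent = self._active[-1] if self._active else None
        span = ToySpan(
            name=name,
            # A root span mints a new trace ID; a child reuses its parent's.
            trace_id=parent.trace_id if parent else secrets.token_hex(16),
            span_id=secrets.token_hex(8),
            parent_span_id=parent.span_id if parent else None,
            attributes=dict(attributes or {}),
        )
        self._active.append(span)
        return span

    def end_span(self):
        # End the innermost active span and hand it to the "exporter".
        self.exported.append(self._active.pop())


def render_trace(spans, parent_id=None, depth=0):
    """Rebuild the hierarchy from parent_span_id links, as a backend would."""
    lines = []
    for span in spans:
        if span.parent_span_id == parent_id:
            lines.append("  " * depth + span.name)
            lines.extend(render_trace(spans, span.span_id, depth + 1))
    return lines


tracer = ToyTracer()
root = tracer.start_span("GET /users/{id}", {"http.method": "GET"})
child = tracer.start_span("SELECT users", {"db.system": "postgresql"})
tracer.end_span()                          # database call completes
root.attributes["http.status_code"] = 200  # added once the outcome is known
tracer.end_span()                          # request completes

print("\n".join(render_trace(tracer.exported)))
```

Note that the child finishes (and is exported) before the root, yet the tree can still be rebuilt, because the links live in the IDs rather than in the export order.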
The key levers you control are the types of attributes you add and the values you assign. You decide what information is important for debugging and performance analysis. This includes:
- HTTP attributes: http.method, http.url, http.status_code, http.route (e.g., /users/:id instead of /users/123) are essential for understanding web service behavior.
- Database attributes: db.system (e.g., postgresql, mysql), db.statement, db.statement_parameters, db.user help diagnose database performance issues.
- RPC attributes: For inter-service communication, rpc.method, rpc.service, rpc.status_code are vital.
- Error attributes: error.type, error.message, error.stacktrace are critical for pinpointing failures.
- Custom attributes: user.id, tenant.id, order.id, product.sku allow you to slice and dice traces by specific business entities or users.
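When choosing attributes, think about what questions you’d ask if a request was slow or failed. "Which user was affected?" "What was the exact database query?" "What was the error message?" These questions guide your attribute selection. Standard semantic conventions (like those defined by OpenTelemetry) are highly recommended because they ensure compatibility with observability backends and make your traces understandable to others. The conventions evolve, though: recent versions rename http.method to http.request.method, http.status_code to http.response.status_code, and db.statement to db.query.text, and they record exception details under exception.type, exception.message, and exception.stacktrace. Check attribute names against the version of the conventions your instrumentation targets.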
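One way to apply this question-driven approach is to write the questions down next to the attribute that would answer each one, then check a span against that list. The sketch below assumes the attribute names used in this article; the mapping and the `unanswered_questions` helper are hypothetical, not part of OpenTelemetry.

```python
# Hypothetical checklist: each debugging question paired with the attribute
# (named as in this article) that would answer it.
QUESTION_TO_ATTRIBUTE = {
    "Which user was affected?": "user.id",
    "What was the exact database query?": "db.statement",
    "What was the error message?": "error.message",
}


def unanswered_questions(attributes: dict) -> list:
    """Return the questions a span's attributes cannot answer."""
    return [
        question
        for question, key in QUESTION_TO_ATTRIBUTE.items()
        if key not in attributes
    ]


attributes = {
    "http.method": "GET",
    "user.id": "user-123",
    "db.statement": "SELECT * FROM users WHERE id = $1",
}
print(unanswered_questions(attributes))
# ['What was the error message?']
```

A check like this could run in a test against spans captured from a failing request, so that gaps show up before an incident rather than during one.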
The most impactful attributes are often those that represent the context of the operation, not just the operation itself. For instance, http.url might show /users/123, but adding user.id: "user-123" to the same span provides a direct link to the specific user, enabling faster filtering and root cause analysis when multiple users experience issues with similar URLs. This isn’t about logging every single piece of data, but about attaching the meaningful identifiers that allow you to pivot from a general observation to a specific instance.
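As a rough illustration of that pivot, suppose a backend returned the following spans for one route. The data and the `failures_by_user` helper are invented for this sketch; real backends expose this kind of grouping through their own query languages.

```python
from collections import Counter

# Made-up spans: same route, different users, some failing.
route_spans = [
    {"attributes": {"http.route": "/users/{id}", "http.status_code": 200, "user.id": "user-123"}},
    {"attributes": {"http.route": "/users/{id}", "http.status_code": 500, "user.id": "user-456"}},
    {"attributes": {"http.route": "/users/{id}", "http.status_code": 500, "user.id": "user-456"}},
    {"attributes": {"http.route": "/users/{id}", "http.status_code": 200, "user.id": "user-789"}},
]


def failures_by_user(spans):
    """Count server-error spans (status >= 500) per user.id."""
    return Counter(
        s["attributes"].get("user.id", "unknown")
        for s in spans
        if s["attributes"].get("http.status_code", 0) >= 500
    )


print(failures_by_user(route_spans))
# Counter({'user-456': 2})
```

Grouping by route alone would only show that half the requests failed; grouping by user.id shows that every failure belongs to one user.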
Beyond basic request/response details, consider adding attributes that capture the intent or identity of the operation. For example, if a span represents processing an order, adding order.id is more powerful than just operation.type: "process_order". This allows you to trace the lifecycle of a specific order across all its constituent operations.
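The order example can be sketched the same way. Here the spans come from three hypothetical services, and the `service` field, span names, timestamps, and order IDs are all made up; the point is only that a shared order.id lets you line up otherwise unrelated operations.

```python
# Hypothetical spans from several services, in arbitrary arrival order.
order_spans = [
    {"name": "ship order", "service": "fulfillment", "startTimeUnixNano": 300,
     "attributes": {"order.id": "order-42"}},
    {"name": "create order", "service": "checkout", "startTimeUnixNano": 100,
     "attributes": {"order.id": "order-42"}},
    {"name": "charge card", "service": "payments", "startTimeUnixNano": 200,
     "attributes": {"order.id": "order-42"}},
    {"name": "create order", "service": "checkout", "startTimeUnixNano": 150,
     "attributes": {"order.id": "order-43"}},
]


def order_timeline(spans, order_id):
    """List the operations for one order, sorted by start time."""
    matching = [s for s in spans if s["attributes"].get("order.id") == order_id]
    matching.sort(key=lambda s: s["startTimeUnixNano"])
    return [f'{s["service"]}: {s["name"]}' for s in matching]


print(order_timeline(order_spans, "order-42"))
# ['checkout: create order', 'payments: charge card', 'fulfillment: ship order']
```

An attribute like operation.type: "process_order" would match every order at once and could not produce this per-order view.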
The next logical step after effectively tagging your spans is understanding how to propagate context across asynchronous operations.