OpenTelemetry Python’s instrumentation libraries can trace common frameworks such as Flask and requests with very little code, but you often need to tune the setup to see what you actually care about.

Let’s see it in action. Imagine a simple Flask app:

from flask import Flask, request, jsonify
import requests
import os

app = Flask(__name__)

@app.route("/api/v1/users/<int:user_id>")
def get_user(user_id):
    # Simulate a database call
    user_data = {"id": user_id, "name": f"User {user_id}"}
    return jsonify(user_data)

@app.route("/api/v1/process")
def process_request():
    external_service_url = os.environ.get("EXTERNAL_SERVICE_URL", "http://localhost:5001/data")
    try:
        response = requests.get(external_service_url, timeout=5)
        response.raise_for_status() # Raise an exception for bad status codes
        data_from_external = response.json()
    except requests.exceptions.RequestException as e:
        return jsonify({"error": f"External service error: {e}"}), 500

    # Simulate some processing
    processed_data = {"original": data_from_external, "status": "processed"}
    return jsonify(processed_data)

if __name__ == "__main__":
    app.run(debug=True, port=5000)

And a simple external service it calls:

from flask import Flask, jsonify
import time

app = Flask(__name__)

@app.route("/data")
def get_data():
    time.sleep(0.5) # Simulate work
    return jsonify({"message": "Data from external service", "timestamp": time.time()})

if __name__ == "__main__":
    app.run(debug=True, port=5001)

To instrument these with OpenTelemetry, you’ll install the necessary packages:

pip install opentelemetry-api opentelemetry-sdk opentelemetry-exporter-otlp-proto-grpc opentelemetry-instrumentation-flask opentelemetry-instrumentation-requests

Then, you initialize the tracer in your main application file (e.g., app.py for Flask):

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.sdk.resources import Resource
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.instrumentation.flask import FlaskInstrumentor
from opentelemetry.instrumentation.requests import RequestsInstrumentor
import os

# Initialize Tracer Provider
resource = Resource(attributes={
    "service.name": "my-flask-app",
    "service.instance.id": os.environ.get("HOSTNAME", "localhost")
})
provider = TracerProvider(resource=resource)

# Configure exporter (e.g., OTLP to a collector)
# Ensure your collector is running and accessible at this address
otlp_exporter = OTLPSpanExporter(endpoint="localhost:4317", insecure=True)
provider.add_span_processor(BatchSpanProcessor(otlp_exporter))

# Set the global tracer provider
trace.set_tracer_provider(provider)

# Get a tracer instance
tracer = trace.get_tracer(__name__)

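# Instrument Flask and Requests.
# `app` is the Flask application object, so it must already exist
# (app = Flask(__name__)) before instrument_app is called.
FlaskInstrumentor().instrument_app(app)
RequestsInstrumentor().instrument()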

# ... rest of your Flask app code (routes, etc.)

Now, when you run both app.py and the external service, and then hit /api/v1/process in your Flask app (which in turn calls the external service), you’ll see traces in your OpenTelemetry collector.

The FlaskInstrumentor automatically creates spans for incoming HTTP requests to your Flask app. Each span will include details like the HTTP method, URL path, status code, and request duration.

The RequestsInstrumentor automatically creates client spans for outgoing HTTP requests made by the requests library. This is crucial for understanding latency in distributed systems. You’ll see a span for the GET request to http://localhost:5001/data, including its duration.

The magic here is how these spans form a parent-child relationship. The span for the incoming Flask request becomes the parent span, and the span for the outgoing requests call becomes a child span. This allows you to visualize the entire request flow and identify bottlenecks.

The Resource object is your way of tagging telemetry data with metadata about the service producing it. service.name is fundamental for grouping traces, and service.instance.id helps distinguish between multiple instances of the same service.

The BatchSpanProcessor is an efficient way to send spans. Instead of sending each span as it’s created, it buffers them and sends them in batches, reducing network overhead.

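When you call requests.get(external_service_url, timeout=5), the RequestsInstrumentor intercepts the call and creates a client span for the outgoing HTTP request. If the request completes, the span ends with attributes such as the method, URL, and status code, and a 4xx or 5xx response sets the span’s status to an error. If the call raises instead (e.g., a timeout or connection error), the span still ends and is typically marked as an error, although exactly which exception details are attached depends on the instrumentation version. Note that response.raise_for_status() runs after the client span has already ended, so the exception it raises is not recorded on that span. This is how you get visibility into downstream failures.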

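The FlaskInstrumentor hooks into the Flask request lifecycle. When a request comes in, it starts a server span. When the request handler finishes (or an exception is raised), it ends the span. By default, it captures attributes like http.method, http.route, http.status_code, and net.peer.ip. The exact names depend on which semantic-convention version your instrumentation release emits; the newer conventions rename several of them (for example, http.request.method and http.response.status_code).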

The opentelemetry-instrumentation-flask and opentelemetry-instrumentation-requests packages handle trace context propagation for you. The Flask instrumentation extracts any trace context found in the headers of an incoming request, and the requests instrumentation injects the current trace context (the trace ID and the ID of the current span) into the headers of every outgoing call. The receiving service (if also instrumented) can then pick this up and continue the trace.

The real power comes from combining these automatic instrumentations with custom spans. You might want to add spans around specific business logic:

@app.route("/api/v1/process")
def process_request():
    # ... (previous code) ...

    # Custom span for business logic
    with tracer.start_as_current_span("data_processing_logic") as span:
        span.set_attribute("processing.items", len(data_from_external.get("items", [])))
        # Simulate some processing
        processed_data = {"original": data_from_external, "status": "processed"}
        time.sleep(0.2)  # Simulate more work (requires `import time` at the top of app.py)
        span.add_event("data_processed_successfully")

    return jsonify(processed_data)

This tracer.start_as_current_span block creates a span named "data_processing_logic" that becomes the current span in the context. Any further spans created within this with block will automatically be its children. You can add attributes and events to this span to provide more granular insights into your application’s behavior.

What makes automatic instrumentation work across services is context propagation. The instrumentation injects the trace ID and parent span ID into outgoing requests (by default as a W3C traceparent header) and parses them from incoming ones, stitching together distributed traces across different services and even different programming languages, as long as every hop adheres to the W3C Trace Context specification.
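The traceparent header itself is a small, fixed-format string, which is why any language can take part. As an illustration only, here is a standard-library sketch of a parser for the version-00 layout (version, trace ID, parent span ID, flags). The parse_traceparent helper is hypothetical and not part of OpenTelemetry, and it deliberately ignores the extra fields that future header versions may carry.

```python
import re
from typing import NamedTuple, Optional

# version-traceid-parentid-flags, all lowercase hex.
TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)

class TraceParent(NamedTuple):
    version: str
    trace_id: str
    parent_id: str
    sampled: bool

def parse_traceparent(header: str) -> Optional[TraceParent]:
    """Parse a version-00 style traceparent header; return None if invalid."""
    match = TRACEPARENT_RE.match(header.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # Version ff and all-zero IDs are invalid per the specification.
    if version == "ff" or set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    # Bit 0 of the flags byte is the "sampled" flag.
    return TraceParent(version, trace_id, parent_id, bool(int(flags, 16) & 0x01))

example = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
print(parse_traceparent(example))
print(parse_traceparent("not-a-header"))
```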

The next concept you’ll likely explore is configuring sampling. Sampling is decided by the sampler on the TracerProvider, not by the BatchSpanProcessor, and the default sampler records every trace. For high-traffic applications, this can generate a lot of data. You’ll want to implement a sampling strategy, such as TraceIdRatioBased wrapped in a ParentBased sampler (both in opentelemetry.sdk.trace.sampling), to control which traces are exported.
