The OpenTelemetry Transform Processor is, perhaps surprisingly, a small embedded data pipeline that runs inside your OpenTelemetry Collector rather than a separate service you deploy.

Let’s see it in action. Imagine you’ve got logs coming in, and you want to add a service.version attribute to all of them, but only if it’s not already there. Here’s how a simple transform processor config in your Collector’s otel-collector-config.yaml would handle it:

receivers:
  otlp:
    protocols:
      grpc:
      http:

processors:
  transform:
    log_statements:
      - context: log
        statements:
          # Only set service.version if the attribute doesn't already exist
          - set(attributes["service.version"], "v1.2.3") where attributes["service.version"] == nil

service:
  pipelines:
    logs:
      receivers: [otlp]
      processors: [transform]
      exporters: [debug] # Or your actual exporter

exporters:
  # The debug exporter replaces the deprecated logging exporter
  debug:
    verbosity: detailed

When you send a log like this:

{
  "resource": {
    "attributes": [
      {"key": "host.name", "value": {"stringValue": "my-host"}}
    ]
  },
  "logRecords": [
    {
      "body": {"stringValue": "User logged in"},
      "attributes": [
        {"key": "user.id", "value": {"stringValue": "alice"}}
      ]
    }
  ]
}

The transform processor intercepts it and checks attributes["service.version"]. Since the attribute is not set, the condition evaluates to true, and the processor adds "service.version": "v1.2.3" to the log record’s attributes.

If you send a log that already has service.version, the condition is false and the processor does nothing, preserving the existing value.

The transform processor exists to standardize, enrich, or conditionally modify telemetry data before it leaves your Collector, without complex external ETL tools or custom code. It operates directly on the telemetry data structures (logs, metrics, traces) as they flow through the Collector’s pipeline. You can think of it as a mini-scripting engine for your telemetry.

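It provides a rich expression language, the OpenTelemetry Transformation Language (OTTL), which operates on the Collector’s internal pdata model. The exact set of functions depends on your Collector version, but it typically lets you:

  • Access data: attributes["key"], body, resource.attributes["key"], and context-specific fields such as a span’s name or a metric’s name.
  • Manipulate data: set(), delete_key(), keep_keys(), replace_all_matches(), replace_pattern().
  • Conditional logic: a where clause at the end of a statement.
  • String operations: Concat(), Substring(), IsMatch(), Format().
  • Numeric operations: the arithmetic operators +, -, *, and /.
  • Boolean logic: and, or, not, plus comparisons such as == nil and != nil.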
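To make the list above concrete, here is a minimal sketch of a transform processor that touches several of these capabilities in the log context. It assumes a recent Collector contrib build; the attribute names host, user.password, user.kind, and source, and the "^svc-" pattern, are hypothetical and chosen only for illustration.

```yaml
processors:
  transform:
    # Log and skip statements that fail on a given record instead of dropping data
    error_mode: ignore
    log_statements:
      - context: log
        statements:
          # Access data + conditional logic: copy a resource attribute onto the record if missing
          - set(attributes["host"], resource.attributes["host.name"]) where attributes["host"] == nil
          # Manipulate data: remove a sensitive key (hypothetical name)
          - delete_key(attributes, "user.password")
          # String operations: regex match in the condition, Concat to build a value
          - set(attributes["user.kind"], "internal") where attributes["user.id"] != nil and IsMatch(attributes["user.id"], "^svc-")
          - set(attributes["source"], Concat([resource.attributes["host.name"], "otlp"], "/")) where resource.attributes["host.name"] != nil
```

Statements run in order for each log record, so a later statement sees the changes made by an earlier one.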

The most surprising part is how powerful the Format() function is, especially for constructing new attribute values by interpolating existing ones. It takes a format string and a list of arguments, so you can write something like Format("users/%s/events", [attributes["user.id"]]) to build attribute values dynamically from the data itself, which is very useful for creating consistent tagging schemes.
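As a rough sketch of how that looks in a full statement, the fragment below builds a new attribute from user.id. The target attribute name event.path is hypothetical, and the example assumes a Collector version whose OTTL includes Format() with a list of arguments.

```yaml
processors:
  transform:
    error_mode: ignore
    log_statements:
      - context: log
        statements:
          # Build "users/<user.id>/events" only when user.id is present
          - set(attributes["event.path"], Format("users/%s/events", [attributes["user.id"]])) where attributes["user.id"] != nil
```

Guarding with the where clause avoids formatting a missing value into the resulting string.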

The next concept you’ll likely encounter is how to combine multiple transform processors or use them in conjunction with other processors like attributes or filter to build highly sophisticated data manipulation pipelines within a single collector instance.
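A minimal sketch of such a combined pipeline is shown below. It is an illustration under assumptions, not a recommended setup: the deployment.environment value and the severity threshold are hypothetical, and the receiver and exporter names are expected to be defined elsewhere in your config.

```yaml
processors:
  # filter: drop log records that match an OTTL condition
  filter:
    error_mode: ignore
    logs:
      log_record:
        # Drop records with a known severity below INFO
        - 'severity_number != SEVERITY_NUMBER_UNSPECIFIED and severity_number < SEVERITY_NUMBER_INFO'
  # attributes: the "insert" action adds the key only when it is missing
  attributes:
    actions:
      - key: deployment.environment
        value: staging # hypothetical value
        action: insert
  # transform: conditional enrichment with OTTL
  transform:
    error_mode: ignore
    log_statements:
      - context: log
        statements:
          - set(attributes["service.version"], "v1.2.3") where attributes["service.version"] == nil

service:
  pipelines:
    logs:
      receivers: [otlp]
      # Processors run in the order listed: filter first, then enrich
      processors: [filter, attributes, transform]
      exporters: [debug] # or whichever exporter your config defines
```

Placing filter first means the later processors only spend work on records that will actually be exported.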
