The OpenTelemetry Collector failed to process telemetry data because a pipeline component rejected or dropped the data, preventing it from reaching its intended exporter.

Pipeline Configuration Errors

Diagnosis: Check the collector’s configuration file for syntax errors or incorrect component definitions. A common mistake is a typo in a component name or a missing required field.

Fix: Use a configuration linter or validator. For YAML syntax, yamllint . catches indentation and quoting issues. The collector's own validate subcommand (for example, otelcol validate --config=config.yaml) additionally checks component names and settings. Ensure all component names match the official documentation and that required parameters, such as endpoint for an OTLP exporter, are present and correctly formatted.

Why it works: The collector parses and validates its configuration file at startup. A malformed section or an unknown component name causes startup to fail with an error that names the offending key, so no pipeline runs until the configuration is corrected.
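
As a reference point, a minimal configuration that should pass startup validation might look like the sketch below. The endpoint value and the choice of the debug exporter are illustrative assumptions, not requirements. The point is that every component named under service.pipelines is also defined in its own top-level section.

    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317   # assumed listen address

    processors:
      batch:                          # empty body means "use defaults"

    exporters:
      debug:                          # prints telemetry to the collector's own log
        verbosity: basic

    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch]
          exporters: [debug]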

Receiver Not Accepting Data

Diagnosis: The receiver component might be misconfigured, not listening on the expected port, or not supporting the protocol/format of the incoming data.

Fix:

  1. Check Port/Protocol: For an OTLP receiver, ensure the protocols your clients use are enabled (grpc, http, or both) and that each endpoint's port matches what the clients send to (by default 4317 for gRPC and 4318 for HTTP). For the Prometheus receiver, verify that its config block contains valid scrape_configs.
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317
          http:
            endpoint: 0.0.0.0:4318
    
  2. Firewall/Network: Confirm no firewall is blocking traffic to the collector’s listening port. Use sudo ufw status or sudo iptables -L to check firewall rules.
  3. Data Format: If using a specific format (e.g., Jaeger, Prometheus), ensure the receiver is configured for it and the incoming data adheres to that format.

Why it works: The receiver is the entry point. If it cannot bind to its port or understand the incoming data stream, no data enters the pipeline, which makes downstream components appear to be failing or dropping data.
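
For the Prometheus case mentioned above, a minimal receiver definition could look like the following sketch. The job name, interval, and target are hypothetical placeholders. The structure to check is that scrape_configs sits under the receiver's config key rather than directly under prometheus.

    receivers:
      prometheus:
        config:
          scrape_configs:
            - job_name: example-app              # hypothetical job name
              scrape_interval: 15s
              static_configs:
                - targets: ["localhost:9090"]    # hypothetical scrape target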

Processor Dropping Data

Diagnosis: Processors such as memory_limiter and tail_sampling can refuse or drop data depending on their settings. The memory_limiter might be set too aggressively. The batch processor does not discard data itself, but an excessively small timeout or send_batch_size produces many small export requests, which can back up the exporter.

Fix:

  1. Batch Processor: Set send_batch_size and timeout explicitly in the processors section, raising them if they were tuned too low (the defaults are 8192 items and 200ms).
    processors:
      batch:
        send_batch_size: 1000
        timeout: 1s
    
  2. Memory Limiter: Adjust limit_mib and spike_limit_mib to be more generous if the collector is refusing data under memory pressure.
    processors:
      memory_limiter:
        check_interval: 1s
        limit_mib: 1000
        spike_limit_mib: 100
    
  3. Tail Sampling: Review decision_wait and num_traces if tail-based sampling is enabled. A small decision_wait causes sampling decisions to be made before late spans arrive, and a small num_traces evicts traces from memory before a decision is reached.

Why it works: Processors modify or filter data. If their internal thresholds or rules cause them to discard data (e.g., to manage memory or enforce sampling policies), that data is lost.
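
A tail sampling configuration along the lines described above might look like this sketch. The values and policy names are illustrative assumptions, and the tail_sampling processor is assumed to come from the contrib distribution of the collector. A trace is kept if any policy matches it.

    processors:
      tail_sampling:
        decision_wait: 10s        # how long to wait for a trace's spans before deciding
        num_traces: 50000         # traces held in memory while awaiting a decision
        policies:
          - name: keep-errors                 # hypothetical policy name
            type: status_code
            status_code:
              status_codes: [ERROR]
          - name: sample-ten-percent          # hypothetical policy name
            type: probabilistic
            probabilistic:
              sampling_percentage: 10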

Exporter Failing to Send Data

Diagnosis: The exporter might be misconfigured with an incorrect endpoint, failing authentication, or the destination service might be unavailable or overloaded.

Fix:

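  1. Endpoint: Verify the endpoint URL in the exporters section is correct and reachable from the collector, and that the exporter type matches the protocol: otlphttp for OTLP over HTTP (port 4318 by default), otlp for gRPC (port 4317 by default).
    exporters:
      otlphttp:
        endpoint: "http://your-backend:4318"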
    
  2. Authentication: If the exporter requires authentication (e.g., API keys, mTLS), ensure credentials are valid and correctly configured.
  3. Network/Firewall: Check if the collector can reach the exporter’s endpoint. Use curl -v <exporter_endpoint> from the collector’s host.
  4. Destination Service: The backend service might be down, overloaded, or returning errors (e.g., 5xx status codes). Check the backend’s logs.

Why it works: The exporter is the final leg. If it cannot establish a connection or successfully transmit data to the backend, the data is lost.
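
To reduce loss during short backend outages, an exporter can be given retry and queue settings. The sketch below assumes an OTLP/HTTP backend at the placeholder address used above and a bearer token read from a hypothetical BACKEND_TOKEN environment variable. The intervals and queue size are illustrative, not recommendations.

    exporters:
      otlphttp:
        endpoint: "http://your-backend:4318"
        headers:
          Authorization: "Bearer ${env:BACKEND_TOKEN}"   # hypothetical env var
        retry_on_failure:
          enabled: true
          initial_interval: 5s
          max_interval: 30s
          max_elapsed_time: 300s    # data is dropped once retries exceed this window
        sending_queue:
          enabled: true
          queue_size: 1000          # batches buffered while the backend is unavailable

Data is still dropped once the queue is full or the retry window expires, so these settings buy time rather than guarantee delivery.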

Incorrect Pipeline Routing

Diagnosis: The service.pipelines section might not correctly connect receivers, processors, and exporters, or it might be missing a crucial link. A common symptom is a receiver or exporter that is defined at the top level but never referenced in any pipeline, so it is never started.

Fix: Ensure each receiver is referenced by at least one pipeline, that each pipeline has at least one exporter, and that every name used in a pipeline matches a component defined in the corresponding top-level section. Processors are optional.

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [otlphttp]
    metrics:
      receivers: [otlp, prometheus]
      processors: [batch]
      exporters: [debug]

Why it works: The service.pipelines block is the router. If a receiver is configured but not attached to a pipeline that ultimately leads to an exporter, the data it receives will go nowhere.
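
One way to confirm that data actually traverses a pipeline is to attach the debug exporter next to the real one while troubleshooting. This fragment is a sketch that assumes the otlp receiver, the memory_limiter and batch processors, and the otlphttp exporter are defined elsewhere in the same file.

    exporters:
      debug:
        verbosity: detailed     # print full span contents to the collector log

    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [memory_limiter, batch]
          exporters: [otlphttp, debug]

If spans appear in the collector log but not in the backend, the problem most likely lies with the exporter or the backend rather than with routing.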

Collector Resource Exhaustion

Diagnosis: The collector itself might be running out of CPU, memory, or file descriptors, causing components to become unresponsive or crash.

Fix:

  1. Memory: Monitor memory usage (top, htop, docker stats). If consistently high, increase the collector’s allocated memory or tune the memory_limiter processor.
  2. CPU: Monitor CPU usage. If high, consider distributing the load across multiple collector instances or optimizing processors.
  3. File Descriptors: Check ulimit -n. If low, increase the limit for the user running the collector.
    # Temporarily increase for current session
    ulimit -n 65536
    # For persistent change, edit /etc/security/limits.conf
    

Why it works: When the operating system cannot provide sufficient resources, processes become unstable. File descriptor limits are particularly common for network-intensive applications like the collector.
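
The collector's own telemetry can help show which component is under pressure. The sketch below raises log and metric verbosity. The chosen levels are assumptions suited to a troubleshooting session rather than steady-state production use.

    service:
      telemetry:
        logs:
          level: debug        # verbose component logs
        metrics:
          level: detailed     # expose the full set of internal metrics

With internal metrics enabled (typically scraped from port 8888), counters with names along the lines of otelcol_receiver_refused_spans and otelcol_exporter_send_failed_spans indicate where data is being refused or lost.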

The next error you’ll likely encounter after fixing pipeline issues is related to the backend system’s capacity to ingest or process the now-arriving data.
