OpenTelemetry network probes are surprisingly good at telling you if a host is reachable, but not why it’s not.
Let’s see it in action. Imagine you have a service running on 192.168.1.100 and you want to know if it’s up. You’d configure an OpenTelemetry collector like this:
    receivers:
      hostmetrics:
        collection:
          scrapers:
            net:
              # This is where the magic happens
              network_connections:
                - protocol: tcp
                  transport: client
                  address: 192.168.1.100:80
                - protocol: icmp
                  address: 192.168.1.100
    processors:
      batch:
    exporters:
      logging:
        loglevel: debug
    service:
      pipelines:
        metrics:
          receivers: [hostmetrics]
          processors: [batch]
          exporters: [logging]
When this collector runs, it will send ICMP echo requests (pings) to 192.168.1.100 and also attempt to establish TCP connections to port 80. The hostmetrics receiver, specifically the net scraper, is doing the heavy lifting here. It leverages the operating system’s networking stack to perform these checks. The ICMP check is a simple "is the host alive?" query. The TCP check is more granular: "is the host alive and is something listening on this specific port?"
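The TCP half of that check can be approximated outside the collector. The sketch below is a minimal Python illustration of the idea, not the collector's implementation: `tcp_probe` is a hypothetical helper name and the 2-second timeout is an assumed default. ICMP is left out because sending echo requests usually requires raw-socket privileges.

```python
import socket


def tcp_probe(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        # create_connection resolves the host and performs the connect() handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused connections, timeouts, DNS failures, and unreachable
        # hosts all surface as OSError subclasses.
        return False
```

Calling `tcp_probe("192.168.1.100", 80)` asks the same question as the TCP entry in the configuration: is the host alive, and is something listening on port 80? Like the collector's probe, a `False` result says the target was not reachable from where the code ran, without saying why.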
The output in your collector’s logs will look something like this when both the ICMP and the TCP probe succeed:
    ...
    "net.icmp.ping.count": 1,
    "net.icmp.ping.success": 1,
    "net.tcp.connection.count": 1,
    "net.tcp.connection.success": 1,
    ...
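And when the ping fails while a TCP probe to a different host still succeeds (this assumes a second target in network_connections, which the configuration above does not include):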
    ...
    "net.icmp.ping.count": 1,
    "net.icmp.ping.success": 0,        # ICMP failed
    "net.tcp.connection.count": 1,
    "net.tcp.connection.success": 1,   # TCP to other host succeeded
    ...
What this gives you is proactive network path monitoring from the perspective of your collector. Instead of waiting for an application-level error to tell you a dependency is down, you get an early warning that a host has stopped responding to basic network probes. This is crucial for distributed systems where many components rely on inter-service communication. You can monitor critical infrastructure, external APIs, or other internal services.
The net scraper in hostmetrics is the key component. It is designed to be lightweight and to collect a variety of network-related metrics. For ICMP, it sends an echo request and waits for the reply, which is what the ping command does. For TCP, it attempts a socket connect() call, so success means the handshake with the target port completed. The protocol and address fields in the configuration are your primary levers: protocol takes tcp, udp, or icmp, and address takes an IP address or a hostname. For TCP and UDP, the address must also carry a port, written host:port as in 192.168.1.100:80.
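The protocol and address rules described here are easy to get wrong in a long target list, so it can help to validate entries before deploying. The following is a rough sketch of those rules only; `ProbeTarget` and `parse_target` are hypothetical names, not part of any OpenTelemetry API, and bracketed IPv6 literals are deliberately not handled.

```python
from dataclasses import dataclass
from typing import Optional

PORT_REQUIRED = {"tcp", "udp"}          # these protocols need host:port
SUPPORTED = PORT_REQUIRED | {"icmp"}    # icmp takes a bare host


@dataclass(frozen=True)
class ProbeTarget:
    protocol: str
    host: str
    port: Optional[int] = None


def parse_target(protocol: str, address: str) -> ProbeTarget:
    """Validate one protocol/address pair and split out the port if present."""
    protocol = protocol.lower()
    if protocol not in SUPPORTED:
        raise ValueError(f"unsupported protocol: {protocol!r}")
    if protocol in PORT_REQUIRED:
        # Split on the last colon so the host part is kept intact.
        host, sep, port_text = address.rpartition(":")
        if not sep or not host or not port_text.isdecimal():
            raise ValueError(f"{protocol} target needs host:port, got {address!r}")
        port = int(port_text)
        if not 1 <= port <= 65535:
            raise ValueError(f"port out of range: {port}")
        return ProbeTarget(protocol, host, port)
    # ICMP has no notion of ports, so the whole address is the host.
    return ProbeTarget(protocol, address)
```

With this, `parse_target("tcp", "192.168.1.100:80")` yields a target with host `192.168.1.100` and port `80`, while `parse_target("tcp", "192.168.1.100")` raises `ValueError` because the port is missing.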
The real power comes from correlating these metrics across targets. If net.icmp.ping.success drops to 0 for 192.168.1.100 while net.tcp.connection.success remains 1 for a different host, the problem is specific to 192.168.1.100 and not a general network outage. If the probes to every target fail at the same time, a broader network issue, or the collector's own connectivity, is the likelier culprit.
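That correlation logic is simple enough to write down. The sketch below assumes you have already reduced the latest probe metrics to one success flag per target; the `classify` function and its return strings are illustrative, not something the collector emits.

```python
from typing import Dict


def classify(results: Dict[str, bool]) -> str:
    """Interpret the latest probe results, keyed by target name."""
    if not results:
        return "no data"
    failed = sorted(name for name, ok in results.items() if not ok)
    if not failed:
        return "all targets reachable"
    # With a single target there is nothing to compare against, so a
    # failure can only be reported as target-specific.
    if len(failed) == len(results) and len(results) > 1:
        return "all probes failing: suspect a broader network issue"
    return "target-specific problem: " + ", ".join(failed)
```

For example, `classify({"192.168.1.100": False, "other-host": True})` points at 192.168.1.100 alone, while the same call with both values `False` points at the wider network. Here `other-host` stands in for whatever second target you probe.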
What most people miss is that the hostmetrics receiver, when configured for network probes, doesn’t just passively observe. It actively initiates connections and sends packets. This means the metrics you see are a direct reflection of the collector’s own ability to reach the target, not just a general state of the network. If your collector is running in a tightly controlled network segment, these probes can reveal connectivity issues that might not be apparent from higher-level application logs. Furthermore, the net scraper can also monitor existing network connections (protocol: tcp, transport: server) which gives you visibility into incoming connections, but for active probing, you’ll focus on transport: client.
The next logical step after ensuring basic host reachability is to understand the performance characteristics of those connections.