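Prometheus can discover Kubernetes pods directly, but a very common setup discovers them through Kubernetes Services instead: with the endpoints role, Prometheus reads each Service’s Endpoints object, which Kubernetes fills with the pods matching the Service’s selector and which also records whether each pod is ready to serve traffic.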
Here’s a simple Nginx deployment and service, and how Prometheus will find it:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:latest
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
  labels:
    app: nginx
spec:
  selector:
    app: nginx
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
When Prometheus runs inside your Kubernetes cluster with kubernetes_sd_configs set to role: endpoints, it watches the Kubernetes API server for changes to Endpoints objects, along with the Services and pods behind them. For the nginx-service above, Kubernetes maintains an Endpoints object listing the pods that match the selector (app: nginx). For each of those pod addresses, Prometheus creates a target with labels like __meta_kubernetes_service_name: nginx-service, __meta_kubernetes_pod_name: nginx-deployment-xxxx, and, crucially, __address__: <pod-ip>:<target-port>. The port is the Service’s targetPort rather than its port; both happen to be 80 here.
There is no built-in relabeling rule. Unless relabel_configs says otherwise, the job label is the scrape config’s job_name and __address__ is used as the scrape target exactly as discovered. Relabeling rules can copy __meta_kubernetes_service_name and __meta_kubernetes_endpoint_port_name (set when the Service port is named) into permanent labels, or build the job label from them. Either way, the result is Prometheus scraping each of the nginx-deployment-xxxx pods on port 80.
Note that on this path Prometheus never evaluates the Service’s selector itself. Kubernetes resolves the selector and records the matching pod addresses in the Endpoints object, and Prometheus reads that object, including the readiness of each address (exposed as __meta_kubernetes_endpoint_ready). Only pods that back a Service therefore show up as targets under the endpoints role. The service role is different again: it yields one target per Service port, addressed by the Service’s DNS name, which suits black-box checks rather than per-pod scraping.
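To make the Service-based path concrete, here is a minimal sketch of a scrape job using the endpoints role, restricted to the nginx-service defined earlier. The job name and the target label names service and pod are illustrative choices, not anything Prometheus requires. Keep in mind that the stock nginx image does not serve Prometheus-format metrics on its own, so a real setup would point at an exporter port instead.

```yaml
scrape_configs:
  - job_name: 'kubernetes-service-endpoints'   # illustrative name
    kubernetes_sd_configs:
      - role: endpoints
    relabel_configs:
      # Keep only endpoints that belong to the nginx-service Service
      - source_labels: [__meta_kubernetes_service_name]
        action: keep
        regex: nginx-service
      # Copy the Service name into a permanent label (meta labels are dropped after relabeling)
      - source_labels: [__meta_kubernetes_service_name]
        action: replace
        target_label: service
      # Record which pod backs each endpoint
      - source_labels: [__meta_kubernetes_pod_name]
        action: replace
        target_label: pod
```

Because no rule touches __address__ or job, each target keeps the discovered pod IP and port, and the job label stays kubernetes-service-endpoints.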
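The Service-based path is not the only option. With role: pod, Prometheus discovers pods directly, as the configuration snippet below shows. Assume Prometheus is running in-cluster with a Service Account that has list and watch permissions (usually granted together with get) for services, endpoints, pods, and nodes.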
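The permissions assumed here could be granted with a ClusterRole along the following lines. This is a sketch only: the names prometheus-discovery and prometheus and the monitoring namespace are hypothetical, and your cluster may already provide equivalent RBAC through a Helm chart or operator.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-discovery        # hypothetical name
rules:
  # Core API group ("") resources that the discovery roles read
  - apiGroups: [""]
    resources: ["nodes", "services", "endpoints", "pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus-discovery        # hypothetical name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus-discovery
subjects:
  - kind: ServiceAccount
    name: prometheus                # hypothetical Service Account
    namespace: monitoring           # hypothetical namespace
```

With permissions like these in place, the scrape configuration itself looks like this: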
scrape_configs:
  - job_name: 'kubernetes-pods'
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Keep only pods that have Prometheus annotations for scraping
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      # Use pod IP as the scrape address
      - source_labels: [__address__]
        action: replace
        target_label: __address__
        regex: ([^:]+)(:[0-9]+)?
        replacement: ${1}:9090 # Assuming your pod metrics are on port 9090
      # Set the job name from the pod's namespace and app label
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_pod_label_app]
        action: replace
        target_label: job
        regex: (.*?);(.*)
        replacement: ${1}/${2}
      # Set the instance label to the pod name
      - source_labels: [__meta_kubernetes_pod_name]
        action: replace
        target_label: instance
In this configuration:
- job_name: 'kubernetes-pods' is a descriptive name for this scrape job.
- kubernetes_sd_configs with role: pod tells Prometheus to watch for Pod objects in Kubernetes.
- The relabel_configs section is where the magic happens to filter and transform the discovered targets.
  - The first rule keeps only those pods that have the annotation prometheus.io/scrape: "true". This is how you opt in specific pods for Prometheus to scrape.
  - The second rule replaces the __address__ label. It extracts the pod’s IP address and sets the port to 9090, which assumes the application inside the pod exposes metrics on port 9090. The rule hard-codes that port and never reads the prometheus.io/port annotation. Without this rule, __address__ would be the pod IP combined with each declared container port, one target per port.
  - The third rule constructs the job label, overriding job_name. It concatenates the Kubernetes namespace and the app label from the pod, separated by a slash. So a pod with label app: my-app in namespace production would get a job name like production/my-app.
  - The fourth rule sets the instance label to the name of the pod (__meta_kubernetes_pod_name).
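The second rule above fixes the port at 9090 for every pod. If you would rather let each pod choose its own port and path through annotations, a widely used pattern reads prometheus.io/port and prometheus.io/path during relabeling. The fragment below is a sketch of rules that would take the place of the hard-coded address rewrite; the annotation names are a convention, not something Prometheus interprets on its own.

```yaml
relabel_configs:
  # Rewrite the address to <pod-ip>:<annotated-port> when prometheus.io/port is set.
  # Pods without the annotation do not match the regex and keep their discovered address.
  - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
    action: replace
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
    target_label: __address__
  # Override the metrics path when prometheus.io/path is set; otherwise /metrics is used.
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
    action: replace
    regex: (.+)
    target_label: __metrics_path__
```

The source labels are joined with a semicolon before matching, which is why the regex expects the address, a semicolon, and then the annotated port.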
Consider a pod with these labels and annotations:
apiVersion: v1
kind: Pod
metadata:
  name: my-app-pod-12345
  namespace: default
  labels:
    app: my-app
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "9090"     # Not read by the config above, which hard-codes 9090
    prometheus.io/path: "/metrics" # Not read by the config above; Prometheus defaults to /metrics
spec:
  containers:
    - name: my-app-container
      image: my-app-image
      ports:
        - containerPort: 8080 # Application port; not scraped, because the address is rewritten to 9090
        - containerPort: 9090 # Metrics port that Prometheus scrapes
With the Prometheus configuration above, this pod would be discovered as a target with:
- __address__: <pod-ip>:9090
- job: default/my-app
- instance: my-app-pod-12345
- __meta_kubernetes_pod_annotation_prometheus_io_scrape: true
- __meta_kubernetes_pod_annotation_prometheus_io_port: 9090
The kubernetes_sd_configs with role: pod is powerful because it allows Prometheus to discover and scrape metrics directly from individual pods without needing to define a Kubernetes Service for every single application that exposes metrics. This is particularly useful for ephemeral workloads or services that don’t require a stable external endpoint.
A common pitfall is forgetting to add the prometheus.io/scrape: "true" annotation to your pods. Without it, Prometheus still discovers the pod, but the keep rule drops it during relabeling, so it is never scraped. Another common issue is a mismatch between the port Prometheus scrapes (hard-coded to 9090 in the configuration above, or taken from prometheus.io/port in configurations that read that annotation) and the port your application actually serves metrics on.
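A related way to lose the annotation is to put it on the wrong object. For pods managed by a Deployment, it has to sit on the pod template, because annotations on the Deployment’s own metadata are not copied to its pods. A minimal sketch, with hypothetical names reused from the pod example:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                       # hypothetical name
  # Annotations placed here describe the Deployment only, not its pods.
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
      annotations:
        prometheus.io/scrape: "true" # Quoted: annotation values must be strings
    spec:
      containers:
        - name: my-app-container
          image: my-app-image
          ports:
            - containerPort: 9090    # Metrics port assumed by the scrape config
```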
The next concept you’ll likely encounter is how to scrape metrics from nodes themselves, using role: node in your kubernetes_sd_configs, and how to integrate that with other discovery roles like role: service and role: endpoints.