Recording rules in Prometheus let you pre-compute the results of expensive or commonly used queries and store them as new time series.
Let’s see this in action. Imagine you’re monitoring a fleet of web servers and frequently need to know the request rate per second for each server. A query like rate(http_requests_total[5m]) run every minute can be resource-intensive if you have thousands of servers. Instead, you can create a recording rule.
Here’s how you’d define it in your Prometheus configuration file (e.g., prometheus.yml):
rule_files:
- 'rules/recording_rules.yml'
And in rules/recording_rules.yml:
groups:
- name: http_rules
rules:
- record: http:requests:rate5m
expr: rate(http_requests_total[5m])
After reloading Prometheus’s configuration, you’ll see a new metric http:requests:rate5m appear in your Prometheus UI’s "Graph" tab. Querying http:requests:rate5m will now return the same data as rate(http_requests_total[5m]), but Prometheus is doing the heavy lifting in the background. This new metric is an actual time series stored in Prometheus, just like your raw metrics.
The problem this solves is performance and efficiency. When you have complex queries involving aggregations and rate calculations over a large number of time series, running these queries repeatedly can put a significant load on your Prometheus server’s CPU and memory. By recording the result, you offload that computation. Prometheus evaluates the expr expression periodically (based on its evaluation_interval, typically 15 or 30 seconds) and stores the result as a new, distinct time series. This means when you query http:requests:rate5m, Prometheus is just reading pre-computed data, which is much faster and less resource-intensive than re-evaluating rate(http_requests_total[5m]) on the fly.
The key levers you control are the record and expr fields within a rule. record defines the name of the new time series that will be created. It’s conventional to use a colon-separated format to indicate the aggregation and the metric it’s derived from, like http:requests:rate5m. The expr field contains the PromQL expression whose results will be recorded. You can use any valid PromQL query here, including aggregations (sum, avg, count), vector matching, and function calls.
The evaluation of recording rules happens independently of scraping. Prometheus evaluates all rules in its evaluation_interval. If a rule’s expression returns an empty result, no new time series are generated for that evaluation period. This is important because it means recording rules don’t bloat your storage with metrics that have no data. You can also use unless or unless_each in your expressions to control how recording rules behave when certain labels are absent, preventing the creation of unwanted aggregated series.
When you create a recording rule, Prometheus doesn’t just store the result; it stores the result with all the labels from the expression’s output. This means if rate(http_requests_total[5m]) returns results for instance="server1" and instance="server2", the recorded metric http:requests:rate5m will also have those same instance labels. However, if your expr includes an aggregation like sum by (job) (rate(http_requests_total[5m])), the resulting time series will only have the job label, and the original instance labels will be lost for that recorded metric.
The next concept you’ll likely encounter is alerting rules, which leverage the same rule definition structure but trigger alerts based on PromQL expressions.