The rate() function in Prometheus doesn’t "handle" counter resets as a special case you have to opt into; compensating for them is built into how it calculates its result.
Let’s see this in action. Imagine a simple counter metric http_requests_total that increments with every incoming HTTP request.
http_requests_total{job="my-app", handler="/health"}
If this metric’s value is 100 at time t1 and 105 at time t2 (a few seconds later), the rate of increase is (105 - 100) / (t2 - t1). Pretty straightforward.
But what if the my-app service restarts between t1 and t2? The counter might reset to 0 and then increment to 5 by time t2. Naively calculating (5 - 100) / (t2 - t1) would give a nonsensical negative rate. This is where rate() shines.
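To make the problem concrete, here is a minimal Python sketch of the naive two-point calculation. The helper name `naive_rate` and the 15-second gap between t1 and t2 are assumptions chosen for illustration; nothing here is Prometheus code.

```python
def naive_rate(v1, t1, v2, t2):
    """Per-second rate from two samples, with no reset handling."""
    return (v2 - v1) / (t2 - t1)

# Normal case: the counter goes from 100 to 105 over an assumed 15 seconds.
normal = naive_rate(100, 0, 105, 15)

# Restart case: the counter reset to 0 and climbed to 5 in the same interval.
after_restart = naive_rate(100, 0, 5, 15)

print(normal)         # positive, roughly 0.33 requests/second
print(after_restart)  # negative, roughly -6.33 requests/second
```

The second result is the nonsensical negative rate that a reset-aware calculation has to avoid.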
When rate(http_requests_total[5m]) is evaluated, Prometheus looks at the samples that fall within the 5m window. Whenever a sample is lower than the one before it, the drop is assumed to be a counter reset. The function adds up the increases between consecutive samples, and at a reset it assumes the counter restarted from 0, so the whole post-reset value counts as new increase. This effectively stitches together the counter’s value across the reset, giving you the rate of increase as if the reset never happened. The one thing it cannot recover is any increments that occurred after the last scrape before the restart, because Prometheus never saw them.
The core of rate()'s magic lies in its lookback window and how it interprets the data points. It samples data points within the specified range (e.g., [5m]). If it sees a sequence like ... 100 (t1), 105 (t2), 0 (t3 - reset), 5 (t4) ..., and you query rate(http_requests_total[5m]) where t4 is within the 5m window ending now, it will:
- See the drop from 105 to 0 at t3 as a reset.
- Count the increase after the reset as if the counter had restarted from 0, so the samples from t3 to t4 contribute 5. The value at t2 (105) is not subtracted from the value at t4 (5).
- If t1 is also within the 5m window, include the increase from t1 to t2 (5) as well, for a total increase of 10.
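The reset-compensation idea can be sketched in a few lines of Python. This is a simplified illustration, not Prometheus's implementation: the function name `increase_with_resets` is hypothetical, and the sketch ignores the extrapolation to the window boundaries that the real rate() also performs.

```python
def increase_with_resets(values):
    """Total counter increase across samples, treating any drop as a reset to zero."""
    total = 0.0
    for prev, curr in zip(values, values[1:]):
        if curr < prev:
            # Counter reset: assume it restarted from 0, so the whole
            # current value counts as new increase.
            total += curr
        else:
            total += curr - prev
    return total

samples = [100, 105, 0, 5]  # values at t1, t2, t3 (reset), t4
print(increase_with_resets(samples))  # 10.0
```

The result is 5 from before the reset plus 5 from after it, rather than the -95 a plain last-minus-first subtraction would give.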
The rate() function also automatically scales the result to "per second." So, if the counter increased by 10 in 20 seconds, rate() will report 0.5 (10 requests / 20 seconds). The increase() function is similar but returns the total increase over the period, not per second, and also handles resets.
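The relationship between the two functions can be sketched as follows. The names `simple_increase` and `simple_rate` and the sample timestamps are assumptions for illustration. The sketch divides by the time between the first and last sample, whereas Prometheus extrapolates the increase toward the window boundaries and divides by the window length, so real results can differ somewhat.

```python
def simple_increase(samples):
    """Reset-aware total increase over a list of (timestamp_seconds, value) samples."""
    total = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        # A drop means the counter restarted from 0, so count the new value in full.
        total += curr if curr < prev else curr - prev
    return total

def simple_rate(samples):
    """Per-second rate: the increase divided by the elapsed time between samples."""
    elapsed = samples[-1][0] - samples[0][0]
    return simple_increase(samples) / elapsed

# Assumed timestamps: the counter resets between the 5s and 10s samples.
samples = [(0, 100), (5, 105), (10, 0), (20, 5)]
print(simple_increase(samples))  # 10.0
print(simple_rate(samples))      # 0.5
```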
The rate() function only recognizes a reset when a sample is lower than the one before it. Any step that does not decrease is taken as a valid increase, so a counter that resets and then climbs past its previous value between two scrapes goes undetected. The calculation is performed on the raw samples within the chosen time window. If a counter resets, Prometheus doesn’t store a negative number; it stores the new, lower value. rate() detects this drop and compensates for it.
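A short sketch shows what "only decreases count as resets" means in practice. The helper `count_resets` is a hypothetical Python stand-in for the same idea as PromQL's resets() function, and the second series is an invented scenario in which a restart goes unnoticed.

```python
def count_resets(values):
    """Count drops between consecutive samples, the only resets visible in the data."""
    return sum(1 for prev, curr in zip(values, values[1:]) if curr < prev)

# The reset is visible because a sample is lower than its predecessor.
visible = [100, 105, 0, 5]

# Hypothetical: the service restarted and served 120 requests before the
# next scrape, so the stored samples never show a decrease.
hidden = [100, 120]

print(count_resets(visible))  # 1
print(count_resets(hidden))   # 0
```

In the second case the data looks like an ordinary increase of 20, even though the restarted process actually served 120 requests.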
If you’re seeing unexpected dips in your rate() graphs, a likely cause is a service restart. Increments made after the last scrape before the restart are never recorded, so the computed rate around the reset can come out lower than the true rate. The underlying Prometheus server is robust to this; it’s the interpretation of the data that matters.
The next conceptual hurdle is understanding how avg_over_time() differs from rate() when dealing with these same counter resets.