Prometheus’s time-series database (TSDB) doesn’t just store data; it actively reorganizes and discards it to maintain performance and manage disk space.

Let’s see how this plays out with a real Prometheus setup. Imagine Prometheus is scraping metrics from a few hundred targets, collecting millions of data points per minute.

# prometheus.yml

global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'node_exporter'
    static_configs:
      - targets: ['localhost:9100']

# Retention is set with command-line flags passed to the prometheus binary:
#
#   prometheus \
#     --config.file=prometheus.yml \
#     --storage.tsdb.path=data/ \
#     --storage.tsdb.retention.time=168h
#
# --storage.tsdb.retention.time
#     How long to keep samples before their blocks are deleted.
#     Default is 15d. Setting this to 168h keeps data for 7 days.
#
# --storage.tsdb.retention.size
#     Optional cap on the disk space used by stored blocks.
#     When both limits are set, whichever is reached first applies.
#
# Compaction has no schedule or byte-size settings to tune. The in-memory
# head block is written to disk as a 2-hour block, and those blocks are
# later merged into larger ones covering up to 10% of the retention time
# or 31 days, whichever is smaller.

The problem Prometheus solves is the unbounded growth of time-series data: every scrape adds samples, so storing everything indefinitely would eventually consume all available disk space. Compaction and retention are Prometheus’s two primary mechanisms for managing this.

Internally, Prometheus stores data in blocks. Each block is a directory containing chunk files with the samples, an index, and a meta.json file that records the block’s time range. New samples do not go straight into a block on disk: they are appended to an in-memory head block, backed by a write-ahead log (WAL), and by default the head is written out as an immutable two-hour block. Compaction is the process of merging smaller, older blocks into larger ones. This reduces the number of files and improves query performance, because Prometheus reads from fewer, larger blocks. Retention is the policy for discarding data that is no longer needed, based on a time duration, a size limit, or both.
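To make the block layout concrete, here is a minimal Python sketch that walks a TSDB data directory and summarizes each block from its meta.json. The function name list_blocks and the shape of the returned dictionaries are made up for illustration; the field names it reads (ulid, minTime, maxTime, stats.numSamples, compaction.level) are the ones Prometheus writes into meta.json, with times in milliseconds.

```python
import json
from pathlib import Path


def list_blocks(data_dir):
    """Summarize every block directory (one with a meta.json) under data_dir."""
    blocks = []
    for meta_path in Path(data_dir).glob("*/meta.json"):
        meta = json.loads(meta_path.read_text())
        blocks.append({
            "ulid": meta["ulid"],
            "min_time_ms": meta["minTime"],
            # Block time ranges are stored in milliseconds since the epoch.
            "hours": (meta["maxTime"] - meta["minTime"]) / 3_600_000,
            # Level 1 is a freshly cut block; higher levels are merge results.
            "level": meta.get("compaction", {}).get("level"),
            "samples": meta.get("stats", {}).get("numSamples"),
        })
    # Oldest block first, so the output reads like a timeline.
    blocks.sort(key=lambda b: b["min_time_ms"])
    return blocks
```

Calling list_blocks("data/") on a running server's data directory should show a mix of short, low-level blocks and longer blocks with a higher compaction level. Directories without a meta.json, such as the WAL, are skipped.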

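Retention is normally set with the --storage.tsdb.retention.time flag when starting Prometheus, and it defaults to 15d. If you pass --storage.tsdb.retention.time=168h, Prometheus keeps samples for seven days. Deletion works on whole blocks: a block is removed only once all of its data is past the retention window, and the cleanup runs in the background, so some data can linger a little longer than the configured period. A size-based limit is also available through --storage.tsdb.retention.size.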
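The block-level nature of retention is easier to see in code. The sketch below is a simplified model, not Prometheus's implementation: blocks_past_retention is a hypothetical helper, and it assumes the retention window is measured back from the end of the newest block, which is my reading of how the TSDB decides this.

```python
HOUR_MS = 3_600_000


def blocks_past_retention(blocks, retention_ms):
    """Return the (min_time_ms, max_time_ms) blocks that are fully expired.

    Assumption: a block is expired when its end lies at least retention_ms
    before the end of the newest block.
    """
    if not blocks:
        return []
    newest_end = max(end for _, end in blocks)
    return [
        (start, end)
        for start, end in blocks
        if newest_end - end >= retention_ms
    ]


blocks = [
    (0, 2 * HOUR_MS),               # ends 168h before the newest block ends
    (2 * HOUR_MS, 4 * HOUR_MS),     # ends 166h before: still inside the window
    (168 * HOUR_MS, 170 * HOUR_MS), # newest block
]
expired = blocks_past_retention(blocks, 168 * HOUR_MS)
```

Here only the first block is returned. The second one still holds samples inside the seven-day window, so the whole block stays even though part of it is older than the window.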

Compaction is more about optimizing storage, and it has very little to tune. Prometheus does not expose byte-size knobs for chunks or blocks; it works in time ranges instead. The initial two-hour blocks are merged into progressively longer blocks, up to 10% of the retention time or 31 days, whichever is smaller. The advanced flags --storage.tsdb.min-block-duration and --storage.tsdb.max-block-duration bound those ranges, but they are rarely changed outside of setups that ship blocks to remote storage. This merging is crucial for reducing the overhead of managing many small blocks.
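As a rough illustration of how block sizes grow, the sketch below computes the ladder of block durations for a given retention period. It is a model under stated assumptions rather than Prometheus's code: block_ranges_hours is a hypothetical helper, and the growth factor of 3 and the limit of 10 steps are assumptions about the TSDB's defaults. The 2-hour starting size and the cap of 10% of retention or 31 days come from the description above.

```python
def block_ranges_hours(retention_hours, min_block_hours=2, steps=10, multiplier=3):
    """Return the block durations (in hours) compaction can produce.

    Assumptions: each level is `multiplier` times the previous one, and
    levels longer than the maximum block duration are dropped.
    """
    # Cap: 10% of the retention time or 31 days, whichever is smaller.
    max_block_hours = min(retention_hours * 0.10, 31 * 24)
    ranges = []
    size = min_block_hours
    for _ in range(steps):
        if size > max_block_hours:
            break
        ranges.append(size)
        size *= multiplier
    return ranges


default_ladder = block_ranges_hours(15 * 24)  # 15d retention, the default
week_ladder = block_ranges_hours(168)         # 7d retention
```

Under these assumptions a shorter retention period yields a shorter ladder, which means more, smaller blocks on disk for the same amount of data.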

The one thing most people don’t realize is that compaction doesn’t run on a schedule you configure; it’s a continuous background process. Prometheus regularly re-evaluates its blocks, picks candidates for merging, and performs the merges asynchronously. Compaction is also triggered when the head block spans enough time to be written to disk, so the work follows the data rather than a user-set timer.
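Because compaction runs in the background, the practical way to observe it is through the metrics Prometheus exposes about itself. The following sketch pulls the compaction counters out of a text-format metrics payload. The helper compaction_counters is hypothetical, and it assumes you have already fetched the body of the server's /metrics endpoint as a string; the metric names are ones Prometheus exports for its TSDB.

```python
WANTED = (
    "prometheus_tsdb_compactions_total",
    "prometheus_tsdb_compactions_failed_total",
    "prometheus_tsdb_compactions_triggered_total",
)


def compaction_counters(exposition_text):
    """Extract the TSDB compaction counters from Prometheus text-format metrics."""
    values = {}
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blanks and the "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        name, _, rest = line.partition(" ")
        if name in WANTED and rest:
            # The value is the first field; an optional timestamp may follow.
            values[name] = float(rest.split()[0])
    return values


sample = """\
# HELP prometheus_tsdb_compactions_total Total number of compactions.
# TYPE prometheus_tsdb_compactions_total counter
prometheus_tsdb_compactions_total 42
prometheus_tsdb_compactions_failed_total 0
prometheus_tsdb_head_series 1200
"""
counters = compaction_counters(sample)
```

Sampling these counters over time shows compactions accumulating between the two-hour head cuts, and a rising failed counter is an early sign of disk or permission problems.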

If you’ve configured retention and compaction, the next thing you’ll likely encounter is optimizing query performance, which often involves understanding the impact of these settings on index size and query execution plans.
