A covering index is often overkill, but when it’s right, it’s so right that it can make a query feel like it’s magically fast.

Let’s look at a common scenario: fetching a few specific columns from a large table for reporting. Imagine a users table with millions of rows, and you need id and email for all users who signed up in the last month.

-- users table (simplified)
CREATE TABLE users (
    id INT AUTO_INCREMENT PRIMARY KEY,
    username VARCHAR(50) NOT NULL,
    email VARCHAR(100) NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Sample data (millions of rows)
INSERT INTO users (username, email, created_at) VALUES
('alice', 'alice@example.com', NOW() - INTERVAL 5 DAY),
('bob', 'bob@example.com', NOW() - INTERVAL 10 DAY),
('charlie', 'charlie@example.com', NOW() - INTERVAL 35 DAY); -- And many more...

A naive query might look like this:

SELECT id, email
FROM users
WHERE created_at >= DATE_SUB(CURDATE(), INTERVAL 1 MONTH);

If you just have a regular index on created_at (e.g., CREATE INDEX idx_created_at ON users(created_at);), the database will:

  1. Use idx_created_at to find all the rows matching the WHERE clause.
  2. For each of those rows, it will then have to go back to the main users table (the "heap" or "clustered index") to fetch the id and email columns. This second trip to the table is called a "table lookup" or "bookmark lookup."

This is where a covering index shines. A covering index is a B-tree index that includes all the columns required by a query: both the columns in the WHERE clause and the columns in the SELECT list.

Let’s create a covering index for our query:

CREATE INDEX idx_users_covering ON users(created_at, id, email);

Now, when the query SELECT id, email FROM users WHERE created_at >= DATE_SUB(CURDATE(), INTERVAL 1 MONTH); runs, the database can do something amazing. It finds the relevant rows using the created_at part of the index, and because id and email are also in the same index leaf node, it can retrieve all the necessary data directly from the index itself. It doesn’t need to touch the main table data at all. This eliminates the table lookups, often leading to a dramatic performance improvement for queries that can be fully satisfied by an index.

The key is that the index structure, in this case, contains everything the query needs. The database reads the index, and it’s done. No need to navigate to the main table data pages.

The mental model is that the index isn’t just a pointer to rows; it can be the data for specific queries. Think of it like a specialized, pre-sorted, and abridged phone book that has only the names and numbers you’re looking for, sorted by neighborhood. If you need names and numbers in a specific neighborhood, you just flip through that section of the specialized phone book – you don’t need to go back to the main, full phone book to look up each number.

The most surprising thing about covering indexes is how often they are misunderstood as just "an index with extra columns." They aren’t just extra; the order and completeness of the columns in the index are what enable the "covering" behavior. The database engine specifically checks if all requested columns are present in the index’s leaf nodes. If even one column is missing, it’s not a covering index for that query, and table lookups might still occur.

You’ve successfully designed a covering index. The next step is to consider how to maintain it.

Want structured learning?

Take the full Performance course →