Brotli is often hailed as the successor to Gzip, but its real strength is that it produces significantly smaller files for text-based assets, even at compression speeds comparable to Gzip’s.
Let’s see Brotli in action. Imagine you have a small HTML file:
index.html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Example Page</title>
</head>
<body>
<h1>Hello, Brotli!</h1>
<p>This is a sample paragraph to demonstrate compression.</p>
</body>
</html>
Using Gzip at its highest compression level (-9), we get:
gzip -c -9 index.html > index.html.gz
ls -lh index.html index.html.gz
Output:
-rw-r--r-- 1 user user 210 Mar 10 10:30 index.html
-rw-r--r-- 1 user user 178 Mar 10 10:30 index.html.gz
The Gzip compressed file is 178 bytes.
Now, let’s compress the same file with Brotli at its highest quality level (-q 11):
brotli -c -q 11 index.html > index.html.br
ls -lh index.html index.html.br
Output:
-rw-r--r-- 1 user user 210 Mar 10 10:30 index.html
-rw-r--r-- 1 user user 161 Mar 10 10:30 index.html.br
The Brotli compressed file is 161 bytes, 17 bytes (roughly 10%) smaller than the Gzip output. The saving looks small on a single tiny file, but it adds up when multiplied across thousands of assets.
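Because ls -lh rounds sizes once files grow past a kilobyte, exact byte counts are easier to compare. The following is a minimal sketch rather than a benchmark: it assumes a POSIX shell with gzip installed, treats the brotli CLI as optional, and generates a made-up sample file when no path is passed as the first argument. The file names and the sample content are hypothetical.

```shell
#!/bin/sh
# Sketch: print exact byte counts for a file and its compressed variants.
# Pass a path (e.g. index.html) as $1, or omit it to use a generated sample.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Exact size in bytes (tr strips the padding some wc versions add).
size() { wc -c < "$1" | tr -d ' '; }

input=${1:-}
if [ -z "$input" ]; then
  # No file given: generate a small, repetitive HTML-like sample (made up).
  input="$workdir/sample.html"
  i=0
  while [ "$i" -lt 50 ]; do
    printf '<li class="item"><a href="/page/%d">Page %d</a></li>\n' "$i" "$i"
    i=$((i + 1))
  done > "$input"
fi

orig_size=$(size "$input")
gzip -9 -c "$input" > "$workdir/out.gz"
gz_size=$(size "$workdir/out.gz")
echo "original:     $orig_size bytes"
echo "gzip -9:      $gz_size bytes"

# The brotli CLI is optional here; skip it if it is not installed.
if command -v brotli >/dev/null 2>&1; then
  brotli -q 11 -c "$input" > "$workdir/out.br"
  echo "brotli -q 11: $(size "$workdir/out.br") bytes"
else
  echo "brotli -q 11: skipped (brotli CLI not found)"
fi

# Sanity check: decompressing must reproduce the input byte for byte.
if gzip -dc "$workdir/out.gz" | cmp -s - "$input"; then
  roundtrip=ok
else
  roundtrip=failed
fi
echo "gzip round trip: $roundtrip"
```

Exact numbers depend on the input and on the tool versions, so treat any single run as an illustration.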
The core problem Brotli solves is the ever-increasing demand for bandwidth and faster load times in web applications. As web pages become richer with dynamic content, JavaScript, and CSS, the size of assets can balloon. Gzip, while effective, is built on the DEFLATE format, which limits back-references to a 32 KiB window, so redundancy spread across a larger file goes unexploited. Brotli, developed by Google, was designed to overcome these limitations by leveraging more advanced compression techniques.
Internally, Brotli employs several strategies that differentiate it from Gzip. Both formats are built on LZ77 matching followed by Huffman coding, but Brotli can look back much further: its sliding window can be as large as 16 MiB, far beyond Gzip’s 32 KiB. Brotli also ships with a static dictionary of roughly 120 KiB derived from common web content, so frequently occurring words and phrases can be referenced even when they have not yet appeared in the file. Gzip has no built-in equivalent. Crucially, Brotli adds context modeling: the Huffman code used for a symbol is selected based on the bytes that precede it, so each context gets a code tuned to its own symbol statistics. This combination allows Brotli to identify and exploit redundancies that Gzip might miss.
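One kind of redundancy Gzip misses can be shown directly. DEFLATE can only point back 32 KiB, whereas Brotli's window can be far larger, so a repeat that sits further back is invisible to Gzip. The sketch below builds two files, each consisting of a random block written twice, and compresses both. It assumes a POSIX shell with gzip and /dev/urandom, and treats the brotli CLI as optional. The block sizes and file names are arbitrary choices for illustration.

```shell
#!/bin/sh
# Sketch: a repeat that lies more than 32 KiB back is invisible to gzip.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

size() { wc -c < "$1" | tr -d ' '; }

# Write BLOCK + BLOCK to $2, where BLOCK is $1 random bytes.
# Random bytes do not compress, so the repeat is the only redundancy.
make_repeat() {
  head -c "$1" /dev/urandom > "$workdir/block"
  cat "$workdir/block" "$workdir/block" > "$2"
}

make_repeat 16000 "$workdir/near.bin"  # repeat distance: 16,000 bytes
make_repeat 48000 "$workdir/far.bin"   # repeat distance: 48,000 bytes

gzip -9 -c "$workdir/near.bin" > "$workdir/near.gz"
gzip -9 -c "$workdir/far.bin" > "$workdir/far.gz"
near_gz=$(size "$workdir/near.gz")
far_gz=$(size "$workdir/far.gz")

echo "near.bin: $(size "$workdir/near.bin") bytes -> gzip $near_gz bytes"
echo "far.bin:  $(size "$workdir/far.bin") bytes -> gzip $far_gz bytes"

# Optional: Brotli's larger window should be able to reach the far repeat.
if command -v brotli >/dev/null 2>&1; then
  brotli -q 11 -c "$workdir/far.bin" > "$workdir/far.br"
  echo "far.bin with brotli -q 11: $(size "$workdir/far.br") bytes"
fi
```

With gzip, near.bin should shrink to roughly the size of one block, while far.bin stays close to its full 96,000 bytes. If brotli is installed, it should bring far.bin down to roughly one block as well.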
The primary lever you control with Brotli is its quality setting, ranging from -q 0 (fastest, least compression) to -q 11 (slowest, best compression). For web servers, this translates directly to a trade-off between CPU time spent compressing and the amount of data transferred to the client. When responses are compressed on the fly, that CPU cost is paid on every response, so most web servers and CDNs default to a quality around -q 4 to -q 6, which provides a good balance. Static assets can be compressed once, offline or in the build process, and then served repeatedly, so -q 11 is often the preferred choice for them.
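A quick way to get a feel for the quality setting is to sweep it over a representative file and compare sizes. The sketch below is one way to do that, assuming a POSIX shell with gzip installed and the brotli CLI as an optional extra. The JSON-like sample is generated and entirely hypothetical, so substitute a real asset for meaningful numbers.

```shell
#!/bin/sh
# Sketch: sweep compression levels and print the resulting sizes.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

size() { wc -c < "$1" | tr -d ' '; }

# Hypothetical sample: JSON-like records with some repetition.
sample="$workdir/sample.json"
i=0
while [ "$i" -lt 2000 ]; do
  printf '{"id": %d, "name": "user%d", "group": %d, "active": true}\n' \
    "$i" "$((i * 7919 % 1000))" "$((i % 13))"
  i=$((i + 1))
done > "$sample"
echo "original:     $(size "$sample") bytes"

# Gzip baseline at its fastest and strongest levels.
gzip -1 -c "$sample" > "$workdir/fast.gz"
gzip -9 -c "$sample" > "$workdir/best.gz"
gz_fast=$(size "$workdir/fast.gz")
gz_best=$(size "$workdir/best.gz")
echo "gzip -1:      $gz_fast bytes"
echo "gzip -9:      $gz_best bytes"

# Brotli sweep across the quality range, if the CLI is available.
if command -v brotli >/dev/null 2>&1; then
  for q in 0 2 4 6 8 10 11; do
    brotli -q "$q" -c "$sample" > "$workdir/q$q.br"
    echo "brotli -q $q: $(size "$workdir/q$q.br") bytes"
  done
else
  echo "brotli CLI not found; skipping the quality sweep"
fi
```

Sizes are only half of the trade-off. Running an individual brotli command under time shows the CPU cost that each quality level adds.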
Brotli’s effectiveness isn’t uniform across all data types. It shines for text-based assets like HTML, CSS, JavaScript, and JSON, but its advantage over Gzip shrinks to almost nothing for media files such as images, audio, and video. These formats are usually already compressed with specialized codecs (e.g., JPEG for images, MP3 for audio), so there is little redundancy left for Brotli to exploit. Compressing them again yields negligible savings, and sometimes a slightly larger file, while still costing CPU time.
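The effect is easy to reproduce without any media files. The sketch below uses random bytes as a stand-in for already-compressed data, which is an assumption: real JPEG or MP3 files still contain headers and metadata, but their payload is similarly hard to compress. It assumes a POSIX shell with gzip and /dev/urandom, with the brotli CLI optional.

```shell
#!/bin/sh
# Sketch: compressing data that has no redundancy left.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

size() { wc -c < "$1" | tr -d ' '; }

# Stand-in for an already-compressed media file: 100,000 random bytes.
media="$workdir/media.bin"
head -c 100000 /dev/urandom > "$media"
orig_size=$(size "$media")

gzip -9 -c "$media" > "$workdir/media.gz"
gz_size=$(size "$workdir/media.gz")
echo "original:     $orig_size bytes"
echo "gzip -9:      $gz_size bytes"

# Optional: repeat the experiment with Brotli if the CLI is installed.
if command -v brotli >/dev/null 2>&1; then
  brotli -q 11 -c "$media" > "$workdir/media.br"
  echo "brotli -q 11: $(size "$workdir/media.br") bytes"
fi
```

The gzip output comes out slightly larger than the input, because the format's headers and block framing are added on top of data that cannot be shrunk. In practice this is why servers are usually configured to compress only text-like content types.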
The next step in optimizing web asset delivery often involves exploring HTTP/2 or HTTP/3, which introduce multiplexing and other performance enhancements that complement content compression.