Brotli, despite being newer, is often somewhat slower to decompress than Gzip, which might seem counterintuitive given its superior compression ratios.
Let’s see how these two play out in a real-world scenario. Imagine you’re serving a static HTML file, say index.html, which is 100KB uncompressed.
Here’s how you might compress it using gzip:
gzip -k -9 index.html
This creates index.html.gz. The -k flag keeps the original file, and -9 selects Gzip's highest compression level. The file size might shrink to around 20KB.
Now, let’s do the same with brotli:
brotli -k -q 11 index.html
This creates index.html.br. The file size might shrink even further, perhaps to 15KB. Notice the -q 11, which selects the highest quality level (the range is 0 to 11) for maximum compression.
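To see the size difference without touching the command line, here is a minimal Python sketch. It assumes a made-up, highly repetitive HTML payload (so the ratios will look far better than for a real page), and it treats the third-party brotli package as optional: if it is not installed, only the Gzip figure is reported.

```python
import gzip

# Stand-in for index.html: repetitive markup, roughly 100 KB (made-up content).
html = ('<div class="item"><a href="/page">Link text</a></div>\n' * 2000).encode("utf-8")

def compressed_sizes(data: bytes) -> dict:
    """Return the byte count of `data` under each available codec."""
    sizes = {
        "identity": len(data),
        "gzip -9": len(gzip.compress(data, compresslevel=9)),
    }
    try:
        import brotli  # third-party package; skipped if it is not installed
        sizes["brotli -q 11"] = len(brotli.compress(data, quality=11))
    except ImportError:
        pass
    return sizes

for name, size in compressed_sizes(html).items():
    print(f"{name:>12}: {size} bytes")
```

Swapping in the bytes of a real page (for example, read from index.html) gives numbers that are more representative than this synthetic input.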
When a browser requests this file, the web server needs to serve the compressed version. If the browser supports Brotli (most modern ones do), the server might send index.html.br with a Content-Encoding: br response header. The browser then decompresses it. This decompression step is where the performance difference often appears. While Brotli achieved a smaller file, its decompression algorithm can be more computationally intensive than Gzip's.
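Decompression cost is easy to measure rather than guess. The following is a rough timing sketch, not a rigorous benchmark: the helper name time_decompress and the sample payload are made up for illustration, and the Brotli measurement only runs if the third-party brotli package happens to be installed.

```python
import gzip
import time

def time_decompress(decompress, payload: bytes, repeats: int = 200) -> float:
    """Average seconds per call of `decompress(payload)` over `repeats` runs."""
    start = time.perf_counter()
    for _ in range(repeats):
        decompress(payload)
    return (time.perf_counter() - start) / repeats

data = b"<p>Hello, compression!</p>\n" * 4000  # made-up sample payload
results = {"gzip": time_decompress(gzip.decompress, gzip.compress(data, 9))}

try:
    import brotli  # third-party; measured only if this import succeeds
    results["brotli"] = time_decompress(
        brotli.decompress, brotli.compress(data, quality=11)
    )
except ImportError:
    pass

for name, seconds in results.items():
    print(f"{name}: {seconds * 1e6:.1f} microseconds per decompress")
```

Results vary with the machine, the library builds, and the input, so treat the numbers as indicative only.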
The primary problem Brotli solves is delivering assets over the network more efficiently. By achieving higher compression ratios than Gzip, it reduces the amount of data that needs to be transferred. This leads to faster page load times, especially on slower networks or for users with data caps. It’s a win for bandwidth and perceived performance.
Internally, Brotli uses a combination of techniques. Like Gzip's DEFLATE, it performs LZ77-style matching over a sliding window, but the window can be far larger (up to 16 MB, versus DEFLATE's 32 KB). On top of that it ships with a built-in static dictionary of common words and phrases, something Gzip does not have at all. The output is entropy-coded with Huffman codes that are chosen by context modeling; Brotli does not use arithmetic coding. The predefined dictionary is a key reason for its better compression, especially on small text files, but much of its efficiency comes from the way it encodes the matches and literal characters.
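The LZ77-style matching mentioned above can be illustrated with a toy encoder. This is a deliberately naive sketch (greedy, quadratic-time search) and is not how Brotli or Gzip actually find or encode matches; the function names, the token format, and the 32 KB default window are assumptions made for the example.

```python
def lz77_tokens(data: bytes, window: int = 32 * 1024, min_match: int = 3):
    """Greedy toy LZ77: yield ('lit', byte) or ('copy', distance, length)."""
    i = 0
    while i < len(data):
        best_len, best_dist = 0, 0
        # Try every earlier start position inside the sliding window.
        for j in range(max(0, i - window), i):
            length = 0
            # A match may run past position i (overlapping copies are allowed).
            while i + length < len(data) and data[j + length] == data[i + length]:
                length += 1
            if length > best_len:
                best_len, best_dist = length, i - j
        if best_len >= min_match:
            yield ("copy", best_dist, best_len)
            i += best_len
        else:
            yield ("lit", data[i])
            i += 1

def lz77_decode(tokens) -> bytes:
    """Rebuild the original bytes from the token stream."""
    out = bytearray()
    for tok in tokens:
        if tok[0] == "lit":
            out.append(tok[1])
        else:
            _, dist, length = tok
            for _ in range(length):
                out.append(out[-dist])  # byte-by-byte so overlaps work
    return bytes(out)

sample = b"<li>one</li><li>two</li>"
tokens = list(lz77_tokens(sample))
print(tokens)
print(lz77_decode(tokens) == sample)
```

A real compressor then entropy-codes these literals and (distance, length) pairs, which is the stage where Brotli's context modeling comes in.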
The levers you control are primarily the compression level and the choice of algorithm. For web servers, you configure them to serve .br files when the request's Accept-Encoding header lists br, and .gz files if it lists gzip but not br. If neither is listed, you serve the uncompressed file. The compression levels (-q 0 to -q 11 for Brotli, -1 to -9 for Gzip) directly impact the trade-off between compression ratio and the time it takes to compress. Higher levels mean better compression but longer compression times, which matters little for static files compressed once ahead of time and a great deal for responses compressed on the fly.
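The selection rule described above can be sketched as a small function. This is a simplified illustration, not a complete implementation of HTTP content negotiation: the name pick_variant is hypothetical, quality values are only used to detect an explicit q=0 refusal, and the * wildcard is ignored.

```python
def pick_variant(accept_encoding: str, path: str = "index.html") -> tuple:
    """Return (file_to_serve, content_encoding) for an Accept-Encoding value."""
    accepted = set()
    for part in accept_encoding.split(","):
        token, _, params = part.strip().partition(";")
        token = token.strip().lower()
        params = params.strip().lower()
        q = 1.0
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat a malformed weight as "not acceptable"
        # A coding listed with q=0 is explicitly refused by the client.
        if token and q > 0:
            accepted.add(token)
    if "br" in accepted:
        return path + ".br", "br"
    if "gzip" in accepted:
        return path + ".gz", "gzip"
    return path, None  # serve the uncompressed file

print(pick_variant("gzip, deflate, br"))
print(pick_variant("gzip, deflate"))
print(pick_variant(""))
```

Whatever variant is chosen, the response should also carry Vary: Accept-Encoding so that caches keep the compressed and uncompressed versions apart.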
What most people don’t realize is that Brotli’s higher compression ratio isn’t solely due to a larger window or a more advanced entropy coder. A significant factor is its built-in static dictionary of roughly 13,000 words and common substrings (about 120 KB), covering several languages as well as HTML and JavaScript fragments. A set of built-in transforms expands these entries into many more variants. This dictionary is derived from a large corpus of web content and is highly optimized for it. When Brotli encounters these common sequences, it can represent them with very few bits, which helps most on small files that have little internal repetition. Gzip has no predefined dictionary at all; it can only refer back to the previous 32 KB of the same stream.
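The effect of a predefined dictionary can be tried out with the standard library alone. The sketch below does not use Brotli; it swaps in zlib's preset-dictionary feature (the zdict argument) as a stand-in for the same idea. The PRESET bytes are a hypothetical mini-dictionary, not Brotli's real one, and the sample page is made up.

```python
import zlib

# Hypothetical mini-dictionary of strings common in HTML.
# This is not Brotli's real dictionary, just a stand-in for the idea.
PRESET = (
    b'<!DOCTYPE html><html><head><meta charset="utf-8">'
    b'<title></title></head><body><div class="'
)

def deflate(data: bytes, zdict: bytes = b"") -> bytes:
    """Compress with zlib, optionally primed with a preset dictionary."""
    if zdict:
        comp = zlib.compressobj(level=9, zdict=zdict)
    else:
        comp = zlib.compressobj(level=9)
    return comp.compress(data) + comp.flush()

def inflate(blob: bytes, zdict: bytes = b"") -> bytes:
    """Decompress; the same dictionary must be supplied if one was used."""
    decomp = zlib.decompressobj(zdict=zdict) if zdict else zlib.decompressobj()
    return decomp.decompress(blob) + decomp.flush()

page = (
    b'<!DOCTYPE html><html><head><meta charset="utf-8"><title>Hi</title>'
    b'</head><body><div class="x">Hello</div></body></html>'
)
plain = deflate(page)
primed = deflate(page, PRESET)
print(len(page), len(plain), len(primed))
```

The primed output is smaller because the opening boilerplate can be encoded as references into the dictionary rather than as literal bytes. Both sides must agree on the dictionary in advance, which is why Brotli standardizes its dictionary as part of the format.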
The next major consideration is cache invalidation strategies for these compressed assets.