HTTP/2 Performance Secrets: Beyond the Hype

HTTP/2 is usually faster than HTTP/1.1, but not because it’s "more efficient" in the way you’re probably thinking. The real gain is that it can carry many requests and responses over a single TCP connection at the same time, and it does this by breaking every message into small frames that are interleaved on the wire.

Let’s see it in action. Imagine a browser requesting an HTML page, a CSS file, and an image.

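# On the server, using nghttpd (HTTP/2 server demo); cleartext on port 8080 is assumed here
# nghttpd -v --no-tls 8080

# In another terminal, using nghttp (HTTP/2 client demo); -a also fetches assets linked from the page
# nghttp -a http://localhost:8080/index.html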

# Server output might show something like this (schematic; the exact log format depends on the tool and version):
# ...
# [I 1018/103045] H2EventFrame {stream_id=1, type=HEADERS, flags=END_HEADERS, payload=...}
# [I 1018/103045] H2EventFrame {stream_id=1, type=DATA, flags=END_STREAM, payload=...} # HTML response
# [I 1018/103046] H2EventFrame {stream_id=3, type=HEADERS, flags=END_HEADERS, payload=...}
# [I 1018/103046] H2EventFrame {stream_id=3, type=DATA, flags=END_STREAM, payload=...} # CSS response
# [I 1018/103047] H2EventFrame {stream_id=5, type=HEADERS, flags=END_HEADERS, payload=...}
# [I 1018/103047] H2EventFrame {stream_id=5, type=DATA, flags=END_STREAM, payload=...} # Image response

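Notice how stream_id changes (1, 3, 5). Client-initiated streams always use odd, increasing identifiers (even ones are reserved for server-initiated streams), and each stream is an independent request/response exchange carried over the same TCP connection. In HTTP/1.1, a connection handles one request at a time, so a slow response blocks everything queued behind it (head-of-line blocking). Browsers work around this by opening several parallel TCP connections per host, each with its own setup cost.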
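To make the stream_id field concrete, here is a minimal Python sketch of the fixed 9-byte header that precedes every HTTP/2 frame: a 24-bit payload length, an 8-bit type, an 8-bit flags field, and a reserved bit followed by a 31-bit stream identifier. The function names are made up for this illustration, and only the two frame types and two flags seen in the trace are included.

```python
import struct

# Frame type and flag values defined by the HTTP/2 specification.
FRAME_TYPES = {0x0: "DATA", 0x1: "HEADERS"}
END_STREAM = 0x1
END_HEADERS = 0x4


def pack_frame_header(length, frame_type, flags, stream_id):
    """Build the 9-byte header: 24-bit length, type, flags, 31-bit stream id."""
    if length >= 2**24 or stream_id >= 2**31:
        raise ValueError("field out of range")
    # struct has no 3-byte integer, so pack 4 bytes and drop the leading one.
    return struct.pack(">I", length)[1:] + struct.pack(">BBI", frame_type, flags, stream_id)


def parse_frame_header(header):
    """Decode a 9-byte frame header into (length, type name, flags, stream id)."""
    length = int.from_bytes(header[0:3], "big")
    frame_type, flags, raw_stream = struct.unpack(">BBI", header[3:9])
    stream_id = raw_stream & 0x7FFFFFFF  # the top bit is reserved
    return length, FRAME_TYPES.get(frame_type, hex(frame_type)), flags, stream_id


# Hypothetical HEADERS frame for the CSS response on stream 3.
header = pack_frame_header(length=120, frame_type=0x1, flags=END_HEADERS, stream_id=3)
print(header.hex())                # 000078010400000003
print(parse_frame_header(header))  # (120, 'HEADERS', 4, 3)
```

Because every frame carries its own stream identifier, a receiver never has to guess which response a chunk of bytes belongs to.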

HTTP/2 addresses this, along with the cost of repeating the same headers on every request, with two key mechanisms: multiplexing and header compression.

Multiplexing is the ability to send multiple requests and responses concurrently over a single TCP connection. It works by breaking HTTP messages into smaller units called frames. Each frame carries a stream identifier, so the client and server can reassemble each message correctly even when frames from different streams are interleaved (within a single stream, frames still arrive in order, because TCP delivers bytes in order). This removes the HTTP-level head-of-line blocking of HTTP/1.1, where a slow response holds up every request queued behind it on the same connection.
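The idea can be sketched in a few lines of Python. This is a toy model, not a real HTTP/2 implementation: frames are plain tuples, the sender interleaves streams in simple round-robin order (real implementations also apply prioritization and flow control), and the helper names and the 8-byte frame size are chosen only for the example.

```python
from collections import defaultdict
from itertools import zip_longest


def to_frames(stream_id, body, max_size):
    """Split one message body into (stream_id, end_stream, chunk) frames."""
    chunks = [body[i:i + max_size] for i in range(0, len(body), max_size)] or [b""]
    return [(stream_id, i == len(chunks) - 1, chunk) for i, chunk in enumerate(chunks)]


def interleave(*frame_lists):
    """Round-robin frames from several streams onto one ordered connection."""
    wire = []
    for group in zip_longest(*frame_lists):
        wire.extend(frame for frame in group if frame is not None)
    return wire


def reassemble(wire):
    """Rebuild each message by appending chunks per stream id, in arrival order."""
    bodies, finished = defaultdict(bytes), set()
    for stream_id, end_stream, chunk in wire:
        bodies[stream_id] += chunk
        if end_stream:
            finished.add(stream_id)
    return {sid: body for sid, body in bodies.items() if sid in finished}


# Placeholder payloads standing in for the HTML page, CSS file, and image.
html = b"<html>" + b"x" * 20 + b"</html>"
css = b"body{margin:0}"
png = b"\x89PNG" + b"\x00" * 10

wire = interleave(to_frames(1, html, 8), to_frames(3, css, 8), to_frames(5, png, 8))
print([sid for sid, _, _ in wire])  # [1, 3, 5, 1, 3, 5, 1, 1, 1]

messages = reassemble(wire)
print(messages[3] == css)           # True
```

The short CSS and image responses finish after two rounds, even though the larger HTML response is still being sent on stream 1.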

Header compression, specifically HPACK (HTTP/2 header compression), is crucial. HTTP headers can be quite verbose, especially with cookies. In HTTP/1.1, these headers are sent as plain text with every request, leading to significant overhead. HPACK uses a clever encoding scheme that maintains a table of previously seen headers. Subsequent requests only need to send a small index or a few changed fields, dramatically reducing the amount of data transferred. This is particularly impactful on high-latency networks or for clients with many concurrent requests.

Consider a typical HTTP/1.1 request for a resource. It might look like this:

GET /styles.css HTTP/1.1
Host: example.com
User-Agent: Mozilla/5.0 ...
Accept: text/css,*/*;q=0.1
... (many more headers)

And a subsequent request for an image:

GET /logo.png HTTP/1.1
Host: example.com
User-Agent: Mozilla/5.0 ...
Accept: image/png,image/*;q=0.8
... (many more headers, largely identical to the first)

With HPACK, the second request’s headers would be significantly smaller. The server and client maintain a shared compression context for the lifetime of the connection. If Host: example.com (carried in HTTP/2 as the :authority pseudo-header) and User-Agent were sent in the first request, the second request can refer to each of them with an index of about a byte, and send in full only the fields that are new or changed, such as the path and Accept.
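Here is a deliberately simplified Python model of that shared context. It is not wire-accurate HPACK: there is no static table, no size limit or eviction, and entries are numbered in insertion order (real HPACK numbers the newest entry first). ToyTable and its methods are names invented for this sketch, and the truncated User-Agent value is kept exactly as it appears in the example requests.

```python
class ToyTable:
    """Simplified stand-in for an HPACK dynamic table (no static table, no eviction)."""

    def __init__(self):
        self.entries = []

    def encode(self, headers):
        """Emit an index for fields already seen, otherwise a literal that is remembered."""
        ops = []
        for field in headers:
            if field in self.entries:
                ops.append(("index", self.entries.index(field)))
            else:
                ops.append(("literal", field))
                self.entries.append(field)
        return ops

    def decode(self, ops):
        """Mirror the encoder: resolve indices and remember every literal."""
        headers = []
        for kind, value in ops:
            if kind == "index":
                headers.append(self.entries[value])
            else:
                headers.append(value)
                self.entries.append(value)
        return headers


first = [(":method", "GET"), (":path", "/styles.css"), (":authority", "example.com"),
         ("user-agent", "Mozilla/5.0 ..."), ("accept", "text/css,*/*;q=0.1")]
second = [(":method", "GET"), (":path", "/logo.png"), (":authority", "example.com"),
          ("user-agent", "Mozilla/5.0 ..."), ("accept", "image/png,image/*;q=0.8")]

encoder, decoder = ToyTable(), ToyTable()  # one table on each side of the connection

ops1 = encoder.encode(first)
decoded1 = decoder.decode(ops1)
ops2 = encoder.encode(second)
decoded2 = decoder.decode(ops2)

print([kind for kind, _ in ops1])  # ['literal', 'literal', 'literal', 'literal', 'literal']
print([kind for kind, _ in ops2])  # ['index', 'literal', 'index', 'index', 'literal']
```

In the second request only the path and the Accept value travel as literals. Everything else is a reference into state that both sides built while handling the first request.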

The way HPACK achieves this is by using a combination of static and dynamic tables. The static table is a fixed, predefined list of 61 common header fields (entries such as :method: GET, :method: POST, and :status: 200, plus common names such as accept-encoding and user-agent). The dynamic table starts empty and is built up during the connection from the headers actually sent. Each header can be encoded as:

An index into the static table.
An index into the dynamic table.
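A literal (optionally Huffman-coded, with the name either spelled out or given as an index), which the encoder may add to the dynamic table for later reuse or deliberately keep out of it, for example for sensitive values.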

This allows for extremely efficient encoding, especially when many requests share common header fields.
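For a sense of how small these encodings are on the wire, the following Python sketch implements HPACK’s prefix-integer encoding and two of its representations: an indexed field, and a literal with incremental indexing whose name is taken from the static table. Huffman coding of strings is left out, the function names are invented for the example, and the index values used (2 for :method: GET, 6 for :scheme: http, 4 for :path: /, 1 for the :authority name) come from the static table in the HPACK specification.

```python
def encode_integer(value, prefix_bits, first_byte_flags=0):
    """Encode an integer with an N-bit prefix, as HPACK does for indices and lengths."""
    limit = (1 << prefix_bits) - 1
    if value < limit:
        return bytes([first_byte_flags | value])
    out = [first_byte_flags | limit]
    value -= limit
    while value >= 128:
        out.append((value % 128) + 128)  # low 7 bits, continuation bit set
        value //= 128
    out.append(value)
    return bytes(out)


def indexed_field(index):
    """Indexed header field: leading bit 1, then a 7-bit-prefix table index."""
    return encode_integer(index, 7, 0x80)


def literal_with_indexing(name_index, value):
    """Literal value with an indexed name; the field is added to the dynamic table."""
    raw = value.encode("ascii")
    # Pattern 01 + 6-bit-prefix name index, then length (Huffman bit 0) and raw bytes.
    return encode_integer(name_index, 6, 0x40) + encode_integer(len(raw), 7) + raw


# First request: three static-table hits plus one literal :authority value.
request = (indexed_field(2) + indexed_field(6) + indexed_field(4)
           + literal_with_indexing(1, "www.example.com"))
print(request.hex())  # 828684410f7777772e6578616d706c652e636f6d

# On a later request the same :authority field sits at dynamic index 62.
print(indexed_field(62).hex())  # be
```

The 17-byte literal for :authority shrinks to a single byte once it is in the dynamic table, which is where most of the savings on repeated requests come from.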

The combined effect of multiplexing and header compression is lower latency when fetching many web assets. Instead of establishing multiple TCP connections and sending verbose, repetitive headers on each, HTTP/2 can establish one connection and stream all the necessary data over it. This is why pages often load faster and feel snappier, especially on mobile or on high-latency networks.

What makes HPACK efficient is not only that it sends less data, but that it reuses state. The dynamic table is shared state, not just a cache: for each direction of the connection, the encoder decides what to insert, and the decoder applies the same insertions and evictions in the same order. Both sides therefore stay in sync, and an index of a byte or two can stand for an entire header field.

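The next frontier in HTTP performance is HTTP/3, which runs over QUIC (on top of UDP) instead of TCP. It aims to remove the head-of-line blocking that can still occur at the TCP level, where a single lost packet stalls every stream on the connection.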