Nginx’s limit_req and limit_conn directives, while often discussed together, actually tackle two fundamentally different kinds of overload.
Let’s see limit_req in action. Imagine you have a signup form and you want to prevent brute-force attacks by limiting how often a single IP address can submit it.
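http {
    # Track clients by IP address: 10 MB of shared state, 10 requests/second per IP.
    limit_req_zone $binary_remote_addr zone=mylimit:10m rate=10r/s;

    server {
        listen 80;
        server_name example.com;

        location /signup {
            # Allow up to 20 excess requests per IP, forwarded without delay.
            limit_req zone=mylimit burst=20 nodelay;
            # backend_server must be an upstream block or a resolvable host name.
            proxy_pass http://backend_server;
        }
    }
}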
Here, $binary_remote_addr tells Nginx to track requests by the client’s IP address, stored in compact binary form. zone=mylimit:10m creates a shared memory zone named mylimit, 10 megabytes in size, to hold this tracking data. rate=10r/s sets the sustained limit: Nginx tracks time in milliseconds, so this works out to one request every 100 ms per IP. burst=20 lets up to 20 requests beyond that rate be accepted instead of rejected, and nodelay means those burst requests are forwarded immediately rather than being spaced out to match the rate.
limit_conn is for controlling the number of simultaneous connections from a single IP, not the rate of requests. This helps prevent resource exhaustion when a client opens many connections and keeps them busy, for example with slow downloads or long-running requests. Note that Nginx counts a connection only once it has read the whole request header and has a request in progress on it, so connections that never send a complete header are not counted.
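http {
    # Track concurrent connections per client IP in a 10 MB shared zone.
    limit_conn_zone $binary_remote_addr zone=addr:10m;

    server {
        listen 80;
        server_name example.com;

        location / {
            # At most 10 simultaneous connections per IP.
            limit_conn addr 10;
            proxy_pass http://backend_server;
        }
    }
}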
In this example, limit_conn_zone $binary_remote_addr zone=addr:10m; defines a zone named addr that tracks clients by IP, using 10MB of shared memory. limit_conn addr 10; then applies this zone to the / location, ensuring no single IP can maintain more than 10 simultaneous connections.
The core problem Nginx’s rate limiting solves is preventing a single client (or a coordinated group of clients) from overwhelming your server’s resources. This can happen through accidental misconfiguration, legitimate but high-traffic scenarios, or malicious attacks like DDoS or brute-force attempts. limit_req acts like a traffic cop for request frequency, while limit_conn acts like a bouncer for connection volume.
Internally, limit_req uses the leaky bucket method. Requests enter the bucket as they arrive and drain out at the configured rate, and burst sets how many excess requests the bucket can hold. A request that arrives while there is room is delayed until its turn (or forwarded at once with nodelay); a request that arrives when the bucket is full is rejected. limit_conn is simpler: it just counts active connections for a given key (like an IP address) and rejects new ones if the limit is hit.
The zone is critical. limit_req_zone and limit_conn_zone each define a shared memory zone where Nginx stores the state for every key being limited, and because that memory is shared, all worker processes enforce the same limits. The size of the zone (e.g., 10m) needs to be sufficient to hold the state for all expected unique clients. For limit_req, the nginx documentation puts one megabyte at about 16,000 64-byte states or 8,000 128-byte states, and a $binary_remote_addr state takes 128 bytes on 64-bit platforms (64 bytes on 32-bit). So 10MB holds roughly 80,000 unique clients on a 64-bit server, or about 160,000 at 64 bytes each. When a limit_req zone fills up, Nginx evicts the least recently used state; if it still cannot create a new one, the request is terminated with an error.
When limit_req is used with nodelay, Nginx forwards requests within the burst allowance immediately instead of spacing them out, but it still marks the burst slots as taken and frees them only at the configured rate. Requests beyond the burst are rejected, by default with 503 Service Temporarily Unavailable. This is often preferred for API endpoints, where clients expect a quick response, even an error, rather than waiting and potentially timing out. If you omit nodelay, Nginx queues excess requests up to the burst limit and releases them at the configured rate, which adds latency.
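To make the difference concrete, here is a minimal sketch that applies the same zone with and without nodelay. The zone name mylimit and the backend_server upstream come from the earlier example; the /signup-queued and /signup-nodelay locations are hypothetical and exist only for the comparison. The timelines in the comments follow the behaviour described in the nginx documentation and assume all 25 requests arrive from one IP address at the same instant.

```nginx
http {
    limit_req_zone $binary_remote_addr zone=mylimit:10m rate=10r/s;

    server {
        listen 80;
        server_name example.com;

        # Without nodelay, 25 simultaneous requests from one IP:
        #   request 1       -> forwarded immediately
        #   requests 2-21   -> queued, released one every 100 ms (about 2 s in total)
        #   requests 22-25  -> rejected (503 by default)
        location /signup-queued {
            limit_req zone=mylimit burst=20;
            proxy_pass http://backend_server;
        }

        # With nodelay, the same 25 requests:
        #   requests 1-21   -> forwarded immediately
        #   requests 22-25  -> rejected (503 by default)
        # The 20 burst slots become available again at one per 100 ms.
        location /signup-nodelay {
            limit_req zone=mylimit burst=20 nodelay;
            proxy_pass http://backend_server;
        }
    }
}
```

Both locations share the same per-IP state because they use the same zone, so test one at a time if you want to reproduce these numbers.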
The delay parameter in limit_req (available since Nginx 1.15.7) sits between the default queuing behaviour and nodelay. Its value is a number of requests, not a duration: with burst=20 delay=10, up to 10 excess requests are forwarded without delay, the next 10 are delayed to hold the configured rate, and anything beyond the burst is rejected with a 503. This is useful when a page legitimately triggers a handful of requests at once but you still want sustained excess traffic to be slowed down.
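As an illustration, a two-stage setup might look like the sketch below. The values burst=20 and delay=10 are assumptions chosen to match the 10r/s zone used earlier, not recommendations, and the delay parameter requires Nginx 1.15.7 or later.

```nginx
http {
    limit_req_zone $binary_remote_addr zone=mylimit:10m rate=10r/s;

    server {
        listen 80;
        server_name example.com;

        location / {
            # delay is a request count, not a time in seconds:
            #   up to 10 excess requests -> forwarded without delay
            #   excess requests 11-20    -> delayed to keep to 10r/s
            #   beyond the burst of 20   -> rejected (503 by default)
            limit_req zone=mylimit burst=20 delay=10;
            proxy_pass http://backend_server;
        }
    }
}
```

Setting delay equal to burst behaves like nodelay, and leaving it at its default of zero delays every excess request.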
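A common pitfall is choosing the wrong key for limit_req_zone. With $server_name as the key, every client of a virtual host shares one bucket, which only makes sense when you want a per-site cap rather than a per-client one. $remote_port is not useful because it changes with each connection. Use $binary_remote_addr for IP-based limiting. If Nginx sits behind a proxy or load balancer, $binary_remote_addr is the proxy’s address; keying on $http_x_forwarded_for is tempting, but that header is supplied by the client and can be spoofed or contain several addresses. It is usually safer to let Nginx derive the client address from trusted proxies only (for example with the realip module) and keep $binary_remote_addr as the key.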
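For the behind-a-proxy case, one option is the realip module, sketched below. The 10.0.0.0/8 range is an assumption standing in for wherever your load balancer lives, and the module (ngx_http_realip_module) is not part of a default source build, although many distribution packages include it.

```nginx
http {
    # Trust X-Forwarded-For only when the connection comes from this range.
    # 10.0.0.0/8 is a placeholder for your own proxy or load balancer network.
    set_real_ip_from 10.0.0.0/8;
    real_ip_header X-Forwarded-For;
    # Skip over trusted addresses in the header and use the last untrusted one.
    real_ip_recursive on;

    # After realip has run, $binary_remote_addr holds the client address,
    # so the zone key does not need to change.
    limit_req_zone $binary_remote_addr zone=mylimit:10m rate=10r/s;

    server {
        listen 80;
        server_name example.com;

        location /signup {
            limit_req zone=mylimit burst=20 nodelay;
            proxy_pass http://backend_server;
        }
    }
}
```

Because the header is honoured only for connections from the trusted range, a client connecting directly cannot pick its own rate-limit key by forging X-Forwarded-For.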
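For debugging, the limit_req module has a few logging-related settings. limit_req_log_level sets the error-log level for rejected requests (delayed requests are logged one level lower), limit_req_status changes the response code from the default 503 (429 Too Many Requests is a common choice), and limit_req_dry_run on counts excess requests without actually limiting them. Since version 1.17.6, the $limit_req_status variable (PASSED, DELAYED, REJECTED, and their dry-run variants) can be added to an access log format, so you can see which requests were throttled without sifting through the error log.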
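A minimal sketch of the logging-related settings that the limit_req module provides is shown below. The log format name throttle and the log file path are assumptions for illustration, and $limit_req_status requires Nginx 1.17.6 or later.

```nginx
http {
    limit_req_zone $binary_remote_addr zone=mylimit:10m rate=10r/s;

    # Hypothetical access log format that records the limiter's decision
    # (PASSED, DELAYED, REJECTED, ...) next to the usual request fields.
    log_format throttle '$remote_addr [$time_local] "$request" $status '
                        'limit_req=$limit_req_status';

    server {
        listen 80;
        server_name example.com;
        access_log /var/log/nginx/throttle.log throttle;

        location /signup {
            limit_req zone=mylimit burst=20 nodelay;

            # Answer throttled clients with 429 instead of the default 503.
            limit_req_status 429;
            # Log rejections at warn; delays are logged one level lower (notice).
            limit_req_log_level warn;

            proxy_pass http://backend_server;
        }
    }
}
```

While tuning limits, adding limit_req_dry_run on; (Nginx 1.17.1 or later) to the location lets you watch the would-be rejections in the logs before enforcing them.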
One thing most people don’t realize is that limit_req and limit_conn can be combined effectively. You might want to limit both the rate of requests and the number of simultaneous connections from a single IP. For instance, a client could theoretically open 100 connections and send requests at a rate of 1 per second per connection, staying under a limit_req of 100r/s but still hammering your server. Using both directives prevents this.
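A combined setup might look like the following sketch, which simply reuses the two zones from the earlier examples in one location. The specific numbers (10r/s, burst=20, 10 connections) are carried over from those examples rather than tuned for any workload.

```nginx
http {
    # One zone per limiter, both keyed on the client IP.
    limit_req_zone  $binary_remote_addr zone=mylimit:10m rate=10r/s;
    limit_conn_zone $binary_remote_addr zone=addr:10m;

    server {
        listen 80;
        server_name example.com;

        location / {
            # Cap the request rate per IP ...
            limit_req  zone=mylimit burst=20 nodelay;
            # ... and, independently, the number of concurrent connections per IP.
            limit_conn addr 10;
            proxy_pass http://backend_server;
        }
    }
}
```

The two limits are checked independently, so a request has to satisfy both to reach the backend.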
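The next challenge you’ll face is implementing more sophisticated traffic shaping, such as different limits for different user agents or geographic locations. Much of this can be done with the built-in map and geo directives, which compute the zone key per request (requests with an empty key are not counted); custom Nginx modules or Lua scripting come in when the built-ins are not enough.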