Rate Limiting with Kong: Plugin Config and Testing (2026)

Kong’s rate limiting plugin does more than block requests: it shapes and controls the flow of traffic to protect your services.

Here’s Kong in action, throttling a simulated API.

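# Start a Kong instance in DB-less mode (Docker example).
# The -p flags publish the proxy (8000) and Admin API (8001) ports to the host.
docker run -d \
  --name kong \
  -p 8000:8000 \
  -p 8001:8001 \
  -e "KONG_DATABASE=off" \
  -e "KONG_PROXY_LISTEN=0.0.0.0:8000" \
  -e "KONG_ADMIN_LISTEN=0.0.0.0:8001" \
  kong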

# The rate-limiting plugin ships with Kong, so there is nothing to install.
# In DB-less mode (KONG_DATABASE=off) the Admin API cannot create entities one by one,
# so POST /plugins is rejected. Configuration is loaded declaratively through /config.

# Declare a service, a route, and a rate-limiting plugin scoped to that service.
# 'my-service' is a placeholder name and httpbin.org is only a stand-in upstream.
# error_message needs a recent 3.x release; drop it to get the default message.
cat > kong.yml <<'EOF'
_format_version: "3.0"
services:
  - name: my-service
    url: http://httpbin.org
    routes:
      - name: my-route
        paths:
          - /some-endpoint
    plugins:
      - name: rate-limiting
        config:
          policy: local
          minute: 5
          error_message: Too many requests. Please try again later.
EOF
curl -X POST http://localhost:8001/config -F config=@kong.yml

# Now, let's simulate requests to this route through the proxy on port 8000.
# We'll use a loop to exceed the limit quickly.

echo "Sending requests to exceed the rate limit..."
for i in {1..7}; do
  curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8000/some-endpoint
  sleep 0.5 # Small delay to make it observable
done

In the output, you’ll see five 200 status codes followed by two 429 (Too Many Requests) codes once the limit of 5 requests per minute is reached.

The plugin counts requests in fixed time windows. config.second, config.minute, config.hour, config.day, config.month, and config.year each set the maximum number of requests allowed in that window, and at least one of them must be set; the open-source plugin has no arbitrary window length such as 10 seconds. The local policy keeps the counters in the Kong node’s own memory, without relying on an external data store. When a limit is exceeded, Kong returns a 429 status code with the configured error_message.
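Beyond the status code, the plugin also reports its counters in response headers, which is handy when you’re testing. The helper below is a small sketch for filtering them out of a raw response. ratelimit_headers is a hypothetical name, and the header names in the pattern (RateLimit-*, X-RateLimit-*, Retry-After) are what a default setup is expected to send, so check them against your Kong version.

```shell
# Keep only rate-limit related headers from a raw HTTP response read on stdin.
# Assumed header names: RateLimit-*, X-RateLimit-*, Retry-After.
ratelimit_headers() {
  # Strip carriage returns first so the output lines compare cleanly.
  tr -d '\r' | grep -iE '^((x-)?ratelimit-|retry-after:)'
}

# Against a running gateway (route from the example above):
#   curl -s -i http://localhost:8000/some-endpoint | ratelimit_headers
```

Watching the remaining count drop to zero before the first 429 is a quick way to confirm that the limit you configured is the one being enforced.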

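Most of the flexibility comes from choosing what is counted. config.limit_by selects the counting key, and the plugin’s scope selects which traffic it sees:

Consumer: If you’re using Kong’s authentication (e.g., API keys, OAuth2), limit_by=consumer counts requests per authenticated consumer. This is the default.
IP Address: limit_by=ip counts requests per client IP. It is also the fallback when the chosen key cannot be determined.
HTTP Headers: limit_by=header together with config.header_name counts requests per value of a specific header (e.g., X-API-Key).
HTTP Methods: There is no method key. To limit POST requests differently from GET requests, attach separate plugin instances to routes that match on methods.
URIs: limit_by=path together with config.path counts requests to a given path. You can also attach the plugin to individual routes to give endpoints different limits.
Service/Route: As shown in the example, you can apply the plugin to a whole service or more granularly to a specific route. With neither set, it applies globally.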
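As a sketch of the header case, assuming the open-source plugin’s limit_by and header_name fields, the snippet below writes a declarative-config fragment. The file name, the my-service reference, and the X-Customer-ID header are all hypothetical; the fragment is meant to be merged into a full declarative file rather than loaded on its own.

```shell
# Write a declarative-config fragment that rate-limits per header value.
# Assumptions: a service named my-service is defined elsewhere in the same file,
# and X-Customer-ID is a hypothetical header that identifies the caller.
cat > limit-by-header.yml <<'EOF'
plugins:
  - name: rate-limiting
    service: my-service
    config:
      policy: local
      minute: 100
      limit_by: header
      header_name: X-Customer-ID
EOF
```

Requests that arrive without the header should be counted by client IP instead, so it’s worth testing both cases.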

A single plugin instance can also enforce several windows at once. Set more than one of second, minute, hour, and so on, and Kong checks every window on each request; the request is rejected as soon as any one of them is exhausted. All windows in an instance share the same limit_by key.

{
  "name": "rate-limiting",
  "service": { "id": "my-service-id" },
  "config": {
    "policy": "local",
    "limit_by": "ip",
    "second": 10,
    "minute": 100
  }
}

This configuration applies two limits per client IP: a sustained limit of 100 requests per minute and a burst limit of 10 requests per second. Both are enforced simultaneously. Here my-service-id stands in for the service’s real UUID. The open-source plugin cannot mix keys within one instance, so a per-IP limit and a per-consumer limit cannot be combined this way.

The limit_by field (called identifier in the Enterprise rate-limiting-advanced plugin) tells Kong what to count against the limit. ip uses the client’s IP, consumer uses the authenticated consumer’s ID, and header uses the value of the header named in config.header_name. It takes a single value; composite keys, such as an IP combined with a header, are not supported out of the box.

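The policy setting decides where the counters live. local keeps them in each node’s memory, which is simple and fast, but every node counts independently, so a cluster of N nodes can admit up to N times the limit. The redis policy stores the counters in a shared Redis instance so that all Kong nodes enforce one limit, which is vital for high-availability setups. A third policy, cluster, keeps the counters in Kong’s own database. To use Redis, you’d configure config.policy=redis and provide the connection details in the plugin configuration itself rather than through environment variables (config.redis.host, config.redis.port, and so on in recent 3.x releases; older releases use flat names such as config.redis_host).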
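A Redis-backed setup might look like the fragment below. This is a sketch under assumptions: the nested redis block is the layout expected in recent 3.x releases (older ones take redis_host and redis_port directly under config), and redis.internal, my-service, and the file name are hypothetical.

```shell
# Write a declarative-config fragment that keeps counters in a shared Redis.
# Assumptions: nested redis.* fields (recent Kong 3.x); redis.internal is a
# hypothetical hostname; my-service is defined elsewhere in the same file.
cat > limit-redis.yml <<'EOF'
plugins:
  - name: rate-limiting
    service: my-service
    config:
      policy: redis
      minute: 100
      redis:
        host: redis.internal
        port: 6379
EOF
```

When testing this, stop Redis and see what happens: with the plugin’s fault_tolerant option left at its default, requests are expected to pass through unthrottled rather than fail.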

When you’re testing, remember that the open-source plugin counts in fixed windows aligned to the clock, not in a rolling window. With a limit of 5 requests per minute, the counter resets at the start of each calendar minute. If you make 5 requests at 12:00:58, a sixth at 12:00:59 is rejected, but a request at 12:01:00 is allowed because a new window has begun. The flip side is that a client can send 5 requests at the end of one window and 5 more at the start of the next, so short bursts of up to twice the limit are possible. Sliding windows are available in the Enterprise rate-limiting-advanced plugin.
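To see which window a request lands in under a fixed, clock-aligned scheme (an assumption about how the open-source plugin buckets requests; window_start is a hypothetical helper), the arithmetic can be sketched as:

```shell
# Start of the fixed window that contains a Unix timestamp.
# $1 = timestamp in seconds, $2 = window length in seconds (60 for a minute).
window_start() {
  echo $(( $1 - $1 % $2 ))
}

# Two requests one second apart can fall into different minute windows:
window_start 119 60   # prints 60
window_start 120 60   # prints 120
```

This is why a test loop that happens to straddle a minute boundary can show more 200s than the configured limit.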

The most common pitfall is misunderstanding how the counting key interacts with authentication. For example, if you count per consumer but haven’t enabled an authentication plugin on the service, Kong has no consumer to key on and falls back to the client IP. The limit still applies, just not per consumer, so many users behind one NAT address end up sharing a single budget. Always ensure your authentication mechanism is correctly configured and tied to the service where the rate-limiting plugin is applied.
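A minimal sketch of a per-consumer setup, assuming key-auth as the authentication plugin, could look like this. The consumer alice, the key alice-demo-key, the file name, and the httpbin.org upstream are all hypothetical placeholders.

```shell
# Write a full declarative config with key-auth plus a per-consumer limit.
# Assumptions: alice / alice-demo-key are demo values; httpbin.org is only a
# stand-in upstream; limit_by is the open-source plugin's key selector.
cat > kong-consumer-limit.yml <<'EOF'
_format_version: "3.0"
consumers:
  - username: alice
    keyauth_credentials:
      - key: alice-demo-key
services:
  - name: my-service
    url: http://httpbin.org
    routes:
      - name: my-route
        paths:
          - /some-endpoint
    plugins:
      - name: key-auth
      - name: rate-limiting
        config:
          policy: local
          minute: 5
          limit_by: consumer
EOF

# Load it into a DB-less node, then call the route with the consumer's key:
#   curl -X POST http://localhost:8001/config -F config=@kong-consumer-limit.yml
#   curl -i -H 'apikey: alice-demo-key' http://localhost:8000/some-endpoint
```

With key-auth in place, requests without a valid key are rejected before rate limiting is evaluated, so every counted request has a consumer attached.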

The next hurdle you’ll likely encounter is implementing more sophisticated rate-limiting strategies, such as dynamic limits based on real-time service load or custom logic for throttling specific types of requests.