Rate Limiting in FastAPI: slowapi and Custom Middleware (2026)

FastAPI doesn’t ship with rate limiting, but it’s easy to add, and slowapi is a popular library for doing it.

Let’s see it in action. Imagine you have a FastAPI app and you want to ensure users don’t hit your /items endpoint more than 5 times per minute.

from fastapi import FastAPI, Request
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.util import get_remote_address
from slowapi.errors import RateLimitExceeded

app = FastAPI()

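# Configure the Limiter and attach it to the app;
# slowapi looks it up on app.state.limiter
limiter = Limiter(key_func=get_remote_address)
app.state.limiter = limiter

# Apply the limiter to specific routes.
# slowapi requires the route to accept a `request: Request` argument.
@app.get("/items")
@limiter.limit("5/minute")
async def read_items(request: Request):
    return {"message": "You can access items!"}

# Handle rate limit exceeded errors.
# _rate_limit_exceeded_handler is a regular (non-async) function, so no await.
@app.exception_handler(RateLimitExceeded)
async def rate_limit_exceeded_handler(request: Request, exc: RateLimitExceeded):
    return _rate_limit_exceeded_handler(request, exc)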

# A route without rate limiting for comparison
@app.get("/public")
async def public_route():
    return {"message": "This is a public route."}

When a client makes more than 5 requests to /items within a minute from the same IP address, slowapi raises RateLimitExceeded instead of calling your route function. The exception handler then runs _rate_limit_exceeded_handler, and the client receives a 429 Too Many Requests response. The key_func is crucial here; it tells slowapi who to track limits for. get_remote_address returns the client’s IP address as seen by the application, so each IP gets its own counter.

The core problem slowapi solves is preventing abuse and ensuring fair usage of your API resources. Without it, a single user or bot could overwhelm your server, impacting performance for everyone else. slowapi provides a flexible way to define these limits.

Internally, slowapi uses a storage backend to keep track of request counts. By default, it uses an in-memory store, which is fast but not persistent across application restarts. For production, you’ll want to configure it to use something like Redis.
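To make that bookkeeping concrete, here is a minimal sketch of what an in-memory counter conceptually does. It is not slowapi’s actual code (slowapi delegates storage to the limits library); the FixedWindowCounter class and its clock parameter are made up for illustration, and it assumes a simple fixed-window strategy.

```python
import time
from collections import defaultdict
from typing import Callable, Dict, Tuple


class FixedWindowCounter:
    """Toy in-memory fixed-window counter (illustrative, not slowapi's code)."""

    def __init__(self, limit: int, window_seconds: int,
                 clock: Callable[[], float] = time.time) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behaviour can be tested
        # Maps (key, window index) -> number of hits seen in that window.
        # Old windows are never cleaned up here; a real store expires them.
        self._hits: Dict[Tuple[str, int], int] = defaultdict(int)

    def hit(self, key: str) -> bool:
        """Record one request for key; return True if it is within the limit."""
        window = int(self.clock() // self.window_seconds)
        bucket = (key, window)
        if self._hits[bucket] >= self.limit:
            return False  # over the limit: the caller would answer with 429
        self._hits[bucket] += 1
        return True


counter = FixedWindowCounter(limit=5, window_seconds=60)
allowed = [counter.hit("203.0.113.7") for _ in range(6)]
print(allowed)
```

Because the dictionary lives inside one Python process, the counts vanish on restart and are not shared between workers, which is exactly the gap an external store such as Redis fills.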

Here’s how you’d configure Redis:

from slowapi import Limiter
from slowapi.util import get_remote_address

# Configure the Limiter with Redis storage.
# slowapi hands storage_uri to the limits library, which opens the
# Redis connection itself (the redis package must be installed).
limiter = Limiter(
    key_func=get_remote_address,
    storage_uri="redis://localhost:6379/0",
)

The storage_uri tells slowapi where to find your Redis instance. If you omit it, it defaults to memory://. Using Redis is essential for distributed systems or when you need your rate limits to survive application restarts.

You can define limits in various granularities:

Global limits: Apply to all routes, set through the Limiter’s default_limits argument.
Route-specific limits: As shown with @limiter.limit("5/minute").
Per-user limits: If you have authentication, you can use a key_func that returns the user’s ID (for example request.user.id) instead of the IP address.
Other time windows: Limit strings aren’t restricted to per-minute counts; 100/hour, 1/day, or 30/10 minutes (30 requests per 10 minutes) are all valid.
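As a rough sketch of the per-user idea, a key function can prefer a user ID and fall back to the IP address. The name user_or_ip_key is made up here, and it assumes your authentication layer has stored the ID on request.state.user_id (a hypothetical attribute; adapt it to wherever your app keeps the user). The SimpleNamespace objects are only stand-ins for real requests.

```python
from types import SimpleNamespace


def user_or_ip_key(request) -> str:
    """Return a per-user key when a user id is known, else a per-IP key."""
    # Assumption: an auth dependency or middleware has stored the id on
    # request.state.user_id; that attribute name is hypothetical.
    user_id = getattr(request.state, "user_id", None)
    if user_id is not None:
        return f"user:{user_id}"
    # request.client can be None (for example in some test setups).
    client = request.client
    return f"ip:{client.host}" if client else "ip:unknown"


# Stand-ins for real requests, just to show the keys produced.
logged_in = SimpleNamespace(state=SimpleNamespace(user_id=42),
                            client=SimpleNamespace(host="203.0.113.7"))
anonymous = SimpleNamespace(state=SimpleNamespace(),
                            client=SimpleNamespace(host="203.0.113.7"))
print(user_or_ip_key(logged_in))   # user:42
print(user_or_ip_key(anonymous))   # ip:203.0.113.7
```

You would pass it in place of get_remote_address, as in Limiter(key_func=user_or_ip_key). Prefixing the key with user: or ip: keeps a numeric user ID from ever colliding with an address.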

One thing that catches people out is how easily the default get_remote_address logic breaks when you’re behind a proxy or load balancer. get_remote_address uses the address of the connection’s peer, so unless your server is configured to trust and apply forwarded headers such as X-Forwarded-For, slowapi sees the IP of your proxy, not the actual client. All your users then appear to come from a single IP and share the same rate limit.

To handle this, you’d often customize the key_func:

from fastapi import FastAPI, Request
from slowapi import Limiter
from slowapi.util import get_remote_address

app = FastAPI()

def get_real_remote_address(request: Request) -> str:
    # Use the first entry of X-Forwarded-For, which is the original
    # client when a trusted proxy sets the header.
    # This is a simplified example: a client that reaches the app
    # directly can forge this header, so robust solutions involve
    # configuring trusted proxies.
    forwarded_for = request.headers.get("X-Forwarded-For")
    if forwarded_for:
        return forwarded_for.split(",")[0].strip()
    return get_remote_address(request)

limiter = Limiter(key_func=get_real_remote_address)
app.state.limiter = limiter

# slowapi requires the route to accept a `request: Request` argument.
@app.get("/secure")
@limiter.limit("10/minute")
async def secure_endpoint(request: Request):
    return {"message": "Secure data"}

This get_real_remote_address function extracts the original client IP from the first entry of the X-Forwarded-For header. Because any client can send that header itself, a real deployment should trust it only when the request arrives from a known proxy, typically by configuring your web server or load balancer to set it and by accepting it only from trusted upstream IPs.
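One way to sketch the “trusted proxies” idea with only the standard library is shown below. The function name client_ip_from_forwarded and the example network ranges are assumptions for illustration, and it assumes each of your proxies appends the address it received the request from to X-Forwarded-For.

```python
import ipaddress
from typing import Iterable, Optional


def client_ip_from_forwarded(peer_ip: str, forwarded_for: Optional[str],
                             trusted_proxies: Iterable[str]) -> str:
    """Pick the client IP, honouring X-Forwarded-For only via trusted proxies."""
    networks = [ipaddress.ip_network(net) for net in trusted_proxies]

    def is_trusted(ip: str) -> bool:
        try:
            addr = ipaddress.ip_address(ip)
        except ValueError:
            return False  # malformed entry: never treat it as a proxy
        return any(addr in net for net in networks)

    # If the direct peer is not one of our proxies, ignore the header.
    if not forwarded_for or not is_trusted(peer_ip):
        return peer_ip

    # Walk the chain right to left: the rightmost entries were appended by
    # our own proxies; the first untrusted one is the client they saw.
    hops = [hop.strip() for hop in forwarded_for.split(",") if hop.strip()]
    for hop in reversed(hops):
        if not is_trusted(hop):
            return hop
    # Every hop was a trusted proxy; fall back to the leftmost entry.
    return hops[0] if hops else peer_ip


# Request relayed by a proxy in the (assumed) trusted 10.0.0.0/8 range.
print(client_ip_from_forwarded("10.0.0.5", "203.0.113.7, 10.0.0.9", ["10.0.0.0/8"]))
```

Inside a key_func you could call it with request.client.host and request.headers.get("X-Forwarded-For"). Reading the chain from the right rather than taking the first entry means a forged value placed at the front of the header by the client is ignored.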

The next step is often integrating custom logic into your rate limiting, perhaps allowing certain authenticated users higher limits or applying different rules based on the request payload.