Postgres doesn’t just "run out" of connections; it actively refuses new ones when max_connections is hit, leading to a hard stop for your application.
Let’s see this in action. Imagine your application is trying to connect, and you see this in your Postgres logs:
FATAL: sorry, too many clients already
This means the connection limit has been reached. The immediate problem is that no new users or services can talk to your database. The deeper problem is that your current load exceeds the capacity you’ve allocated, or your application is holding onto connections it no longer needs.
Why Are We Hitting the Limit?
This isn’t usually a single-cause issue. It’s a symptom of how your application interacts with Postgres under load.
-
Application Connection Pooling Misconfiguration:
- Diagnosis: Check your application’s connection pool settings. Look for parameters like
max_pool_size,connection_limit, orpool_size. If your application framework (e.g., Spring Boot with HikariCP, Django with PgBouncer) is configured to open more connections than your Postgresmax_connectionsallows, or if it’s not releasing them properly, you’ll hit the limit. - Fix: Reduce the
max_pool_sizein your application’s configuration. For example, if your application has 10 instances and each is set tomax_pool_size = 20, that’s 200 potential connections. If yourmax_connectionsinpostgresql.confis set to 150, you’ll run into trouble. Try reducingmax_pool_sizeto15per instance, for a total of 150. - Why it works: The connection pool manages a set of active connections. By limiting the pool size, you ensure the application never requests more connections than Postgres can handle.
- Diagnosis: Check your application’s connection pool settings. Look for parameters like
-
Too Many Application Instances/Workers:
- Diagnosis: Are you running more copies of your application than you intended? Each instance, even if its pool size is reasonable, contributes to the total connection count. Check your deployment orchestration (Kubernetes, Docker Swarm, EC2 Auto Scaling Groups) for the number of running application pods or instances.
- Fix: Scale down the number of application instances. If you have 50 application instances and each is configured with a pool of 5 connections, you’re already at 250 connections before considering idle connections or other clients. Reduce the number of instances to, say, 30.
- Why it works: Fewer application instances mean fewer distinct processes attempting to establish connections to the database, directly lowering the total concurrent connection count.
-
Idle Connections Not Being Closed:
- Diagnosis: Applications are supposed to return connections to the pool when they’re done with them. If there’s a bug or a long-running process that holds a connection open for too long without using it, these "idle" connections still count against
max_connections. Usepg_stat_activityto look for connections in anidleoridle in transactionstate that have been open for a long time.SELECT pid, datname, usename, client_addr, state, backend_start, state_change FROM pg_stat_activity WHERE state = 'idle' AND now() - state_change > interval '5 minutes' ORDER BY state_change; - Fix: Implement connection timeouts in your application’s connection pool. For example, in HikariCP, set
idleTimeoutto60000(1 minute). Foridle in transactionstates, ensure your application code explicitly commits or rolls back transactions promptly. - Why it works: By setting an
idleTimeout, the connection pool will automatically close connections that haven’t been used for a specified period, freeing them up for new requests. Prompt transaction completion prevents holding connections indefinitely.
- Diagnosis: Applications are supposed to return connections to the pool when they’re done with them. If there’s a bug or a long-running process that holds a connection open for too long without using it, these "idle" connections still count against
-
Insufficient
max_connectionsSetting:- Diagnosis: This is the most direct cause. You’ve genuinely outgrown your configured limit. Check your
postgresql.conffile for themax_connectionsparameter.SHOW max_connections; - Fix: Increase
max_connectionsinpostgresql.conf. A common starting point is to set it to100or200for small to medium instances, but this needs to be balanced. You’ll need to restart Postgres for this change to take effect.# postgresql.conf max_connections = 200 - Why it works: You’re literally telling Postgres it’s allowed to accept more simultaneous client connections. However, be aware that each connection consumes memory.
- Diagnosis: This is the most direct cause. You’ve genuinely outgrown your configured limit. Check your
-
Underlying System Resource Constraints (RAM):
- Diagnosis:
max_connectionsis directly tied to available RAM. Each connection requires memory for its backend process, buffers, and query execution. If you increasemax_connectionswithout enough RAM, your server will start swapping, leading to severe performance degradation or even OOM (Out Of Memory) killer events. Checkfree -horhtopon your database server. - Fix: Scale up your database server’s RAM or reduce
max_connectionsto a value that fits within your current RAM. A rough guideline is that each connection might consume 1-2MB of RAM just for the process overhead, plus shared buffers and work_mem. If you have 8GB RAM andmax_connections = 500, you’re likely in trouble if other memory is being used. - Why it works: More RAM allows Postgres to allocate the necessary memory for each new connection without impacting overall system stability or performance.
- Diagnosis:
-
Connection Leaks in Background Jobs/Scripts:
- Diagnosis: Sometimes, long-running background jobs, scheduled scripts, or maintenance tasks might open a connection and forget to close it. These can be harder to spot than application connections. Use
pg_stat_activityto look for connections from unexpected hosts or users, or those in states likeactivebut with very oldbackend_starttimes. - Fix: Review the code for all scripts and background jobs that interact with Postgres. Ensure they use
try...finallyblocks or context managers to guarantee connections are closed, or use a robust connection pooler like PgBouncer for these tasks as well. - Why it works: Explicitly closing connections or using a pooler ensures that these auxiliary processes don’t permanently consume connection slots.
- Diagnosis: Sometimes, long-running background jobs, scheduled scripts, or maintenance tasks might open a connection and forget to close it. These can be harder to spot than application connections. Use
Once you’ve addressed the "too many clients" error, your next immediate challenge will likely be canceling statement due to conflict with recovery if you’re in a replication scenario.