The RDS database instance failed to accept new connections because the max_connections limit was reached, preventing the application from performing its work.

This usually happens for one of two reasons: the application opens connections without closing them, or the max_connections setting is too low for the workload.

Cause 1: Application Leaking Connections

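  • Diagnosis: Check your application logs for connection-limit errors: MySQL reports "Too many connections", while PostgreSQL reports "FATAL: sorry, too many clients already". Use a monitoring tool like Datadog, New Relic, or CloudWatch to visualize the DatabaseConnections metric. If the connection count climbs steadily toward the max_connections limit and does not fall back when traffic subsides, it’s a strong indicator of a leak.
    # Example using the AWS CLI to read the per-minute peak connection count from CloudWatch
    # (the instance name and the time window are placeholders)
    aws cloudwatch get-metric-statistics --namespace AWS/RDS --metric-name DatabaseConnections --dimensions Name=DBInstanceIdentifier,Value=your-db-instance-name --start-time 2024-01-01T00:00:00Z --end-time 2024-01-01T01:00:00Z --period 60 --statistics Maximum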
    
  • Fix: Review your application’s connection pooling configuration. Ensure connections are closed or returned to the pool as soon as they are no longer needed. For example, in Java with HikariCP, call connection.close() in a finally block or use try-with-resources; on a pooled connection, close() returns it to the pool. In Python with psycopg2, close cursors (cur.close()) and connections (conn.close()) explicitly: using a psycopg2 connection as a context manager only ends the transaction and leaves the connection open.
  • Why it works: By explicitly closing or returning connections to the pool, you reduce the number of actively held connections, preventing the max_connections limit from being hit due to unreleased resources.
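
As an illustration of the fix for Cause 1, here is a minimal Python sketch of a query helper that always releases its cursor and connection, even when the query fails. It assumes a DB-API style driver. The standard-library sqlite3 module stands in for psycopg2.connect so the example runs without a database server, and fetch_one is a hypothetical helper name.

```python
import sqlite3
from contextlib import closing


def fetch_one(connect, sql):
    """Run one query and always close the cursor and the connection.

    `connect` is any zero-argument callable returning a DB-API connection
    (hypothetical parameter; with psycopg2 it would wrap psycopg2.connect).
    """
    # closing() calls .close() on exit, including when execute() raises.
    with closing(connect()) as conn:
        with closing(conn.cursor()) as cur:
            cur.execute(sql)
            return cur.fetchone()


# sqlite3 is only a stand-in so the sketch is runnable as-is.
row = fetch_one(lambda: sqlite3.connect(":memory:"), "SELECT 1")
print(row)  # (1,)
```

With a connection pool, the same shape applies: the "close" step hands the connection back to the pool instead of ending the session.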

Cause 2: Insufficient max_connections Setting

  • Diagnosis: Analyze the DatabaseConnections metric in CloudWatch. If the connection count sits close to the max_connections limit during normal operation and falls again when load drops (so there is no obvious leak), the limit itself is likely too low for your workload. You can also check the max_connections parameter directly:
    -- Connect to your RDS instance using psql, mysql client, etc.
    SHOW max_connections;
    
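  • Fix: Increase the max_connections parameter in your RDS instance’s parameter group. The right value depends on your instance class and workload: RDS derives the default from the instance’s memory, and every connection consumes memory on the server, so a limit far beyond what the instance can support trades connection errors for memory pressure. A common starting point for larger instances (e.g., db.r5.large or higher) is between 200 and 500, but very busy systems on bigger instance classes can go much higher.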
    1. Go to RDS console -> Parameter groups.
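    2. Select your DB instance’s custom parameter group. Default parameter groups cannot be edited, so if the instance uses one, create a new group and associate it with the instance.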
    3. Click "Edit parameters".
    4. Search for max_connections.
    5. Change the value (e.g., from 100 to 300).
    6. Save changes.
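    7. Important: On RDS for PostgreSQL, max_connections is a static parameter, so you must reboot your DB instance for the change to take effect. On RDS for MySQL and MariaDB it is dynamic and applies without a reboot.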
  • Why it works: Increasing the max_connections parameter allows more simultaneous connections to be established to the database, accommodating your application’s normal demand.

Cause 3: High Number of Idle Connections

  • Diagnosis: Even if your application closes connections in code, its pool might keep them open in an idle state for extended periods. The CloudWatch DatabaseConnections metric reports a single count and does not separate active from idle sessions, so use a monitoring tool that shows the breakdown or query the database directly:
    -- For PostgreSQL
    SELECT count(*) FROM pg_stat_activity WHERE state = 'idle';
    
    -- For MySQL
    SHOW PROCESSLIST; -- Look for 'Sleep' state
    SELECT count(*) FROM information_schema.processlist WHERE command = 'Sleep';
    
  • Fix: Configure your application’s connection pool with a lower idle timeout. For example, in HikariCP, set idleTimeout (e.g., to 30000 ms, or 30 seconds); note that it only applies when minimumIdle is lower than maximumPoolSize. For other pools, consult their documentation. Also consider a server-side timeout as a safety net: wait_timeout or interactive_timeout on MySQL, or idle_in_transaction_session_timeout (and idle_session_timeout on PostgreSQL 14 and later) on PostgreSQL.
  • Why it works: By reducing the time connections can remain idle in the pool or on the database, you free up connection slots that are no longer actively being used, making them available for new requests.
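
To make the idle-timeout idea concrete, here is a simplified sketch of the eviction decision a pool makes. It is not how any particular pool is implemented: PooledConnection and split_idle are hypothetical names, and real pools such as HikariCP also keep a configurable minimum number of idle connections, which this sketch ignores.

```python
from dataclasses import dataclass


@dataclass
class PooledConnection:
    name: str
    last_used: float  # timestamp (seconds) when the connection was last returned


def split_idle(pool, now, idle_timeout_s):
    """Return (kept, evicted); connections idle longer than the timeout are evicted."""
    kept, evicted = [], []
    for conn in pool:
        idle_for = now - conn.last_used
        (evicted if idle_for > idle_timeout_s else kept).append(conn)
    return kept, evicted


now = 1000.0
pool = [
    PooledConnection("a", last_used=now - 5),    # idle for 5 s
    PooledConnection("b", last_used=now - 45),   # idle for 45 s
    PooledConnection("c", last_used=now - 120),  # idle for 120 s
]
kept, evicted = split_idle(pool, now, idle_timeout_s=30)
print([c.name for c in kept])     # ['a']
print([c.name for c in evicted])  # ['b', 'c']
```

Each evicted connection frees one slot against max_connections once the pool closes it.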

Cause 4: Too Many Application Instances/Threads

  • Diagnosis: If you’ve recently scaled up your application tier (e.g., added more EC2 instances, increased the number of containers, or increased thread counts per instance) without adjusting database connection limits or pooling, you might be overwhelming the database. Correlate spikes in the DatabaseConnections metric with application scaling events.
  • Fix: Adjust your application’s connection pool size. All threads within an application instance should share a single pool rather than opening their own connections. Ensure that the total across all instances at peak load (number of instances × maximumPoolSize) stays below max_connections, leaving a few slots for administrative and monitoring sessions. You might need to reduce maximumPoolSize in your connection pool configuration, or scale down the application tier if the database is the bottleneck.
  • Why it works: By controlling the number of connections each application instance can open, and ensuring the sum of these potential connections across all instances is less than max_connections, you prevent the database from being swamped by too many clients.
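
The sizing rule for Cause 4 is simple arithmetic, sketched below. The function names and the default of 10 reserved slots are assumptions for illustration, not RDS settings; pick a reserve that matches how many admin and monitoring sessions you actually run.

```python
def total_peak_connections(app_instances, pool_size):
    """Worst case: every instance has its whole pool checked out."""
    return app_instances * pool_size


def max_pool_size(max_connections, app_instances, reserved=10):
    """Largest per-instance pool size that keeps the total under the limit.

    `reserved` (assumed value) holds back slots for admin/monitoring sessions.
    """
    if app_instances < 1:
        raise ValueError("app_instances must be at least 1")
    usable = max_connections - reserved
    if usable < app_instances:
        raise ValueError("limit is too low to give every instance a connection")
    return usable // app_instances


limit = 300
print(total_peak_connections(12, 30))    # 360, which exceeds the limit of 300
size = max_pool_size(limit, 12)
print(size)                              # 24
print(total_peak_connections(12, size))  # 288, which fits under 300 - 10
```

Re-run the calculation whenever autoscaling limits change, since the worst case is set by the maximum number of instances, not the current number.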

Cause 5: Database Parameter Group Not Applied/Rebooted

  • Diagnosis: You’ve modified max_connections in a parameter group, but the connections metric hasn’t changed, and the SHOW max_connections; command returns the old value.
  • Fix: Ensure the correct parameter group is associated with your RDS instance, then check the group’s apply status in the RDS console. If the status is pending-reboot, reboot the instance so the change takes effect. This step is often overlooked.
  • Why it works: RDS applies dynamic parameters immediately, but static parameters only take effect after a reboot reinitializes the database process. max_connections is static on RDS for PostgreSQL, and a newly associated parameter group needs a reboot before its values apply.
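
The check for Cause 5 can be scripted. The sketch below assumes you have saved the JSON output of aws rds describe-db-instances --db-instance-identifier your-db-instance-name; it reads the DBParameterGroups entries and lists any group whose ParameterApplyStatus is not in-sync. The function name and the sample payload are made up for illustration.

```python
import json


def pending_parameter_groups(describe_output):
    """List parameter groups whose changes are not yet applied to the instance."""
    data = json.loads(describe_output)
    pending = []
    for instance in data["DBInstances"]:
        for group in instance.get("DBParameterGroups", []):
            # "in-sync" means applied; "pending-reboot" means a reboot is needed.
            if group["ParameterApplyStatus"] != "in-sync":
                pending.append(group["DBParameterGroupName"])
    return pending


# Illustrative sample, trimmed to the fields the function reads.
sample = json.dumps({
    "DBInstances": [{
        "DBInstanceIdentifier": "your-db-instance-name",
        "DBParameterGroups": [{
            "DBParameterGroupName": "custom-pg",
            "ParameterApplyStatus": "pending-reboot",
        }],
    }]
})
print(pending_parameter_groups(sample))  # ['custom-pg']
```

An empty list means the instance is already running with the values in its parameter group, so the cause lies elsewhere.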

Cause 6: Monitoring Lag or Incorrect Metric Interpretation

  • Diagnosis: You’re seeing connection errors, but the DatabaseConnections metric in CloudWatch shows a much lower number than your max_connections limit. This can happen if metric reporting lags behind, or if you’re looking at the Average statistic over a long period instead of peak values.
  • Fix: Set the period on your CloudWatch graph to 1 minute, the finest granularity at which RDS publishes this metric, and examine the Maximum statistic for DatabaseConnections over short windows (e.g., 5-15 minutes) to catch brief spikes. If you run Aurora, also check the connection metrics for each instance in the cluster. For near real-time counts, query pg_stat_activity or information_schema.processlist directly.
  • Why it works: By using higher resolution and focusing on peak values, you can accurately identify connection bursts that might be missed by default or averaged metrics, allowing you to pinpoint the exact moment the limit was hit.
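
A small worked example shows why the statistic matters. The numbers below are invented: fifteen one-minute samples with a single spike that reaches an assumed limit of 100. The summarize helper is a hypothetical name.

```python
def summarize(samples, limit):
    """Compare the average and the peak of per-minute connection counts."""
    average = sum(samples) / len(samples)
    peak = max(samples)
    return {"average": average, "peak": peak, "hit_limit": peak >= limit}


# 14 quiet minutes at 40 connections, then one minute at the limit.
samples = [40] * 14 + [100]
result = summarize(samples, limit=100)
print(result["average"])    # 44.0
print(result["peak"])       # 100
print(result["hit_limit"])  # True
```

Averaged over the 15-minute window, the instance appears to be at 44% of its limit, while the Maximum statistic shows it was saturated for at least one sample.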

After raising max_connections, the next problem you’re likely to hit is slow queries, because the database can now be overloaded by too many active (not just open) queries.
