Monitor RDS Freeable Memory: Alert Before OOM Kills (2026)

RDS Freeable Memory is the primary indicator that your database instance is running low on RAM, and if it drops to zero, the operating system will start killing processes (Out of Memory Killer, or OOM Killer) to reclaim memory, which can lead to your database crashing.

Diagnosing Low Freeable Memory

When you see alerts about low freeable memory, the first thing to check is the actual memory usage on the instance.

Check RDS Metrics:
- Command: Use the AWS CLI or the RDS console to view the FreeableMemory metric.
- Diagnosis: A consistently low value (e.g., below 100MB) indicates an impending memory exhaustion.
Connect to the Instance (if possible) and Check OS Memory:
- Command: For MySQL/MariaDB, use SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_free_percent'; or SHOW GLOBAL VARIABLES LIKE 'innodb_buffer_pool_size';. For PostgreSQL, use SELECT pg_postmaster_pid(); to get the PID and then sudo ps -p <PID> -o rss,vsz on the EC2 instance (if accessible) or check OS-level tools like free -m or top.
- Diagnosis: Compare OS-level free memory with RDS metrics. Often, RDS metrics are more readily available and accurate for the RDS environment, but OS-level checks can reveal if the underlying OS itself is starved.
Examine Database-Specific Memory Consumers:
- MySQL/MariaDB:
  - Command: SHOW GLOBAL VARIABLES LIKE '%buffer_pool%'; and SHOW GLOBAL VARIABLES LIKE '%sort_buffer%'; and SHOW GLOBAL VARIABLES LIKE '%join_buffer%';.
  - Diagnosis: Large values for innodb_buffer_pool_size are good, but if they are too large, they can starve other processes. Per-session buffers like sort_buffer_size and join_buffer_size can also consume significant memory if many connections are active.
- PostgreSQL:
  - Command: SHOW shared_buffers;, SHOW work_mem;, SHOW maintenance_work_mem;.
  - Diagnosis: shared_buffers is the primary cache. work_mem is used for sorting and hashing per operation, and maintenance_work_mem for VACUUM, CREATE INDEX, etc. High concurrency with large work_mem can lead to rapid memory depletion.

Common Causes and Fixes

Insufficient Instance Memory: The instance is simply too small for the workload.
- Diagnosis: If all other parameters are reasonably sized and you’re still hitting limits, the instance class is the bottleneck.
- Fix: Modify the RDS instance to a larger class (e.g., db.r5.large to db.r5.xlarge). This is a straightforward scaling operation.
- Why it works: A larger instance class provides more RAM, giving the database more headroom before hitting OS limits.
Large innodb_buffer_pool_size (MySQL/MariaDB): While crucial for performance, if set too high relative to available RAM, it can starve the OS.
- Diagnosis: Check innodb_buffer_pool_size. A common recommendation is 70-80% of instance RAM, but this can be too aggressive on smaller instances or those running other services.
- Fix: Reduce innodb_buffer_pool_size. For example, if you have a db.r5.large (30.5 GiB RAM) and it’s set to 24GB, try reducing it to 18GB (e.g., 19200M). Apply this via RDS Parameter Groups.
- Why it works: Less memory allocated to the InnoDB buffer pool leaves more for the OS and other database processes, preventing immediate OOM conditions.
High work_mem and High Concurrency (PostgreSQL): Each active query that requires sorting or hashing can consume work_mem, leading to rapid memory exhaustion with many concurrent queries.
- Diagnosis: Monitor active connections and observe queries that might be doing large sorts. pg_stat_activity can show running queries.
- Fix: Reduce work_mem. For instance, if work_mem is set to 32MB and you have 100 concurrent connections, that’s potentially 3.2GB just for work_mem operations. Try reducing it to 8MB or 16MB. Apply via RDS Parameter Groups.
- Why it works: Lowering work_mem reduces the memory footprint of individual query operations, allowing more concurrent operations to run without exceeding total available RAM.
Excessive Per-Connection Buffers (MySQL/MariaDB): Settings like sort_buffer_size, join_buffer_size, read_buffer_size, read_rnd_buffer_size are allocated per connection. High connection counts with large values here are a memory killer.
- Diagnosis: Check these variable values and compare them to the number of active connections (SHOW GLOBAL STATUS LIKE 'Threads_connected';).
- Fix: Reduce these per-connection buffer sizes. For example, if sort_buffer_size is 2M and you have 200 connections, that’s 400MB just for sorts. Consider reducing it to 256K or 512K. Apply via RDS Parameter Groups.
- Why it works: Decreasing per-connection allocations frees up significant memory when the number of connections is high, as the total memory consumed by these buffers is a multiplier of the connection count.
Memory Leaks or Inefficient Queries: Custom application code or poorly optimized queries can lead to sustained high memory usage.
- Diagnosis: Use database performance monitoring tools (e.g., RDS Performance Insights) to identify slow queries or processes consuming high amounts of memory. Look for long-running transactions or queries that perform full table scans on large tables.
- Fix: Optimize queries (add indexes, rewrite logic) or address application-level memory leaks. This is application-dependent.
- Why it works: Efficient queries and code use less memory and CPU, reducing the overall demand on the instance’s resources.
Unused Database Connections: A large number of idle connections can still consume memory due to per-connection overhead.
- Diagnosis: Monitor Threads_connected and Threads_running in MySQL/MariaDB, or pg_stat_activity in PostgreSQL. High Threads_connected with low Threads_running indicates idle connections.
- Fix: Tune max_connections in your parameter group and/or implement connection pooling in your application. For MySQL/MariaDB, reducing max_connections from a high default (e.g., 151) to a more realistic value (e.g., 50-100) can save considerable memory.
- Why it works: Limiting the maximum number of connections directly caps the memory overhead associated with each connection.

The Next Error

Once you’ve resolved your Freeable Memory issues, you’ll likely encounter an error related to disk I/O or network throughput if your database workload has also been bottlenecked by those resources, or a performance degradation on specific queries that were previously being held back by memory constraints.