Choosing the right RDS instance class is less about picking the biggest option and more about matching the instance’s capabilities to your workload’s demands. Avoiding over-provisioning often yields significant cost savings, while avoiding under-provisioning protects performance.

Let’s see this in action. Imagine a busy e-commerce platform running on a db.r5.xlarge RDS instance for its PostgreSQL database.

-- Example: Monitoring active connections and query execution time
SELECT
    pg_stat_activity.datname,
    pg_stat_activity.usename,
    pg_stat_activity.application_name,
    pg_stat_activity.client_addr,
    pg_stat_activity.state,
    NOW() - pg_stat_activity.query_start AS query_duration,
    pg_stat_activity.query
FROM
    pg_stat_activity
WHERE
    pg_stat_activity.state != 'idle'
ORDER BY
    query_duration DESC
LIMIT 10;

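-- Example: Monitoring CPU utilization
-- Note: PostgreSQL has no built-in view exposing host CPU metrics. This query assumes a
-- hypothetical table, pg_stat_monitor_metrics, populated by an external collector.
-- On RDS, host CPU figures come from the CloudWatch CPUUtilization metric or Enhanced Monitoring.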
SELECT
    date_trunc('minute', ts) AS datetime,
    cpu_user,
    cpu_system,
    cpu_idle,
    cpu_iowait
FROM
    pg_stat_monitor_metrics
WHERE
    ts >= NOW() - INTERVAL '1 hour'
ORDER BY
    ts DESC;

This setup might be handling 200 concurrent connections, with average query times around 50 ms and peak CPU usage of 65%. The db.r5.xlarge has 4 vCPUs and 32 GiB of RAM.

Right-sizing addresses the common pitfall of over-provisioning. Teams often start with a larger instance class than they need because they fear performance bottlenecks. The result is paying for unused CPU, memory, and network bandwidth, which directly inflates cloud costs. Conversely, under-provisioning leads to slow queries, timeouts, and a poor user experience. Right-sizing aims to find the sweet spot between the two.
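
To put a number on over-provisioning, the arithmetic can be sketched as below. This is a minimal illustration, not a pricing tool: the hourly rates are hypothetical placeholders, the helper names are invented for this example, and 730 hours per month is a common approximation.

```python
HOURS_PER_MONTH = 730  # approximation: 8760 hours per year / 12 months


def monthly_cost(hourly_rate_usd: float, hours: int = HOURS_PER_MONTH) -> float:
    """Cost of running one instance for a month at the given hourly rate."""
    return hourly_rate_usd * hours


def monthly_savings(current_rate_usd: float, candidate_rate_usd: float) -> float:
    """Positive when the candidate class is cheaper than the current one."""
    return monthly_cost(current_rate_usd) - monthly_cost(candidate_rate_usd)


# Hypothetical rates for illustration only; look up real prices
# for your region, engine, and deployment type.
current_rate = 0.50
candidate_rate = 0.25
print(f"Estimated monthly savings: ${monthly_savings(current_rate, candidate_rate):.2f}")
```

The same calculation, repeated across every database in a fleet, is usually what makes the case for a right-sizing review.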

Internally, an RDS instance class dictates the underlying hardware resources: vCPUs, memory, and network bandwidth. Different families (e.g., r for memory-optimized, m for general-purpose, c for compute-optimized) are tuned for specific workloads. The r5 family, for instance, is memory-optimized, with 8 GiB of RAM per vCPU, which suits databases whose working set benefits from a large cache. The m5 family offers a more even balance at 4 GiB per vCPU, making it suitable for many general-purpose databases.
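
The difference between families is easiest to see as a memory-to-vCPU ratio. The sketch below hard-codes the specs of four classes as commonly published by AWS (verify against the current documentation before relying on them); the `InstanceSpec` type and `gib_per_vcpu` helper are names made up for this example.

```python
from typing import NamedTuple


class InstanceSpec(NamedTuple):
    vcpus: int
    memory_gib: int


# Specs as commonly published for these classes; confirm against current AWS docs.
SPECS = {
    "db.m5.xlarge": InstanceSpec(vcpus=4, memory_gib=16),
    "db.m5.2xlarge": InstanceSpec(vcpus=8, memory_gib=32),
    "db.r5.large": InstanceSpec(vcpus=2, memory_gib=16),
    "db.r5.xlarge": InstanceSpec(vcpus=4, memory_gib=32),
}


def gib_per_vcpu(instance_class: str) -> float:
    """Memory per vCPU, the ratio that separates the m and r families."""
    spec = SPECS[instance_class]
    return spec.memory_gib / spec.vcpus


for name in sorted(SPECS):
    print(f"{name}: {gib_per_vcpu(name):.0f} GiB per vCPU")
```

Within a family the ratio stays constant as the size grows, so changing size scales CPU and memory together, while changing family shifts the balance between them.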

The key levers you control are:

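  • Instance Class: Swapping from db.r5.xlarge to db.m5.xlarge (same vCPUs, half the memory) if memory is underutilized, to db.r5.large (half the vCPUs and half the memory) if both CPU and memory have headroom, or to a larger size if CPU is the bottleneck.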
  • Storage Type and Size: Using General Purpose SSD (gp2 or gp3) for most workloads, or Provisioned IOPS SSD (io1/io2) for I/O-intensive applications.
  • Read Replicas: Offloading read traffic to replicas to reduce load on the primary instance.
  • Database Engine Specific Parameters: Tuning shared_buffers in PostgreSQL or innodb_buffer_pool_size in MySQL.
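
Engine parameters interact with the instance class because they are usually sized relative to instance memory. As a rough sketch, the PostgreSQL documentation suggests about 25% of system memory as a reasonable starting value for shared_buffers on a dedicated server. The helper below converts that guideline into 8 kB pages, the unit PostgreSQL uses for this setting. The function name and the default fraction are assumptions for illustration, not an RDS recommendation.

```python
GIB = 1024 ** 3
PAGE_BYTES = 8192  # PostgreSQL's default block size (8 kB)


def shared_buffers_pages(instance_memory_gib: float, fraction: float = 0.25) -> int:
    """Starting-point shared_buffers value, expressed in 8 kB pages.

    `fraction` is an assumed guideline, not a tuned value; measure before
    and after changing it.
    """
    if not 0 < fraction < 1:
        raise ValueError("fraction must be between 0 and 1 (exclusive)")
    return int(instance_memory_gib * GIB * fraction) // PAGE_BYTES


pages = shared_buffers_pages(32)  # e.g. an instance with 32 GiB of RAM
print(pages, "pages =", pages * PAGE_BYTES // GIB, "GiB")
```

After changing the instance class, it is worth rechecking any parameter that was set to an absolute value rather than derived from instance memory.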

When a database instance shows consistently high CPU utilization (e.g., > 80% for sustained periods) but low memory usage, it’s a strong signal that the instance is compute-bound. Migrating to a class with similar memory but more vCPUs, such as going from db.r5.xlarge (4 vCPUs, 32 GiB RAM) to db.m5.2xlarge (8 vCPUs, 32 GiB RAM), can provide the necessary processing power without paying for extra memory. Conversely, if memory utilization is consistently high (e.g., > 90%) and CPU is relatively low, a memory-optimized class helps: moving from db.m5.xlarge (4 vCPUs, 16 GiB RAM) to db.r5.xlarge (4 vCPUs, 32 GiB RAM) doubles memory per vCPU without adding cores. The choice hinges on analyzing CloudWatch metrics for CPU (CPUUtilization), memory (FreeableMemory), network (NetworkReceiveThroughput, NetworkTransmitThroughput), and disk I/O (ReadIOPS, WriteIOPS).
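
The decision rule above can be sketched as a small classifier. It assumes you have already reduced the CloudWatch data to sustained averages; the 80% and 90% thresholds come from the text, while the 40% "low" threshold, the function names, and the category labels are assumptions for this example. Memory utilization is derived from FreeableMemory, which CloudWatch reports in bytes.

```python
GIB = 1024 ** 3


def memory_utilization_pct(freeable_bytes: float, total_bytes: float) -> float:
    """Convert CloudWatch FreeableMemory (bytes) into a utilization percentage."""
    return 100.0 * (1.0 - freeable_bytes / total_bytes)


def classify(cpu_pct: float, mem_pct: float,
             cpu_high: float = 80.0, mem_high: float = 90.0,
             low: float = 40.0) -> str:
    """Classify sustained CPU and memory utilization into a sizing hint.

    The `low` threshold is an illustrative assumption, not a rule from AWS.
    """
    if cpu_pct > cpu_high and mem_pct > mem_high:
        return "scale-up"            # both exhausted: larger size, same family
    if cpu_pct > cpu_high:
        return "compute-bound"       # more vCPUs, similar memory
    if mem_pct > mem_high:
        return "memory-bound"        # more memory per vCPU
    if cpu_pct < low and mem_pct < low:
        return "downsize-candidate"  # headroom on both axes
    return "ok"


mem_pct = memory_utilization_pct(freeable_bytes=8 * GIB, total_bytes=32 * GIB)
print(mem_pct, classify(cpu_pct=85.0, mem_pct=mem_pct))
```

A hint like "downsize-candidate" is only a starting point. Peak traffic, failover behavior, and maintenance windows should be checked before changing the class.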

A less obvious aspect of RDS instance sizing is that the network can become the limiting factor before CPU or memory, especially with high-throughput applications or heavy use of read replicas. Instance classes have defined network performance tiers, and reaching these limits can show up as slow queries or increased latency even when CPU and memory metrics look healthy. A db.r5.xlarge, for example, is rated at “up to 10 Gbps”, which describes a burst ceiling rather than a guaranteed sustained rate; smaller sizes in a family sustain less than that figure. If network throughput plateaus while CPU and memory still have headroom, you are probably at that ceiling. Moving to a larger size in the family, which carries a higher sustained network allowance, might be necessary even if CPU and memory utilization suggest you could stay smaller. Check the current AWS instance class tables for the exact figures.
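
A quick way to check for a network ceiling is to convert the CloudWatch throughput metrics, which are reported in bytes per second, into Gbps and compare them with the limit you expect for the class. The sketch below makes two assumptions: the ceiling is supplied by you from the AWS documentation, and the busier of the two directions is the one that matters. The function names and the 80% warning fraction are hypothetical.

```python
def bytes_per_sec_to_gbps(bytes_per_sec: float) -> float:
    """CloudWatch reports network throughput in bytes/second; limits are quoted in Gbps."""
    return bytes_per_sec * 8 / 1e9


def near_network_ceiling(rx_bytes_per_sec: float, tx_bytes_per_sec: float,
                         ceiling_gbps: float, warn_fraction: float = 0.8) -> bool:
    """True when the busier direction reaches warn_fraction of the assumed ceiling."""
    peak_gbps = max(bytes_per_sec_to_gbps(rx_bytes_per_sec),
                    bytes_per_sec_to_gbps(tx_bytes_per_sec))
    return peak_gbps >= warn_fraction * ceiling_gbps


# Hypothetical sustained readings and a hypothetical 1 Gbps sustained ceiling.
rx = 112_500_000  # NetworkReceiveThroughput, bytes/second (0.9 Gbps)
tx = 50_000_000   # NetworkTransmitThroughput, bytes/second (0.4 Gbps)
print(near_network_ceiling(rx, tx, ceiling_gbps=1.0))
```

If the check trips while CPU and memory look healthy, that is the pattern described above, and a larger size is worth testing.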

Once you’ve successfully right-sized your instance, the next challenge is often optimizing the database queries themselves.
