
AWS RDS Storage Full — Every Fix and Prevention (2026)


DevOpsBoys · Apr 28, 2026 · 5 min read

Your RDS instance storage hits 100% and the database goes read-only. Writes fail. Your application throws errors. Here's the immediate fix and how to prevent it from happening again.


What Happens When RDS Storage Is Full

When an RDS instance runs out of storage:

  1. The instance enters the storage-full state; PostgreSQL and MySQL effectively become read-only
  2. All write operations (INSERT, UPDATE, DELETE) fail with errors like:
    • PostgreSQL: ERROR: could not extend file: No space left on device
    • MySQL: ERROR 1114 (HY000): The table is full
  3. New connections may also fail

Alert threshold: RDS raises a low storage space event when free space drops below a storage-type-dependent threshold (for Provisioned IOPS, the smaller of 10% of allocated storage or 200 MB; gp2/gp3 use similar limits). By then you're minutes from an outage, so set your own CloudWatch alarm well before that point.


Immediate Fix: Increase Storage

Via AWS Console:

  1. RDS → Databases → Your instance → Modify
  2. Allocated Storage → increase by 20–30% (AWS requires the new value to be at least 10% higher than the current one)
  3. Choose Apply immediately to start the change now, or let it wait for the next maintenance window

Via AWS CLI:

bash
# Check current storage
aws rds describe-db-instances \
  --db-instance-identifier my-db \
  --query 'DBInstances[0].AllocatedStorage'
 
# Increase storage (can only increase, not decrease)
aws rds modify-db-instance \
  --db-instance-identifier my-db \
  --allocated-storage 200 \
  --apply-immediately
 
# Monitor the modification
aws rds describe-db-instances \
  --db-instance-identifier my-db \
  --query 'DBInstances[0].PendingModifiedValues'

Important: Storage scaling happens online for both Single-AZ and Multi-AZ instances — the database stays available, though I/O performance can dip while the volume is optimized. Also note that after any storage change, you must wait six hours before modifying storage again.

After increase, the database automatically returns to read-write mode once storage is available.
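Rather than refreshing the console, you can poll until the change lands. This is a sketch: my-db and the 30-second interval are placeholders, it assumes a configured AWS CLI, and it assumes `--output text` prints `None` (or nothing) once PendingModifiedValues empties out.

```shell
#!/bin/sh
# Return the instance's pending modifications ("None" or empty once applied)
get_pending() {
  aws rds describe-db-instances \
    --db-instance-identifier "$1" \
    --query 'DBInstances[0].PendingModifiedValues' \
    --output text
}

# Poll until RDS reports no pending modifications
wait_for_modification() {
  while p=$(get_pending "$1"); [ -n "$p" ] && [ "$p" != "None" ]; do
    echo "still modifying $1 ..."
    sleep 30
  done
  echo "modification applied"
}

# Usage: wait_for_modification my-db
```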


Fix Storage Full Immediately (Without Resizing)

If you need to free space NOW before you can resize:

PostgreSQL — Free up dead tuples with VACUUM

sql
-- Check table bloat
SELECT
  schemaname,
  tablename,
  pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS total_size,
  pg_size_pretty(pg_relation_size(schemaname||'.'||tablename)) AS table_size,
  n_dead_tup
FROM pg_stat_user_tables
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC
LIMIT 10;
 
-- Plain VACUUM marks dead-tuple space reusable without extra disk (safe at 100% full)
VACUUM ANALYZE large_table_name;

-- VACUUM FULL returns space to the OS but rewrites the whole table first,
-- needing free space roughly equal to the table's size — avoid it on a full disk
-- VACUUM FULL ANALYZE large_table_name;
 
-- Check WAL archiving isn't stuck (can eat disk fast)
SELECT * FROM pg_stat_archiver;

MySQL — Free space

sql
-- Find largest tables
SELECT
  table_schema,
  table_name,
  ROUND((data_length + index_length) / 1024 / 1024) AS size_mb
FROM information_schema.TABLES
ORDER BY size_mb DESC
LIMIT 10;
 
-- Truncate or delete old data from large tables
DELETE FROM audit_logs WHERE created_at < DATE_SUB(NOW(), INTERVAL 90 DAY);
 
-- Optimize table (reclaims fragmented space; note it rebuilds the table,
-- so it temporarily needs free space roughly the size of the table)
OPTIMIZE TABLE audit_logs;
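One caveat on the DELETE above: removing millions of rows in a single statement holds locks and grows the undo log while the disk is already tight. A batched variant is safer (a sketch; audit_logs and created_at are the same placeholder names as above):

```sql
-- Delete in bounded chunks to keep locks and undo log small
DELETE FROM audit_logs
WHERE created_at < DATE_SUB(NOW(), INTERVAL 90 DAY)
LIMIT 10000;

-- ROW_COUNT() reports rows deleted by the last statement;
-- repeat (from a script or cron job) until it returns 0
SELECT ROW_COUNT();
```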

Root Cause: What's Eating Your Storage

1. Unbounded Table Growth

The most common cause. A table (logs, events, sessions) grows forever with no deletion or archival policy.

sql
-- PostgreSQL: find tables growing fastest
SELECT
  relname,
  pg_size_pretty(pg_total_relation_size(relid)) AS size
FROM pg_catalog.pg_statio_user_tables
ORDER BY pg_total_relation_size(relid) DESC;

Fix: Implement data retention policies:

sql
-- PostgreSQL: delete old records automatically
DELETE FROM application_logs
WHERE created_at < NOW() - INTERVAL '90 days';
 
-- Or use table partitioning for efficient deletion
CREATE TABLE logs_2026_03 PARTITION OF logs
FOR VALUES FROM ('2026-03-01') TO ('2026-04-01');
 
-- Drop old partition instantly (no lock, no bloat)
DROP TABLE logs_2026_01;
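On RDS for PostgreSQL, the retention DELETE can be scheduled inside the database with the pg_cron extension (a sketch; it assumes pg_cron has been enabled via the parameter group's shared_preload_libraries, and application_logs/created_at are placeholders):

```sql
-- Enable the extension once (requires pg_cron in shared_preload_libraries)
CREATE EXTENSION IF NOT EXISTS pg_cron;

-- Run the retention delete every night at 03:00 UTC
SELECT cron.schedule(
  'purge-old-logs',
  '0 3 * * *',
  $$DELETE FROM application_logs WHERE created_at < NOW() - INTERVAL '90 days'$$
);
```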

2. Binary Logs (MySQL)

MySQL binary logs (binlog) used for replication can accumulate rapidly.

bash
# Check binlog size on RDS
mysql -e "SHOW BINARY LOGS;"
 
# Check RDS parameter for retention
aws rds describe-db-parameters \
  --db-parameter-group-name my-parameter-group \
  --query "Parameters[?ParameterName=='binlog_retention_hours']"

sql
-- Set binlog retention (RDS-specific stored procedure)
CALL mysql.rds_set_configuration('binlog retention hours', 24);

3. PostgreSQL WAL Files

Write-Ahead Log files can pile up if archiving is stuck or a replication slot is not consumed.

sql
-- Check replication slots (can prevent WAL cleanup)
SELECT slot_name, restart_lsn, pg_size_pretty(
  pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)
) AS retained_wal
FROM pg_replication_slots;
 
-- Drop stale replication slot (if no longer needed)
SELECT pg_drop_replication_slot('stale_slot_name');
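On PostgreSQL 13 and later, you can also cap how much WAL a slot may retain, so a stale slot can't fill the disk in the first place (a sketch; on RDS the cap is set through the parameter group rather than ALTER SYSTEM):

```sql
-- Check the current cap (-1 means unlimited)
SHOW max_slot_wal_keep_size;

-- Slots that exceed the cap show wal_status = 'lost' and stop retaining WAL
SELECT slot_name, wal_status, safe_wal_size
FROM pg_replication_slots;
```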

4. Unused Indexes

Large indexes on rarely queried tables can consume gigabytes while never being used.

sql
-- PostgreSQL: find large unused indexes
SELECT
  schemaname,
  tablename,
  indexname,
  pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
  idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;
 
-- Drop unused index
DROP INDEX CONCURRENTLY unused_index_name;

Enable RDS Storage Auto Scaling (Prevention)

This is the most important prevention step. Enable storage autoscaling so RDS automatically expands when free space drops.

bash
aws rds modify-db-instance \
  --db-instance-identifier my-db \
  --max-allocated-storage 1000 \
  --apply-immediately

With --max-allocated-storage set, RDS automatically increases storage when all of the following hold:

  • Free space has fallen below 10% of allocated storage
  • The low-storage condition has lasted at least 5 minutes
  • At least 6 hours have passed since the last storage modification or optimization

Via Terraform:

hcl
resource "aws_db_instance" "main" {
  identifier        = "my-db"
  engine            = "postgres"
  instance_class    = "db.t3.medium"
  allocated_storage = 100
  
  max_allocated_storage = 1000  # Enable autoscaling up to 1TB
  
  # ... other config
}

CloudWatch Alarms for Storage

Set up alerts before you hit 100%:

bash
# Alert at 20% free storage remaining
aws cloudwatch put-metric-alarm \
  --alarm-name "RDS-LowStorage-my-db" \
  --alarm-description "RDS free storage below 20%" \
  --metric-name FreeStorageSpace \
  --namespace AWS/RDS \
  --statistic Average \
  --dimensions Name=DBInstanceIdentifier,Value=my-db \
  --period 300 \
  --evaluation-periods 2 \
  --threshold 21474836480 \
  --comparison-operator LessThanThreshold \
  --alarm-actions arn:aws:sns:ap-south-1:123456789:ops-alerts
 
# 21474836480 bytes = 20 GiB (20 * 1024^3)
# Adjust the threshold to your instance's allocated storage

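The byte threshold is just 20% of allocated storage; deriving it in the shell avoids arithmetic slips (allocated_gib=100 is a placeholder for your instance's size):

```shell
# Derive the alarm threshold: 20% of allocated storage, in bytes
allocated_gib=100   # placeholder: your instance's allocated storage in GiB
threshold_bytes=$(( allocated_gib * 1024 * 1024 * 1024 * 20 / 100 ))
echo "$threshold_bytes"   # 21474836480 for a 100 GiB instance
```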
Terraform:

hcl
resource "aws_cloudwatch_metric_alarm" "rds_low_storage" {
  alarm_name          = "rds-low-storage-${var.db_identifier}"
  comparison_operator = "LessThanThreshold"
  evaluation_periods  = 2
  metric_name         = "FreeStorageSpace"
  namespace           = "AWS/RDS"
  period              = 300
  statistic           = "Average"
  threshold           = 20 * 1024 * 1024 * 1024  # 20 GiB in bytes
  
  dimensions = {
    DBInstanceIdentifier = var.db_identifier
  }
  
  alarm_actions = [aws_sns_topic.ops_alerts.arn]
}

Grafana Dashboard Query (Prometheus RDS Exporter)

promql
# Free storage as percentage
(
  aws_rds_free_storage_space_average 
  / 
  (aws_rds_free_storage_space_average + aws_rds_used_storage_space_average)
) * 100

Alert at < 20%:

yaml
groups:
- name: rds
  rules:
  - alert: RDSLowFreeStorage
    expr: |
      (aws_rds_free_storage_space_average / 
       (aws_rds_free_storage_space_average + aws_rds_used_storage_space_average)) * 100 < 20
    for: 10m
    labels:
      severity: warning
    annotations:
      summary: "RDS {{ $labels.dbinstance_identifier }} free storage < 20%"

Checklist: After Every Storage Full Incident

  • Storage increased (immediate fix)
  • Autoscaling enabled with max limit set
  • CloudWatch alarm added at 20% threshold
  • Large tables identified and retention policy added
  • Replication slots checked for stale ones
  • Binary log retention set appropriately
  • Runbook updated with steps for next time

RDS storage issues are 100% preventable with autoscaling enabled and proper monitoring. The cost of a storage incident (downtime + emergency response) far exceeds the cost of proactively provisioning more storage.
