AWS CloudFormation Stack Stuck in ROLLBACK_FAILED: Fix It Now
CloudFormation stack stuck in ROLLBACK_FAILED or UPDATE_ROLLBACK_FAILED state? Here's every cause and the exact steps to recover without losing your resources.
A CloudFormation stack in ROLLBACK_FAILED is one of the most frustrating AWS situations. You can't update it. You can't delete it normally. You can't redeploy over it. And the AWS console often gives you a vague error message that doesn't help.
Here's exactly how to diagnose and fix it.
Understanding the States
CREATE attempt fails → ROLLBACK_IN_PROGRESS
→ if rollback succeeds: ROLLBACK_COMPLETE (stack must be deleted and recreated)
→ if rollback fails: ROLLBACK_FAILED (stuck)
UPDATE attempt fails → UPDATE_ROLLBACK_IN_PROGRESS
→ if rollback succeeds: UPDATE_ROLLBACK_COMPLETE (safe)
→ if rollback fails: UPDATE_ROLLBACK_FAILED (stuck)
DELETE attempt fails → DELETE_FAILED (stuck)
UPDATE_ROLLBACK_FAILED is the most common stuck state. It means: your update failed, AND the attempt to revert to the previous state also failed.
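As a quick reference, the mapping from stack status to recovery path can be sketched as a small shell function. The name `next_step` and the wording of its output are hypothetical, made up for this illustration; only the status names come from CloudFormation.

```shell
# Hypothetical helper: map a stack status to the recovery path covered in
# this article. Prints one line; returns 1 for input it does not recognize.
next_step() {
  case "$1" in
    UPDATE_ROLLBACK_FAILED)            echo "continue-update-rollback" ;;
    DELETE_FAILED)                     echo "delete-stack --retain-resources" ;;
    ROLLBACK_FAILED|ROLLBACK_COMPLETE) echo "delete-stack, then recreate" ;;
    *_IN_PROGRESS)                     echo "wait" ;;
    *_FAILED)                          echo "inspect events" ;;
    *_COMPLETE)                        echo "none (stable)" ;;
    *)                                 echo "unknown status: $1" >&2; return 1 ;;
  esac
}
```

You could feed it the live status with `next_step "$(aws cloudformation describe-stacks --stack-name my-stack --query 'Stacks[0].StackStatus' --output text)"`.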
Step 1 — Find the Root Cause
In the AWS Console, open CloudFormation → your stack → Events tab.
Check the Status reason column of the red FAILED events. Events are listed newest first, so read from bottom to top: the earliest failure is usually the root cause.
# Via CLI
aws cloudformation describe-stack-events \
--stack-name my-stack \
--query 'StackEvents[?contains(ResourceStatus, `FAILED`)].[Timestamp,LogicalResourceId,ResourceStatus,ResourceStatusReason]' \
--output table
Common error messages and what they mean:
| Error | Meaning |
|---|---|
| Resource is not in the state rollback | Resource was manually modified outside CloudFormation |
| DELETE_FAILED: DependencyViolation | Resource has dependencies that must be deleted first |
| The following resource(s) failed to rollback: [MyBucket] | S3 bucket not empty, or resource was deleted manually |
| Limit exceeded | Service quota hit during rollback |
Step 2 — Continue Update Rollback (Skip Failing Resources)
AWS provides a way to skip specific resources during rollback. This is safe when the resource was manually deleted or modified outside CloudFormation.
aws cloudformation continue-update-rollback \
--stack-name my-stack \
--resources-to-skip LogicalResourceId1 LogicalResourceId2
Example: skipping a manually deleted S3 bucket
aws cloudformation continue-update-rollback \
--stack-name my-stack \
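--resources-to-skip MyS3Bucket
This tells CloudFormation: "Skip this resource during rollback and assume it's fine." CloudFormation marks each skipped resource as UPDATE_COMPLETE without checking it, so the stack's recorded state may no longer match the real resource.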
After this succeeds, your stack goes to UPDATE_ROLLBACK_COMPLETE. You can then make a new update to fix the skipped resource.
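The flow above can be wrapped in a guard so the rollback is only retried when the stack is actually stuck. This is a minimal sketch: the function name `continue_rollback` is hypothetical, and it assumes your credentials allow `describe-stacks`, `continue-update-rollback`, and the `stack-rollback-complete` waiter.

```shell
# Hypothetical wrapper: continue a failed update rollback, optionally skipping
# resources, then block until the rollback finishes.
# Usage: continue_rollback STACK [LogicalId ...]
continue_rollback() {
  cr_stack=$1
  shift
  cr_status=$(aws cloudformation describe-stacks \
    --stack-name "$cr_stack" \
    --query 'Stacks[0].StackStatus' --output text) || return 1

  # Only UPDATE_ROLLBACK_FAILED stacks can continue a rollback.
  if [ "$cr_status" != "UPDATE_ROLLBACK_FAILED" ]; then
    echo "stack $cr_stack is in $cr_status, nothing to continue" >&2
    return 1
  fi

  if [ "$#" -gt 0 ]; then
    aws cloudformation continue-update-rollback \
      --stack-name "$cr_stack" --resources-to-skip "$@"
  else
    aws cloudformation continue-update-rollback --stack-name "$cr_stack"
  fi || return 1

  # Waits for UPDATE_ROLLBACK_COMPLETE; exits non-zero if the rollback fails again.
  aws cloudformation wait stack-rollback-complete --stack-name "$cr_stack"
}
```

For example, `continue_rollback my-stack MyS3Bucket` retries the rollback while skipping the bucket. If it fails again, go back to the events and look for the next failing resource.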
Cause 1: Resource Manually Modified Outside CloudFormation
Someone went into the console and changed a security group, deleted a resource, or added a tag manually. Now CloudFormation can't revert it.
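If you are not sure which resources were touched, drift detection can help narrow it down. The sketch below uses a hypothetical function name, `show_drift`, and assumes drift detection is permitted on the stack in its current state (to my knowledge it is allowed on UPDATE_ROLLBACK_FAILED stacks, but verify this for your case). Not every resource type supports drift detection.

```shell
# Hypothetical helper: run drift detection and list resources that were
# modified or deleted outside CloudFormation.
show_drift() {
  drift_stack=$1
  drift_id=$(aws cloudformation detect-stack-drift \
    --stack-name "$drift_stack" \
    --query StackDriftDetectionId --output text) || return 1

  # Poll until detection finishes (the 5-second interval is an arbitrary choice).
  while :; do
    drift_state=$(aws cloudformation describe-stack-drift-detection-status \
      --stack-drift-detection-id "$drift_id" \
      --query DetectionStatus --output text) || return 1
    [ "$drift_state" = "DETECTION_IN_PROGRESS" ] || break
    sleep 5
  done

  # Show only resources whose live state differs from the template.
  aws cloudformation describe-stack-resource-drifts \
    --stack-name "$drift_stack" \
    --stack-resource-drift-status-filters MODIFIED DELETED \
    --query 'StackResourceDrifts[].[LogicalResourceId,ResourceType,StackResourceDriftStatus]' \
    --output table
}
```

Run it as `show_drift my-stack`. Anything listed as MODIFIED or DELETED is a candidate for `--resources-to-skip` or a manual revert.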
Diagnose:
aws cloudformation describe-stack-resource \
--stack-name my-stack \
--logical-resource-id MySecurityGroup \
--query 'StackResourceDetail.{Status:ResourceStatus,Reason:ResourceStatusReason}'
Fix: Either skip the resource in continue-update-rollback, or manually revert the resource to its previous state before retrying:
# Option 1: Skip the resource
aws cloudformation continue-update-rollback \
--stack-name my-stack \
--resources-to-skip MySecurityGroup
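# Option 2: If the resource ended up outside the stack, import it back once the stack is stable.
# Import change sets require a stable stack status (for example UPDATE_ROLLBACK_COMPLETE)
# and a DeletionPolicy on every imported resource in the template.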
aws cloudformation create-change-set \
--stack-name my-stack \
--change-set-name import-fix \
--change-set-type IMPORT \
--resources-to-import '[{"ResourceType":"AWS::EC2::SecurityGroup","LogicalResourceId":"MySecurityGroup","ResourceIdentifier":{"GroupId":"sg-abc123"}}]' \
--template-body file://template.yaml
Cause 2: S3 Bucket Not Empty
CloudFormation can't delete a non-empty S3 bucket. If your stack update involves deleting a bucket, and the bucket has objects, rollback fails.
Fix — Empty the bucket first:
# List and delete all object versions
# (delete-objects accepts at most 1,000 keys per request, so larger buckets need batching)
aws s3api list-object-versions --bucket my-bucket \
--query '{Objects: Versions[].{Key:Key,VersionId:VersionId}, Quiet: `true`}' \
--output json > delete.json
aws s3api delete-objects --bucket my-bucket --delete file://delete.json
# Delete the delete markers (skip this step if the bucket has none)
aws s3api list-object-versions --bucket my-bucket \
--query '{Objects: DeleteMarkers[].{Key:Key,VersionId:VersionId}, Quiet: `true`}' \
--output json > deletemarkers.json
aws s3api delete-objects --bucket my-bucket --delete file://deletemarkers.json
# Now retry rollback or delete
aws cloudformation continue-update-rollback --stack-name my-stack
Cause 3: Stack in DELETE_FAILED State
You tried to delete the stack and it failed. Common reasons:
- Non-empty S3 bucket
- RDS with deletion protection enabled
- VPC with attached resources (IGW, subnets with ENIs)
Fix — Delete with resource retention:
# Delete stack but keep specific resources
aws cloudformation delete-stack \
--stack-name my-stack \
--retain-resources MyS3Bucket MyRDSInstance
These resources stay in your account but are removed from the stack. Clean them up manually. Note that --retain-resources is only accepted when the stack is already in DELETE_FAILED.
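Rather than typing the logical IDs by hand, you can ask CloudFormation which resources failed to delete and retain exactly those. A minimal sketch, with a hypothetical function name `delete_retaining_failed`; it assumes the stack is already in DELETE_FAILED and that you really do want to keep every resource that failed.

```shell
# Hypothetical helper: retry stack deletion while retaining every resource
# whose own status is DELETE_FAILED.
delete_retaining_failed() {
  drf_stack=$1
  drf_failed=$(aws cloudformation list-stack-resources \
    --stack-name "$drf_stack" \
    --query 'StackResourceSummaries[?ResourceStatus==`DELETE_FAILED`].LogicalResourceId' \
    --output text) || return 1

  if [ -z "$drf_failed" ] || [ "$drf_failed" = "None" ]; then
    echo "no DELETE_FAILED resources found in $drf_stack" >&2
    return 1
  fi

  echo "Retaining:" $drf_failed >&2
  # $drf_failed is left unquoted on purpose so each logical ID becomes its own argument.
  aws cloudformation delete-stack \
    --stack-name "$drf_stack" --retain-resources $drf_failed
}
```

Call it as `delete_retaining_failed my-stack`, and note the IDs it prints so you can clean up the retained resources afterwards.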
Fix — Disable deletion protection first:
# For RDS
aws rds modify-db-instance \
--db-instance-identifier my-db \
--no-deletion-protection
# Then retry stack deletion
aws cloudformation delete-stack --stack-name my-stack
Cause 4: Service Quota Exceeded During Rollback
Rollback tried to re-create resources but hit an AWS service limit.
Diagnose:
aws cloudformation describe-stack-events \
--stack-name my-stack \
--query 'StackEvents[?ResourceStatusReason && (contains(ResourceStatusReason, `limit`) || contains(ResourceStatusReason, `Limit`))].[LogicalResourceId,ResourceStatusReason]' \
--output table
Fix:
- Request a limit increase via Service Quotas console
- Or skip the resource and add it back manually:
aws cloudformation continue-update-rollback \
--stack-name my-stack \
--resources-to-skip MyResource
Nuclear Option: Force Delete a Stuck Stack
If nothing works and you need to remove the stack:
# AWS CLI v2 supports --deletion-mode
aws cloudformation delete-stack \
--stack-name my-stack \
--deletion-mode FORCE_DELETE_STACK
Warning: FORCE_DELETE_STACK applies to stacks that are already in DELETE_FAILED. CloudFormation deletes what it can and retains any resource that fails to delete. Retained resources stay in your account, so you'll need to clean those up manually.
Prevention Checklist
- Never modify CloudFormation-managed resources manually — always update via template
- Enable termination protection on production stacks:
aws cloudformation update-termination-protection \
--enable-termination-protection \
--stack-name my-stack
- Use DeletionPolicy: Retain for stateful resources (S3, RDS, DynamoDB):
MyBucket:
  Type: AWS::S3::Bucket
  DeletionPolicy: Retain
  UpdateReplacePolicy: Retain
- Test updates in a staging stack first; never test risky changes in prod
CloudFormation stuck states are almost always recoverable. The key is reading the event log carefully to find the actual failing resource, then skipping or fixing it before retrying rollback.