AWS ECS Service Still Running Old Task Definition After Update
ECS service ignoring your new task definition revision? Fix it with force-new-deployment, understand pinned ARNs vs LATEST, check the deployment circuit breaker, and verify which revision is actually running.
You registered a new ECS task definition revision. The console says revision 12 is active. But your service is still serving traffic from revision 10. No new containers launched. No error. Just silence.
This is one of the most confusing ECS gotchas — and the fix is simple once you understand how ECS services track task definitions.
Why ECS Services Ignore New Revisions
When you create an ECS service, it pins to a specific task definition ARN, not just the family name. Registering a new revision does not automatically update running services. The service keeps using whatever ARN it was created or last updated with.
Check what your service is currently pinned to:
aws ecs describe-services \
  --cluster my-cluster \
  --services my-service \
  --query "services[0].taskDefinition"
Output:
"arn:aws:ecs:ap-south-1:123456789:task-definition/my-app:10"
It's pinned to revision 10. Your new revision 12 exists but the service doesn't know about it.
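Because the revision is the last colon-separated field of the ARN, a script can extract it with plain parameter expansion. A minimal sketch, assuming a POSIX-compatible shell; taskdef_name and taskdef_revision are hypothetical helper names, not part of the AWS CLI:

```shell
# Hypothetical helper: print the "family:revision" part of a task definition ARN.
taskdef_name() {
  # Strip everything up to and including the last "/".
  printf '%s\n' "${1##*/}"
}

# Hypothetical helper: print only the revision number.
taskdef_revision() {
  # Strip everything up to and including the last ":".
  printf '%s\n' "${1##*:}"
}

# Example with the ARN from the output above:
#   taskdef_name "arn:aws:ecs:ap-south-1:123456789:task-definition/my-app:10"      # my-app:10
#   taskdef_revision "arn:aws:ecs:ap-south-1:123456789:task-definition/my-app:10"  # 10
```

Comparing the extracted revision against the one you just registered is a quick way to detect a stale service in a script.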
Fix 1: Update the Service to Point to New Revision
Explicitly tell the service to use the new revision:
aws ecs update-service \
  --cluster my-cluster \
  --service my-service \
  --task-definition my-app:12
Or pass only the family name, with no revision number (not recommended for production, since it can cause unexpected rollouts). Note that ECS has no literal LATEST keyword:
aws ecs update-service \
  --cluster my-cluster \
  --service my-service \
  --task-definition my-app
When you omit the revision number, ECS resolves the family to its latest ACTIVE revision at update time and pins the service to that ARN.
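If you want the convenience of "latest" without the surprise, one option is to resolve the newest revision yourself, log it, and pin the service to that exact ARN. A sketch, assuming the AWS CLI is installed and configured; pin_latest_revision is a hypothetical function name:

```shell
# Hypothetical helper: resolve the newest ACTIVE revision of a family,
# print it, then pin the service to that exact ARN.
# Usage: pin_latest_revision my-cluster my-service my-app
pin_latest_revision() {
  local cluster="$1" service="$2" family="$3" arn

  # Passing only the family name returns the latest ACTIVE revision.
  arn=$(aws ecs describe-task-definition \
    --task-definition "$family" \
    --query "taskDefinition.taskDefinitionArn" \
    --output text) || return 1

  echo "Pinning $service to $arn"

  # Print the ARN the service reports after the update.
  aws ecs update-service \
    --cluster "$cluster" \
    --service "$service" \
    --task-definition "$arn" \
    --query "service.taskDefinition" \
    --output text
}
```

The logged ARN gives you a record of exactly which revision each deploy used, which a bare family name does not.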
Fix 2: Force New Deployment Without Changing Task Def
If you want ECS to replace running tasks with the same task definition (useful when you've pushed a new Docker image to the same tag), use:
aws ecs update-service \
  --cluster my-cluster \
  --service my-service \
  --force-new-deployment
This triggers a deployment even if the task definition ARN hasn't changed. Common use case: you pushed a new image to my-app:latest and want ECS to pull it fresh.
Verify Which Revision Is Actually Running
Don't trust the service console alone. Check the running tasks:
# Get task ARNs
aws ecs list-tasks \
  --cluster my-cluster \
  --service-name my-service \
  --query "taskArns"
# Describe tasks to see which revision is running
aws ecs describe-tasks \
  --cluster my-cluster \
  --tasks arn:aws:ecs:ap-south-1:123456789:task/abc123 \
  --query "tasks[0].taskDefinitionArn"
Expected output after a successful update:
"arn:aws:ecs:ap-south-1:123456789:task-definition/my-app:12"
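The commands above inspect one task at a time. To see every running task at once, a sketch like the following counts tasks per task definition ARN. running_revisions is a hypothetical name, and the sketch assumes the service runs at most 100 tasks, the limit for a single describe-tasks call:

```shell
# Hypothetical helper: count running tasks per task definition ARN.
# Usage: running_revisions my-cluster my-service
running_revisions() {
  local cluster="$1" service="$2" task_arns

  task_arns=$(aws ecs list-tasks \
    --cluster "$cluster" \
    --service-name "$service" \
    --desired-status RUNNING \
    --query "taskArns" \
    --output text) || return 1

  # describe-tasks rejects an empty task list, so stop early.
  if [ -z "$task_arns" ] || [ "$task_arns" = "None" ]; then
    echo "no running tasks"
    return 0
  fi

  # $task_arns is left unquoted on purpose so each ARN becomes its own argument.
  aws ecs describe-tasks \
    --cluster "$cluster" \
    --tasks $task_arns \
    --query "tasks[*].taskDefinitionArn" \
    --output text | tr '\t' '\n' | sort | uniq -c
}
```

A single line of output means every task is on the same revision; two lines mean old and new tasks are running side by side.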
Check Desired vs Running Count
If the update triggered but new tasks aren't starting:
aws ecs describe-services \
  --cluster my-cluster \
  --services my-service \
  --query "services[0].{desired:desiredCount,running:runningCount,pending:pendingCount,deployments:deployments[*].{status:status,taskDef:taskDefinition,running:runningCount}}"
Output:
{
  "desired": 2,
  "running": 1,
  "pending": 1,
  "deployments": [
    {
      "status": "PRIMARY",
      "taskDef": "arn:aws:ecs:...my-app:12",
      "running": 1
    },
    {
      "status": "ACTIVE",
      "taskDef": "arn:aws:ecs:...my-app:10",
      "running": 1
    }
  ]
}
Two deployments active = rollout in progress. Wait for the ACTIVE deployment to drain to 0.
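The "more than one deployment means a rollout is in progress" rule can also be scripted. A sketch, where rollout_in_progress is a hypothetical name and the JMESPath length() call counts the entries in the deployments array:

```shell
# Hypothetical helper: succeed (exit 0) while the service has more than
# one deployment, meaning old and new tasks still coexist.
# Usage: rollout_in_progress my-cluster my-service
rollout_in_progress() {
  local count
  count=$(aws ecs describe-services \
    --cluster "$1" \
    --services "$2" \
    --query "length(services[0].deployments)" \
    --output text) || return 2
  [ "$count" -gt 1 ]
}

# Example polling loop (the 15-second interval is an arbitrary choice):
#   while rollout_in_progress my-cluster my-service; do sleep 15; done
```

A single remaining deployment does not prove the new revision won, so check which task definition the surviving deployment points to afterwards.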
Deployment Circuit Breaker Blocking the Update
ECS has a deployment circuit breaker that stops a bad rollout and optionally rolls back. If your new task definition has a health check failure or the container crashes on start, the circuit breaker halts the deployment.
Check deployment failures:
aws ecs describe-services \
  --cluster my-cluster \
  --services my-service \
  --query "services[0].deployments[0].{status:status,failedTasks:failedTasks,rolloutState:rolloutState,rolloutStateReason:rolloutStateReason}"
Output when circuit breaker triggers:
{
  "status": "PRIMARY",
  "failedTasks": 3,
  "rolloutState": "FAILED",
  "rolloutStateReason": "ECS deployment circuit breaker: task failed to start"
}
Dig into the stopped tasks to find the actual error:
aws ecs list-tasks \
  --cluster my-cluster \
  --service-name my-service \
  --desired-status STOPPED \
  --query "taskArns[0]"
# stoppedReason is reported on the task; reason and exitCode on each container
aws ecs describe-tasks \
  --cluster my-cluster \
  --tasks <stopped-task-arn> \
  --query "tasks[0].{stoppedReason:stoppedReason,containers:containers[*].{name:name,reason:reason,exitCode:exitCode,lastStatus:lastStatus}}"
Common reasons:
- CannotPullContainerError: ECR image not found or IAM permissions missing
- Essential container in task exited: app crashed at startup (check CloudWatch Logs)
- ResourceInitializationError: Secrets Manager secret or SSM parameter not accessible
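When triaging several stopped tasks, the reasons listed above can be mapped to next steps with a case statement. A sketch only: explain_stop_reason is a hypothetical helper, and its patterns cover just the three reasons named in this article:

```shell
# Hypothetical helper: map a stop reason string to a suggested next step.
# Usage: explain_stop_reason "<reason text from describe-tasks>"
explain_stop_reason() {
  case "$1" in
    *CannotPullContainerError*)
      echo "image pull failed: check the image URI, tag, and ECR/IAM permissions" ;;
    *ResourceInitializationError*)
      echo "init failed: check access to Secrets Manager or SSM parameters" ;;
    *"Essential container"*)
      echo "app exited: check the container's CloudWatch Logs" ;;
    *)
      echo "unrecognized reason: $1" ;;
  esac
}
```

Anything that falls through to the last branch is echoed back unchanged so you can read the raw message.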
Common Gotcha: Updating Task Def in Terraform But Not the Service
In Terraform, if you update aws_ecs_task_definition and your aws_ecs_service has task_definition = aws_ecs_task_definition.app.arn, the service is updated on the next apply. But if you hardcoded the ARN or listed task_definition under lifecycle ignore_changes, the service won't update.
Check your Terraform:
resource "aws_ecs_service" "app" {
  name            = "my-service"
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.app.arn # This should be dynamic
  desired_count   = 2
  deployment_circuit_breaker {
    enable   = true
    rollback = true
  }
}
If rollback = true, a failed deployment auto-reverts to the previous task definition, which is why it looks like "nothing changed."
Quick Checklist
- Registered new task definition revision? Check with aws ecs list-task-definitions --family-prefix my-app
- Service pinned to old ARN? Run describe-services and check taskDefinition
- Run update-service --task-definition my-app:NEW_REVISION
- Deployment stuck? Check rolloutState and stopped task reasons
- Image same tag? Use --force-new-deployment
- Circuit breaker rolled back? Fix the underlying crash, then redeploy
ECS services never pick up new revisions automatically. Every revision bump requires an explicit service update. Build that step into your CI/CD pipeline using the AWS CLI or the aws-actions/amazon-ecs-deploy-task-definition GitHub Action.
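As a sketch of that pipeline step using only the AWS CLI: update the service, wait for it to stabilize, then confirm it ended up on the requested revision. The final check matters because a circuit breaker rollback also leaves the service stable, just on the old revision. deploy_revision is a hypothetical name, and the target is assumed to be given in family:revision form:

```shell
# Hypothetical helper: deploy a specific revision and verify it stuck.
# Usage: deploy_revision my-cluster my-service my-app:12
deploy_revision() {
  local cluster="$1" service="$2" target="$3" actual

  aws ecs update-service \
    --cluster "$cluster" \
    --service "$service" \
    --task-definition "$target" \
    --query "service.taskDefinition" \
    --output text || return 1

  # Blocks until the service reaches a steady state, or fails on timeout.
  aws ecs wait services-stable \
    --cluster "$cluster" \
    --services "$service" || return 1

  actual=$(aws ecs describe-services \
    --cluster "$cluster" \
    --services "$service" \
    --query "services[0].taskDefinition" \
    --output text) || return 1

  # Compare only the trailing "family:revision" part of the ARN.
  if [ "${actual##*/}" = "$target" ]; then
    echo "deployed ${actual##*/}"
  else
    echo "expected $target but service is on ${actual##*/}" >&2
    return 1
  fi
}
```

A non-zero exit status fails the pipeline job, so a silent rollback shows up as a red build instead of going unnoticed.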