Kubernetes CrashLoopBackOff After Changing Resource Limits — Fix
Your pod worked fine until you updated its resource limits, and now it's in CrashLoopBackOff. Here's exactly why this happens and how to fix it without downtime.
You updated resource limits on a running deployment — lowered CPU or memory to save costs — and now the pod is stuck in CrashLoopBackOff. The app was fine before. Nothing else changed.
Here's every reason this happens and exactly how to fix it.
Why Resource Limit Changes Cause CrashLoopBackOff
When you change resource limits in a Deployment's pod template, Kubernetes rolls out replacement pods with the new constraints. If the new limits are too low, the container crashes shortly after starting, and the repeated crashes trigger the backoff loop.
Three things kill your container after a limit change:
1. OOMKilled: the memory limit is too low, so the kernel kills the process
2. CPU throttling so severe that the app times out: the liveness probe fails and the container restarts
3. JVM / runtime startup fails: Java apps need memory headroom at startup, not just at steady state
Diagnose First
# Check why pod is crashing
kubectl describe pod <pod-name> -n <namespace>
# Look for these in the output:
# State: Terminated
# Reason: OOMKilled ← memory too low
# Exit Code: 137 ← SIGKILL (128 + 9), usually OOMKilled
# Exit Code: 1 ← app crashed (check logs)
# Check actual resource usage before crash
kubectl top pod <pod-name> -n <namespace>
# Check logs from the PREVIOUS container run
kubectl logs <pod-name> -n <namespace> --previous
Case 1: OOMKilled (Most Common)
kubectl describe pod my-app-xxx -n production
# ...
# Last State: Terminated
# Reason: OOMKilled
# Exit Code: 137
Fix: Increase the memory limit. Rule of thumb: set the limit at 2x your observed peak usage.
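The 2x rule of thumb is plain arithmetic. Here is a minimal sketch, assuming you already have a peak usage figure in Mi (for example from kubectl top pod or your metrics system). The function name suggest_memory_limit_mi and the sample values are hypothetical, not part of kubectl.

```shell
# Hypothetical helper: apply a headroom multiplier to an observed peak (in Mi).
# The default multiplier of 2 is this article's rule of thumb, not a Kubernetes default.
suggest_memory_limit_mi() {
  peak_mi=$1
  factor=${2:-2}
  echo "$(( peak_mi * factor ))Mi"
}

suggest_memory_limit_mi 230      # prints 460Mi
suggest_memory_limit_mi 230 3    # prints 690Mi
```

For a runtime with a heavy startup phase, pass the startup peak rather than the steady-state figure.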
resources:
  requests:
    memory: "256Mi"
    cpu: "100m"
  limits:
    memory: "512Mi" # was 256Mi, doubled
    cpu: "500m"
For Java apps, memory usage at startup can be 3–4x steady state. Always check startup memory separately:
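# Watch the container's own memory during startup.
# /proc/meminfo reports node-wide memory, so read the cgroup counter instead
# (cgroup v2 path shown; on cgroup v1 use /sys/fs/cgroup/memory/memory.usage_in_bytes).
kubectl exec <pod> -n <namespace> -- cat /sys/fs/cgroup/memory.current
Case 2: Liveness Probe Killing Container Due to CPU Throttle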
With low CPU limits, the app starts slowly. The liveness probe fires before the app is ready and Kubernetes kills it — starting the loop.
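Two quick checks can help confirm this case: whether the container is really being throttled, and how much time the probe settings give the app before a restart. Below is a minimal sketch under stated assumptions. The helper names throttled_pct and earliest_liveness_restart_s are hypothetical, the cpu.stat path assumes a cgroup v2 node, and the timing formula is an approximation that ignores timeoutSeconds and the termination grace period.

```shell
# Hypothetical helper: percentage of CPU scheduling periods in which the
# container was throttled, computed from cpu.stat text on stdin.
throttled_pct() {
  awk '
    $1 == "nr_periods"   { periods = $2 }
    $1 == "nr_throttled" { throttled = $2 }
    END {
      if (periods > 0) {
        print int(throttled * 100 / periods)
      } else {
        print 0
      }
    }
  '
}

# Hypothetical helper: approximate seconds after container start at which the
# kubelet restarts a container whose liveness probe never succeeds.
# Arguments: initialDelaySeconds periodSeconds failureThreshold
earliest_liveness_restart_s() {
  initial_delay=$1
  period=$2
  failure_threshold=$3
  echo $(( initial_delay + (failure_threshold - 1) * period ))
}

# Assumed cgroup v2 path; cgroup v1 nodes expose /sys/fs/cgroup/cpu/cpu.stat instead.
# kubectl exec <pod> -n <namespace> -- cat /sys/fs/cgroup/cpu.stat | throttled_pct

earliest_liveness_restart_s 10 10 3   # prints 30
earliest_liveness_restart_s 60 10 3   # prints 80
```

The exec command only works while the container is running, so run it right after a restart. A high throttled percentage together with a startup time longer than the computed window points to this case rather than to a memory problem.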
kubectl describe pod my-app-xxx
# Warning Unhealthy Liveness probe failed: Get "http://...": context deadline exceeded
Fix option 1: Increase CPU limit temporarily to diagnose:
resources:
  limits:
    cpu: "1000m" # temporarily increased
Fix option 2: Increase liveness probe initialDelaySeconds:
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 60 # was 10, gives the app time to start
  periodSeconds: 10
  failureThreshold: 3
Case 3: App Genuinely Needs More Resources
Check what the app actually used before you changed limits:
# If you have Prometheus/Grafana — check historical metrics
# If not, check node metrics
kubectl top nodes
kubectl top pods -n production --sort-by=memory
Find the right baseline and set limits accordingly:
resources:
  requests:
    memory: "128Mi" # what the app uses at idle
    cpu: "50m"
  limits:
    memory: "384Mi" # 3x request, gives headroom for spikes
    cpu: "300m"
Rollback If You're in a Rush
If production is down, roll back immediately:
# Roll back to the previous deployment revision
kubectl rollout undo deployment/my-app -n production
# Verify rollback
kubectl rollout status deployment/my-app -n production
Then fix the limits properly before re-deploying.
Set VPA to Find the Right Limits Automatically
Instead of guessing, use the Vertical Pod Autoscaler in recommendation mode:
# Install VPA using the install script in the kubernetes/autoscaler repository
git clone https://github.com/kubernetes/autoscaler.git
cd autoscaler/vertical-pod-autoscaler
./hack/vpa-up.sh
# Create VPA object in Off mode (recommendation only)
cat <<EOF | kubectl apply -f -
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
  namespace: production # must be the same namespace as the Deployment
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Off" # Only recommend, don't auto-apply
EOF
# After 24 hours, check recommendations
kubectl describe vpa my-app-vpa -n production
# Containers:
#   Container recommendations:
#     Container Name: my-app
#     Lower Bound: memory: 128Mi, cpu: 50m
#     Target: memory: 256Mi, cpu: 100m
#     Upper Bound: memory: 512Mi, cpu: 200m
Use the Target values as your requests and Upper Bound as your limits.
Quick Reference
| Exit Code | Meaning | Fix |
|---|---|---|
| 137 | OOMKilled | Increase memory limit |
| 143 | Terminated by SIGTERM (128 + 15) | Check preStop hook and shutdown handling |
| 1 | App crash | Check kubectl logs --previous |
| 0 | Main process exited cleanly | Keep the main process running in the foreground; check the container command |
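The codes above 128 in this table follow the shell convention of 128 plus the signal number. Here is a minimal sketch of that decoding; the function name explain_exit_code is hypothetical.

```shell
# Hypothetical helper: translate a container exit code into a short explanation.
# Codes above 128 conventionally mean "terminated by signal (code - 128)".
explain_exit_code() {
  code=$1
  if [ "$code" -eq 0 ]; then
    echo "exited cleanly"
  elif [ "$code" -gt 128 ]; then
    echo "killed by signal $(( code - 128 ))"
  else
    echo "application error"
  fi
}

explain_exit_code 137   # prints: killed by signal 9
explain_exit_code 143   # prints: killed by signal 15
```

Exit code 137 on its own only says the process received SIGKILL. The Reason field in kubectl describe pod is what confirms OOMKilled.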
Always test limit changes in staging first. A staged rollout (maxUnavailable: 1) means one pod at a time gets the new limits — if it crashes, the rest stay up.
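One way to express that staged rollout is a strategy patch on the Deployment. This is a sketch under assumptions: the helper name rollout_patch is hypothetical, the deployment name and namespace reuse the examples above, and maxSurge is left at its default.

```shell
# Hypothetical helper: print a strategic-merge patch that limits the rollout
# to one unavailable pod at a time. maxSurge is left at its default.
rollout_patch() {
  cat <<'EOF'
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
EOF
}

rollout_patch
# To apply it (deployment name and namespace follow the examples above):
# kubectl patch deployment my-app -n production --patch "$(rollout_patch)"
```

Apply the patch before changing the limits so the new strategy governs the rollout that the limit change triggers.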