
Kubernetes CrashLoopBackOff After Changing Resource Limits — Fix

Your pod worked fine before you updated resource limits, now it's in CrashLoopBackOff. Here's exactly why this happens and how to fix it without downtime.

DevOpsBoys · May 28, 2026 · 4 min read

You updated resource limits on a running deployment — lowered CPU or memory to save costs — and now the pod is stuck in CrashLoopBackOff. The app was fine before. Nothing else changed.

Here's every reason this happens and exactly how to fix it.


Why Resource Limit Changes Cause CrashLoopBackOff

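Resource limits are part of the pod template, so changing them on a Deployment triggers a rollout: Kubernetes replaces the existing pods with new ones that run under the new constraints. If the new limits are too low, the container crashes shortly after starting, and the kubelet restarts it with an increasing backoff delay.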

Three things kill your container after a limit change:

1. OOMKilled: the memory limit is too low, so the kernel kills the process
2. Severe CPU throttling: the app responds too slowly, the liveness probe fails, and the container is restarted
3. JVM / runtime startup failure: Java apps need memory headroom at startup, not just at steady state


Diagnose First

bash
# Check why pod is crashing
kubectl describe pod <pod-name> -n <namespace>
 
# Look for these in the output:
# State: Terminated
# Reason: OOMKilled       ← memory limit too low
# Exit Code: 137          ← SIGKILL (128 + 9): usually OOMKilled, sometimes a forced kill
# Exit Code: 1            ← app crashed (check logs)
 
# Check current resource usage (needs metrics-server; shows live values only)
kubectl top pod <pod-name> -n <namespace>
 
# Check logs from the PREVIOUS container run
kubectl logs <pod-name> -n <namespace> --previous
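
The exit codes in that output follow a common convention: a value above 128 means the process was killed by a signal, and the signal number is the exit code minus 128. Here is a minimal sketch of that decoding. The function name `decode_exit_code` is made up for illustration and is not part of kubectl.

```shell
# Hypothetical helper: translate a container exit code into a short explanation.
# To fetch the code for the last crashed run, something like this should work:
#   kubectl get pod <pod-name> -n <namespace> \
#     -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}'
decode_exit_code() {
  code="$1"
  if [ "$code" -eq 0 ]; then
    echo "exited cleanly"
  elif [ "$code" -gt 128 ]; then
    sig=$((code - 128))
    # kill -l <number> prints the signal name without the SIG prefix
    echo "killed by signal $sig (SIG$(kill -l "$sig"))"
  else
    echo "application error"
  fi
}
```

For example, `decode_exit_code 137` reports signal 9 (SIGKILL), which is what the kernel sends when a container exceeds its memory limit.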

Case 1: OOMKilled (Most Common)

bash
kubectl describe pod my-app-xxx -n production
# ...
# Last State: Terminated
#   Reason: OOMKilled
#   Exit Code: 137

Fix: Increase memory limit. Rule of thumb — set limit at 2x your observed peak usage.

yaml
resources:
  requests:
    memory: "256Mi"
    cpu: "100m"
  limits:
    memory: "512Mi"   # was 256Mi — doubled it
    cpu: "500m"
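
The arithmetic behind that rule of thumb can be sketched as a small shell function. The name `suggest_memory_limit` and the default factor of 2 are assumptions taken from the rule above, not Kubernetes defaults, and the input is assumed to be in Mi.

```shell
# Hypothetical helper: suggest a memory limit from an observed peak.
# Accepts "180" or "180Mi"; the optional second argument overrides the 2x factor.
suggest_memory_limit() {
  peak="${1%Mi}"          # strip a trailing "Mi" if present
  factor="${2:-2}"
  echo "$((peak * factor))Mi"
}
```

With an observed peak of 256Mi, `suggest_memory_limit 256Mi` gives the 512Mi used in the example above.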

For Java apps, memory usage at startup can be 3–4x steady state. Always check startup memory separately:

bash
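# Watch the container's own memory usage during startup.
# /proc/meminfo reports node-wide memory, so read the cgroup counter instead.
kubectl exec <pod> -n <namespace> -- cat /sys/fs/cgroup/memory.current                 # cgroup v2
kubectl exec <pod> -n <namespace> -- cat /sys/fs/cgroup/memory/memory.usage_in_bytes   # cgroup v1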
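
A single reading can miss the startup spike, so it may help to sample repeatedly and keep the maximum. The sketch below assumes the node uses cgroup v2 (so the counter lives at `/sys/fs/cgroup/memory.current`) and that the image ships `cat`. The function name `peak_mi` is hypothetical.

```shell
# Hypothetical helper: read byte counts from stdin (one per line)
# and print the largest one, converted to Mi.
peak_mi() {
  awk '$1 > max { max = $1 } END { printf "%dMi\n", max / 1048576 }'
}

# Example: sample once per second for 30 seconds while the container starts.
#   for i in $(seq 1 30); do
#     kubectl exec <pod> -n <namespace> -- cat /sys/fs/cgroup/memory.current
#     sleep 1
#   done | peak_mi
```

If the container restarts mid-loop, the failed `kubectl exec` calls write to stderr only, so the surviving samples still produce a usable peak.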

Case 2: Liveness Probe Killing Container Due to CPU Throttle

With low CPU limits, the app starts slowly. The liveness probe fires before the app is ready and Kubernetes kills it — starting the loop.

bash
kubectl describe pod my-app-xxx
# Warning  Unhealthy  Liveness probe failed: Get "http://...": context deadline exceeded

Fix option 1: Increase CPU limit temporarily to diagnose:

yaml
resources:
  limits:
    cpu: "1000m"   # temporarily increased

Fix option 2: Increase liveness probe initialDelaySeconds (on Kubernetes 1.20+, a startupProbe is a cleaner way to protect slow starts):

yaml
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 60   # was 10 — give app time to start
  periodSeconds: 10
  failureThreshold: 3
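
To judge whether a probe configuration leaves enough startup time, you can estimate when the kubelet gives up. The sketch below assumes the first probe runs at initialDelaySeconds, later probes run every periodSeconds, and every probe fails. It ignores timeoutSeconds and scheduling jitter, so treat the result as a rough figure. The function name `liveness_deadline` is hypothetical.

```shell
# Hypothetical helper: approximate seconds after container start at which
# the last allowed liveness probe fails and the container is restarted.
# Arguments: initialDelaySeconds periodSeconds failureThreshold
liveness_deadline() {
  initial_delay="$1"
  period="$2"
  failure_threshold="$3"
  echo $((initial_delay + (failure_threshold - 1) * period))
}
```

With the old delay, `liveness_deadline 10 10 3` gives about 30 seconds; with the new one, `liveness_deadline 60 10 3` gives about 80. An app that needs 45 seconds to boot under a tight CPU limit survives only the second configuration.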

Case 3: App Genuinely Needs More Resources

Check what the app actually used before you changed limits:

bash
# If you have Prometheus/Grafana, check historical metrics there
# If not, check current usage (requires metrics-server)
kubectl top nodes
kubectl top pods -n production --sort-by=memory
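
To spot pods that already sit above a limit you are considering, the `kubectl top` output can be filtered. This is a sketch that assumes the memory column is reported in Mi (for example `180Mi`), which is the usual case but not guaranteed. The function name `pods_over_mi` is made up for illustration.

```shell
# Hypothetical helper: read `kubectl top pods --no-headers` output on stdin
# (columns: NAME CPU MEMORY) and print pods using more than the given Mi.
pods_over_mi() {
  awk -v threshold="$1" '{
    mem = $3
    sub(/Mi$/, "", mem)                 # "300Mi" -> "300"
    if (mem + 0 > threshold + 0) print $1, $3
  }'
}

# Example:
#   kubectl top pods -n production --no-headers | pods_over_mi 256
```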

Find the right baseline and set limits accordingly:

yaml
resources:
  requests:
    memory: "128Mi"    # what app uses at idle
    cpu: "50m"
  limits:
    memory: "384Mi"    # 3x request — gives headroom for spikes
    cpu: "300m"

Rollback If You're in a Rush

If production is down, rollback immediately:

bash
# Rollback to previous deployment revision
kubectl rollout undo deployment/my-app -n production
 
# Verify rollback
kubectl rollout status deployment/my-app -n production

Then fix the limits properly before re-deploying.
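
When you re-deploy, the same rollback can be wired in as a safety net. The sketch below waits for the rollout and undoes it if the pods do not become ready in time. The function name `rollout_or_undo` and the 120-second default are assumptions; pick a timeout longer than your app's normal startup.

```shell
# Hypothetical helper: wait for a rollout, roll back if it does not complete.
# Arguments: deployment name, namespace, optional timeout (default 120s)
rollout_or_undo() {
  deployment="$1"
  namespace="$2"
  timeout="${3:-120s}"
  if kubectl rollout status "deployment/$deployment" -n "$namespace" --timeout="$timeout"; then
    echo "rollout ok: $deployment"
  else
    kubectl rollout undo "deployment/$deployment" -n "$namespace"
    echo "rolled back: $deployment"
    return 1
  fi
}

# Example:
#   kubectl apply -f deployment.yaml && rollout_or_undo my-app production 180s
```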


Set VPA to Find the Right Limits Automatically

Instead of guessing, use the Vertical Pod Autoscaler in recommendation mode:

bash
# Install VPA (requires metrics-server); the project installs via its setup script
git clone https://github.com/kubernetes/autoscaler.git
(cd autoscaler/vertical-pod-autoscaler && ./hack/vpa-up.sh)
 
# Create VPA object in Off mode (recommendation only)
cat <<EOF | kubectl apply -f -
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Off"   # Only recommend, don't auto-apply
EOF
 
# After 24 hours, check recommendations
kubectl describe vpa my-app-vpa
# Containers:
#   Container recommendations:
#     Container Name: my-app
#     Lower Bound:  memory: 128Mi, cpu: 50m
#     Target:       memory: 256Mi, cpu: 100m
#     Upper Bound:  memory: 512Mi, cpu: 200m

Use the Target values as your requests and Upper Bound as your limits.
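
That mapping can be sketched as a small shell function that prints a resources block from the four recommendation values. The function name `render_resources` is hypothetical. The jsonpath in the comment assumes a single-container pod and a VPA that has already produced a recommendation.

```shell
# Hypothetical helper: print a resources block from VPA recommendation values.
# Arguments: target CPU, target memory, upper-bound CPU, upper-bound memory
# One way to read a value from the VPA object:
#   kubectl get vpa my-app-vpa \
#     -o jsonpath='{.status.recommendation.containerRecommendations[0].target.memory}'
render_resources() {
  cat <<EOF
resources:
  requests:
    cpu: "$1"
    memory: "$2"
  limits:
    cpu: "$3"
    memory: "$4"
EOF
}
```

Using the sample recommendation above, `render_resources 100m 256Mi 200m 512Mi` prints a block you can paste into the container spec.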


Quick Reference

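| Exit Code | Meaning | Fix |
|-----------|---------|-----|
| 137 | SIGKILL (usually OOMKilled) | Increase memory limit |
| 143 | Terminated by SIGTERM | Check preStop hook and terminationGracePeriodSeconds |
| 1 | App crash | Check kubectl logs --previous |
| 0 | App exited cleanly | Keep the main process in the foreground; review the liveness probe |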

Always test limit changes in staging first. With a staged rollout (maxUnavailable: 1), pods are replaced one at a time; if the first pod with the new limits crashes and never becomes Ready, the rollout stalls and the remaining pods keep serving traffic.

Practice Kubernetes troubleshooting in a real cluster — KodeKloud has hands-on labs where you fix broken pods under time pressure, exactly like production incidents.
