Why Your Docker Container Keeps Restarting (and How to Fix It)
CrashLoopBackOff, OOMKilled, exit code 1, exit code 137 — Docker containers restart for specific, diagnosable reasons. Here is how to identify the exact cause and fix it in minutes.
You deploy your container. It starts. Then it restarts. Then it restarts again. If you've been in DevOps for more than a week, you've seen this. The frustrating part isn't that containers crash — it's that the error messages feel cryptic until you know what they actually mean.
The good news: every restart has a reason, and every reason is diagnosable. This guide breaks down the most common causes, shows you exactly how to find which one is hitting you, and gives you the precise fix for each.
Why Containers Restart in the First Place
Docker containers are designed to run a single process. When that process exits — for any reason — the container stops. If your restart policy is set to always or on-failure, Docker immediately tries again. If the process keeps failing, you get an infinite restart loop.
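While you debug, it often helps to cap or disable restarts instead of letting the loop run forever. A docker-compose sketch (the service and image names are placeholders):

```yaml
services:
  app:
    image: myapp:latest
    # "no" disables restarts entirely while you debug;
    # on-failure restarts only when the process exits non-zero
    restart: on-failure
```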
In Kubernetes, this surfaces as CrashLoopBackOff, which adds exponential backoff delays between restarts (starting at 10 seconds, doubling each time up to 5 minutes). It's Kubernetes protecting the cluster from a runaway process hammering resources.
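You can see the shape of that backoff schedule with a quick sketch (this models the doubling-with-cap logic described above, not Kubernetes' actual implementation):

```shell
# CrashLoopBackOff-style delay schedule: start at 10s, double each restart, cap at 300s
delay=10
for attempt in 1 2 3 4 5 6 7; do
  echo "restart $attempt: wait ${delay}s"
  delay=$((delay * 2))
  [ "$delay" -gt 300 ] && delay=300
done
```

After a handful of restarts the delay pins at the 5-minute cap, which is why a pod in CrashLoopBackOff can sit "doing nothing" for minutes at a time.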
Understanding the exit code is your first diagnostic step. That number tells you a lot.
Exit Code Cheat Sheet
| Exit Code | Meaning |
|---|---|
| 0 | Clean exit — process finished successfully (restarts only if the restart policy forces it) |
| 1 | General application error — your app crashed |
| 137 | SIGKILL (128 + 9) — usually OOMKilled: the kernel's out-of-memory killer terminated the container |
| 139 | SIGSEGV (128 + 11) — segmentation fault |
| 143 | SIGTERM (128 + 15) — the process received a termination signal and shut down |
Check the exit code with:
```shell
docker inspect <container_id> --format='{{.State.ExitCode}}'
```

Cause 1: Application Crash (Exit Code 1)
This is the most common cause. Your application threw an unhandled exception, couldn't find a required file, or encountered a startup error.
How to diagnose:
```shell
# For Docker
docker logs <container_name>

# For Kubernetes — check current logs
kubectl logs <pod-name> -n <namespace>

# CRITICAL: check logs from the PREVIOUS crashed container
kubectl logs <pod-name> -n <namespace> --previous
```

The --previous flag is the one people forget. When a pod is in CrashLoopBackOff, kubectl logs shows the current container, which is often empty because it just restarted. You need --previous to see what actually went wrong.
What to look for: Stack traces, Error: Cannot find module, Connection refused, ENOENT, database connection failures.
Fix: Read the error, fix the code or configuration. Most of the time it's a missing env var or a failed dependency connection on startup.
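Since missing configuration is such a common startup failure, a fail-fast guard at the top of your entrypoint turns a cryptic crash loop into an obvious error message. A minimal sketch (the variable names are examples):

```shell
# Fail fast with a clear message if required env vars are missing
require_env() {
  for var in "$@"; do
    eval "value=\${$var:-}"
    if [ -z "$value" ]; then
      echo "FATAL: required environment variable $var is not set" >&2
      return 1
    fi
  done
}

# Example: succeeds only when every listed variable is set
DATABASE_URL="postgres://db:5432/app"
API_KEY="example-key"
require_env DATABASE_URL API_KEY && echo "env OK"

unset API_KEY
require_env DATABASE_URL API_KEY || echo "refusing to start"
```

In a real entrypoint you would call require_env before exec-ing the application, so the logs show exactly which variable is missing instead of a stack trace three layers deep.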
Cause 2: Missing Environment Variables
Your app expects DATABASE_URL or API_KEY. You forgot to set it. The app throws an error on boot and exits with code 1.
How to diagnose:
```shell
# See what env vars the running container has
docker exec <container> env

# Or inspect
docker inspect <container> --format='{{.Config.Env}}'
```

In Kubernetes:

```shell
kubectl describe pod <pod-name> -n <namespace>
```

Look at the Environment: section in the output.
Fix: Add the missing variable to your docker run -e flags, docker-compose.yml env section, or Kubernetes env / envFrom fields in your deployment manifest.
```yaml
# Kubernetes example
env:
  - name: DATABASE_URL
    valueFrom:
      secretKeyRef:
        name: app-secrets
        key: database-url
```

Cause 3: OOMKilled — Exit Code 137
Your container ran out of memory. The Linux kernel OOM killer stepped in and killed the process with SIGKILL. This is exit code 137 (128 + 9).
This is sneaky because the container looks like it just crashed — the logs often show nothing unusual right before the kill.
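The 128 + 9 arithmetic is easy to verify yourself: a process killed by signal N exits with code 128 + N, and SIGKILL is signal 9.

```shell
# A throwaway shell that kills itself with SIGKILL (signal 9)
sh -c 'kill -9 $$' || code=$?
echo "exit code: $code"   # prints: exit code: 137
```

The same convention explains 139 (SIGSEGV is signal 11) and 143 (SIGTERM is signal 15).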
How to diagnose:
```shell
# Docker — look for OOMKilled: true
docker inspect <container> --format='{{.State.OOMKilled}}'

# Kubernetes
kubectl describe pod <pod-name> -n <namespace>
```

In the kubectl describe output, look for:

```
Last State:  Terminated
  Reason:    OOMKilled
  Exit Code: 137
```
Fix: You have two options. First, increase the memory limit in your container spec. Second (and better long-term), profile your application to find the memory leak.
```yaml
# Kubernetes resource limits
resources:
  requests:
    memory: "256Mi"
  limits:
    memory: "512Mi"
```

Start with a reasonable limit and monitor actual usage with:

```shell
kubectl top pod <pod-name> -n <namespace>
```

Cause 4: Wrong Entrypoint or CMD
Your Dockerfile has a CMD that points to a script that doesn't exist, has wrong permissions, or has Windows-style line endings (\r\n) that Linux can't execute.
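The line-endings failure is easy to reproduce: the kernel treats the trailing \r as part of the interpreter path in the shebang, so the interpreter "doesn't exist". A sketch using a throwaway file:

```shell
# Write a script with Windows (CRLF) line endings and try to execute it
printf '#!/bin/sh\r\necho hello\r\n' > /tmp/crlf-demo.sh
chmod +x /tmp/crlf-demo.sh
/tmp/crlf-demo.sh || echo "exec failed with code $?"   # typically 127: interpreter not found
```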
How to diagnose:
```shell
# Run the container interactively to test
docker run -it --entrypoint /bin/sh <image-name>

# Then manually run your entrypoint script
/app/start.sh
```

This immediately reveals permission errors or "file not found" issues.
Common culprits:
```shell
# Wrong line endings (Windows developers, this is usually you)
file start.sh
# Output: start.sh: ASCII text, with CRLF line terminators  ← problem

# Fix line endings
sed -i 's/\r//' start.sh

# Wrong permissions
chmod +x start.sh
```

Cause 5: Health Check Failures
If you've configured a health check and it consistently fails, Docker will mark the container as unhealthy. In Kubernetes, a failing livenessProbe triggers automatic restarts.
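For reference, a Docker health check is defined with a HEALTHCHECK instruction in the Dockerfile. A sketch (the port, path, and availability of curl in the image are assumptions):

```dockerfile
# Marks the container unhealthy after 3 consecutive failed checks
HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
  CMD curl -f http://localhost:8080/health || exit 1
```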
How to diagnose:
```shell
# Docker health status
docker inspect <container> --format='{{.State.Health.Status}}'
docker inspect <container> --format='{{json .State.Health.Log}}' | jq
```

In Kubernetes:

```shell
kubectl describe pod <pod-name>
```

Look for events like:

```
Liveness probe failed: HTTP probe failed with statuscode: 503
```
Fix: Test your health check endpoint manually first:
```shell
curl -v http://localhost:8080/health
```

If the endpoint isn't ready when the probe fires, increase initialDelaySeconds:
```yaml
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 30   # Give app time to boot
  periodSeconds: 10
  failureThreshold: 3
```

Cause 6: CrashLoopBackOff in Kubernetes
CrashLoopBackOff isn't a cause — it's a symptom. Kubernetes applies it when a container keeps failing. But you can get more detail:
```shell
# Full event history for the pod
kubectl describe pod <pod-name> -n <namespace>

# Events at the namespace level
kubectl get events -n <namespace> --sort-by='.lastTimestamp'

# Check which container in a multi-container pod is crashing
kubectl get pod <pod-name> -n <namespace> -o jsonpath='{.status.containerStatuses[*].name}'
```

The Events: section at the bottom of kubectl describe is gold. It tells you why the container is being killed.
Debugging Checklist
When a container keeps restarting, run through this in order:
1. Get the exit code — `docker inspect` or `kubectl describe`
2. Read the previous logs — `kubectl logs --previous` or `docker logs`
3. Check for OOMKilled — `docker inspect` for `OOMKilled: true`
4. Verify environment variables — are all required vars present?
5. Test the entrypoint manually — `docker run -it --entrypoint /bin/sh`
6. Check health probe timing — increase `initialDelaySeconds` if the app is slow to boot
7. Check resource limits — are CPU/memory limits too aggressive?
8. Look at Kubernetes events — `kubectl get events --sort-by='.lastTimestamp'`
Most container restarts are solved in steps 1–4. The rest cover the edge cases.
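The exit-code cheat sheet at the top condenses nicely into a helper you can keep in your shell profile. A sketch:

```shell
# Map a container exit code to a first diagnosis, per the cheat sheet above
explain_exit() {
  case "$1" in
    0)   echo "clean exit - check the restart policy" ;;
    137) echo "SIGKILL (128+9) - check for OOMKilled" ;;
    139) echo "SIGSEGV (128+11) - segmentation fault" ;;
    143) echo "SIGTERM (128+15) - graceful shutdown" ;;
    *)   echo "application error - read the logs" ;;
  esac
}

# Example: feed it the code from docker inspect
explain_exit 137   # prints: SIGKILL (128+9) - check for OOMKilled
```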
Keep Learning
Debugging containers is a core skill that separates junior engineers from senior ones. If you want to go deeper on Docker, Kubernetes troubleshooting, and production-grade container workflows, KodeKloud has hands-on labs that let you break things in a safe environment and practice fixing them — exactly the kind of muscle memory that makes these diagnoses second nature.
The next time a container restarts, you won't panic. You'll reach for kubectl logs --previous and know exactly what you're looking for.