Kubernetes StatefulSet Pods Stuck on Init or Pending: Fix It Now
StatefulSet pods stuck in Init, Pending, or CrashLoopBackOff state? Here's every cause and the exact fix — PVC ordering, headless service, and init container issues.
StatefulSets are harder to debug than Deployments. They have strict pod ordering, persistent volume claims per replica, and a headless service dependency. When a StatefulSet pod gets stuck, the cause is almost always one of five things.
Here's every scenario and the exact fix.
Why StatefulSets Are Different
Unlike Deployments, StatefulSets:
- Create pods in order: pod-0 must be Running before pod-1 starts
- Each pod gets its own PVC (data-myapp-0, data-myapp-1)
- Require a headless service for stable network identity
- Don't reschedule pods to different nodes — they stay bound to their PVC
This means one stuck pod blocks the entire StatefulSet.
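To ground the fixes below, here's the minimal StatefulSet shape this article works against. It's a sketch: myapp, myapp-headless, the data volume, and the nginx image are all placeholders, not taken from any specific app:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: myapp
spec:
  serviceName: "myapp-headless"    # headless service, created in Cause 2 below
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        image: nginx:1.25          # placeholder image
        ports:
        - containerPort: 80
        volumeMounts:
        - name: data               # mounts the per-replica PVC
          mountPath: /data
  volumeClaimTemplates:            # yields data-myapp-0, data-myapp-1, data-myapp-2
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi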
Step 1 — Check Pod Status
kubectl get pods -n <namespace>
kubectl describe pod <statefulset-name>-0 -n <namespace>
kubectl get events -n <namespace> --sort-by='.lastTimestamp' | tail -20

Cause 1: PVC Stuck in Pending
Symptom: Pod in Pending state. Events show waiting for a volume to be created
kubectl get pvc -n <namespace>
# You'll see: data-myapp-0   Pending

Causes:
- No StorageClass defined
- StorageClass provisioner not available
- Wrong StorageClass name in the StatefulSet spec
Fix:
# Check available StorageClasses
kubectl get storageclass
# Check if default StorageClass exists
kubectl get storageclass | grep default
# Fix: annotate a StorageClass as default
kubectl patch storageclass <name> -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

If you're using a specific StorageClass in the StatefulSet:
volumeClaimTemplates:
- metadata:
    name: data
  spec:
    accessModes: ["ReadWriteOnce"]
    storageClassName: "gp2"   # must match exactly
    resources:
      requests:
        storage: 10Gi
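Once the StorageClass is fixed, the Pending PVC should bind. A quick way to confirm (press Ctrl-C once STATUS shows Bound; the output below is illustrative):

kubectl get pvc -n <namespace> -w
# NAME           STATUS   VOLUME       CAPACITY   STORAGECLASS
# data-myapp-0   Bound    pvc-abc123   10Gi       gp2

One caveat: if the StorageClass uses volumeBindingMode: WaitForFirstConsumer, the PVC intentionally stays Pending until its pod is scheduled, so check the pod's events before concluding the provisioner is broken.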
Cause 2: Headless Service Missing or Misconfigured

StatefulSets need a headless service (clusterIP: None) for DNS-based pod discovery.
Symptom: Pod stuck in Pending or DNS resolution fails inside pods
kubectl get svc -n <namespace>
# Check if headless service exists with clusterIP: None

Fix — Create the headless service:
apiVersion: v1
kind: Service
metadata:
  name: myapp-headless
  namespace: <namespace>
spec:
  clusterIP: None   # This is what makes it headless
  selector:
    app: myapp
  ports:
  - port: 80
    name: web

In your StatefulSet, reference it:
spec:
  serviceName: "myapp-headless"   # must match the service name
  replicas: 3
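To confirm the headless service actually resolves, query a pod's stable DNS name from a throwaway pod (the names follow the myapp example above):

kubectl run -it --rm dns-test --image=busybox:1.36 --restart=Never -n <namespace> -- \
  nslookup myapp-0.myapp-headless.<namespace>.svc.cluster.local
# A pod IP back means the service is wired correctly; NXDOMAIN points at a
# wrong serviceName, a missing service, or a selector that matches no pods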
Cause 3: Pod Ordering — Previous Pod Not Ready

Symptom: pod-1 stuck in Pending while pod-0 is Running but 0/1 Ready
StatefulSet won't start pod-1 until pod-0 passes its readiness probe.
kubectl describe pod myapp-0 -n <namespace>
# Look for: Readiness probe failed

Fix — Check and fix readiness probe:
readinessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 30   # Increase if app starts slowly
  periodSeconds: 10
  failureThreshold: 5

Or temporarily patch to use a more lenient probe:
kubectl patch statefulset myapp -n <namespace> --type=json \
  -p='[{"op":"replace","path":"/spec/template/spec/containers/0/readinessProbe/initialDelaySeconds","value":60}]'
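Patching the pod template kicks off a rolling update, which StatefulSets perform one pod at a time from the highest ordinal down. Watch it converge (myapp matches the example above):

kubectl rollout status statefulset/myapp -n <namespace>
kubectl get pods -n <namespace> -w   # each pod should reach Running and 1/1 Ready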
Cause 4: Init Container Failing

Symptom: Pod stuck in Init:0/1 or Init:CrashLoopBackOff
kubectl logs myapp-0 -c <init-container-name> -n <namespace>
kubectl describe pod myapp-0 -n <namespace> | grep -A 20 "Init Containers"

Common init container failures:
- Database not ready yet (init waits for DB connection)
- Wrong environment variable
- ConfigMap or Secret not found (see the check after the fix below)
Fix — Add a proper wait loop in init container:
initContainers:
- name: wait-for-db
  image: busybox:1.36
  command:
  - sh
  - -c
  - |
    until nc -z postgres-svc 5432; do
      echo "Waiting for postgres..."
      sleep 2
    done
    echo "Postgres ready!"
Cause 5: PVC Already Bound to a Deleted Pod on a Different Node

This is the trickiest one. When you delete and recreate a StatefulSet, old PVCs might be bound to PVs on a specific node. If the pod gets scheduled to a different node, the volume can't attach.
Symptom: Multi-Attach error for volume — volume is already exclusively attached to one node
kubectl describe pod myapp-0 -n <namespace> | grep -A 5 "Warning"
# Output: Multi-Attach error for volume "pvc-abc123"Fix — Force pod to the correct node:
# Find which node the PV is on
kubectl get pv | grep myapp
# Get node affinity of the PV
kubectl describe pv <pv-name> | grep -A 5 "Node Affinity"
# Add node affinity to the StatefulSet to match
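In the StatefulSet's pod template, that looks like the sketch below. The zone key and value are examples only; substitute whatever the PV's Node Affinity section actually shows:

spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: topology.kubernetes.io/zone   # example key, copy yours from the PV
                operator: In
                values:
                - us-east-1a                       # example value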
Or delete the stuck pod and let K8s reschedule it (sometimes resolves automatically):

kubectl delete pod myapp-0 -n <namespace>
Full Diagnostic Script

Run this to get the full picture in one shot:
NS=<your-namespace>
SS=<statefulset-name>
echo "=== StatefulSet Status ==="
kubectl get statefulset $SS -n $NS
echo "=== Pods ==="
kubectl get pods -l app=$SS -n $NS -o wide
echo "=== PVCs ==="
kubectl get pvc -n $NS | grep $SS
echo "=== Recent Events ==="
kubectl get events -n $NS --sort-by='.lastTimestamp' | tail -15
echo "=== Pod-0 Description ==="
kubectl describe pod ${SS}-0 -n $NS | tail -30

Prevention Tips
- Always define resource requests and limits — prevents OOM killing during init
- Set generous initialDelaySeconds on readiness probes for stateful apps
- Use podManagementPolicy: Parallel if pod ordering doesn't matter for your app — faster scaling (see the sketch below)
- Use the Retain reclaim policy on PVs in production — prevents data loss on accidental PVC delete
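Two of those tips in YAML form. This is a sketch, not tied to any particular app; the StorageClass fields assume the AWS EBS CSI driver as an example provisioner:

# In the StatefulSet spec: start pods in parallel instead of one by one
spec:
  podManagementPolicy: Parallel

# A StorageClass whose PVs survive PVC deletion
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-retain
provisioner: ebs.csi.aws.com   # example, use your cluster's provisioner
reclaimPolicy: Retain
parameters:
  type: gp3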
If you want hands-on practice with StatefulSets and storage, KodeKloud's CKA course has dedicated labs where you debug exactly these scenarios on live clusters.