
Kubernetes StatefulSet Pods Stuck on Init or Pending: Fix It Now

StatefulSet pods stuck in Init, Pending, or CrashLoopBackOff state? Here's every common cause and the exact fix — PVC ordering, headless services, and init container issues.

DevOpsBoys · May 10, 2026 · 4 min read

StatefulSets are harder to debug than Deployments. They have strict pod ordering, persistent volume claims per replica, and a headless service dependency. When a StatefulSet pod gets stuck, the cause is almost always one of five things.

Here's every scenario and the exact fix.


Why StatefulSets Are Different

Unlike Deployments, StatefulSets:

  • Create pods in order: pod-0 must be Running and Ready before pod-1 starts
  • Give each pod its own PVC (data-myapp-0, data-myapp-1, ...)
  • Require a headless service for stable network identity
  • Keep each pod bound to its PVC, so a pod can't freely move to a node its volume can't reach

This means one stuck pod blocks the entire StatefulSet.
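To make those moving parts concrete, here's a minimal skeleton showing how they fit together — names like myapp and myapp-headless are placeholders:

yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: myapp
spec:
  serviceName: "myapp-headless"   # the headless service that gives pods stable DNS names
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: app
          image: myapp:1.0        # placeholder image
  volumeClaimTemplates:           # one PVC per replica: data-myapp-0, data-myapp-1, ...
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi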


Step 1 — Check Pod Status

bash
# Pod overview — which phase is each replica in?
kubectl get pods -n <namespace>

# Events for the stuck pod — usually names the exact blocker
kubectl describe pod <statefulset-name>-0 -n <namespace>

# Most recent events in the namespace
kubectl get events -n <namespace> --sort-by='.lastTimestamp' | tail -20

Cause 1: PVC Stuck in Pending

Symptom: Pod stuck in Pending. Events show "waiting for a volume to be created"

bash
kubectl get pvc -n <namespace>
# You'll see: data-myapp-0   Pending

Causes:

  • No StorageClass defined
  • StorageClass provisioner not available
  • Wrong StorageClass name in the StatefulSet spec

Fix:

bash
# Check available StorageClasses
kubectl get storageclass
 
# Check if default StorageClass exists
kubectl get storageclass | grep default
 
# Fix: annotate a StorageClass as default
kubectl patch storageclass <name> -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

If you're using a specific StorageClass in the StatefulSet:

yaml
volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: "gp2"  # must match exactly
      resources:
        requests:
          storage: 10Gi
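Note that volumeClaimTemplates is immutable on an existing StatefulSet, so if the storageClassName was wrong you'll need to recreate the object. A sketch, assuming your manifest lives in statefulset.yaml:

bash
# Delete the StatefulSet object but leave its pods running
kubectl delete statefulset myapp -n <namespace> --cascade=orphan

# Re-apply with the corrected storageClassName
kubectl apply -f statefulset.yaml

# A still-Pending PVC with the wrong class must be deleted so it can be recreated
kubectl delete pvc data-myapp-0 -n <namespace>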

Cause 2: Headless Service Missing or Misconfigured

StatefulSets need a headless service (clusterIP: None) for DNS-based pod discovery.

Symptom: Pod stuck in Pending or DNS resolution fails inside pods

bash
kubectl get svc -n <namespace>
# Check if headless service exists with clusterIP: None

Fix — Create the headless service:

yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp-headless
  namespace: <namespace>
spec:
  clusterIP: None          # This is what makes it headless
  selector:
    app: myapp
  ports:
    - port: 80
      name: web

In your StatefulSet, reference it:

yaml
spec:
  serviceName: "myapp-headless"   # must match the service name
  replicas: 3
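To verify DNS is actually working, resolve a pod's stable hostname from a throwaway pod — a quick check assuming the names above (note that a pod only gets its DNS record once it's Ready):

bash
kubectl run dns-test --rm -it --image=busybox:1.36 --restart=Never -n <namespace> \
  -- nslookup myapp-0.myapp-headless.<namespace>.svc.cluster.local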

Cause 3: Pod Ordering — Previous Pod Not Ready

Symptom: pod-1 stuck in Pending while pod-0 is Running but 0/1 Ready

StatefulSet won't start pod-1 until pod-0 passes its readiness probe.

bash
kubectl describe pod myapp-0 -n <namespace>
# Look for: Readiness probe failed

Fix — Check and fix readiness probe:

yaml
readinessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 30   # Increase if app starts slowly
  periodSeconds: 10
  failureThreshold: 5

Or temporarily patch to use a more lenient probe:

bash
kubectl patch statefulset myapp -n <namespace> --type=json \
  -p='[{"op":"replace","path":"/spec/template/spec/containers/0/readinessProbe/initialDelaySeconds","value":60}]'
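To confirm the probe itself is the problem, hit the endpoint from inside the container — a rough check, assuming the image ships wget and the probe path above:

bash
# Should return the health payload if the app is actually up
kubectl exec myapp-0 -n <namespace> -- wget -qO- http://localhost:8080/health

# Watch pod-0 flip to 1/1 Ready, which unblocks pod-1
kubectl get pods -n <namespace> -w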

Cause 4: Init Container Failing

Symptom: Pod stuck in Init:0/1 or Init:CrashLoopBackOff

bash
kubectl logs myapp-0 -c <init-container-name> -n <namespace>
kubectl describe pod myapp-0 -n <namespace> | grep -A 20 "Init Containers"

Common init container failures:

  • Database not ready yet (init waits for DB connection)
  • Wrong environment variable
  • ConfigMap or Secret not found

Fix — Add a proper wait loop in init container:

yaml
initContainers:
  - name: wait-for-db
    image: busybox:1.36
    command:
      - sh
      - -c
      - |
        until nc -z postgres-svc 5432; do
          echo "Waiting for postgres..."
          sleep 2
        done
        echo "Postgres ready!"

Cause 5: PVC Already Bound to a Deleted Pod on a Different Node

This is the trickiest one. When you delete and recreate a StatefulSet, the old PVC stays bound to a PV that may still be attached to (or pinned by node affinity to) a specific node. If the new pod gets scheduled to a different node, the volume can't attach.

Symptom: Multi-Attach error for volume — volume is already exclusively attached to one node

bash
kubectl describe pod myapp-0 -n <namespace> | grep -A 5 "Warning"
# Output: Multi-Attach error for volume "pvc-abc123"

Fix — Force pod to the correct node:

bash
# Find which node the PV is on
kubectl get pv | grep myapp
 
# Get node affinity of the PV
kubectl describe pv <pv-name> | grep -A 5 "Node Affinity"
 
# Add node affinity to the StatefulSet to match
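
That last step would look something like this in the StatefulSet's pod template — a sketch, where <node-name> is whatever the PV's Node Affinity output showed:

yaml
spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: kubernetes.io/hostname
                    operator: In
                    values:
                      - <node-name>   # the node the PV is pinned to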

Or delete the stuck pod and let Kubernetes reschedule it — the stale attachment is sometimes detached automatically:

bash
kubectl delete pod myapp-0 -n <namespace>
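
If the error persists after deleting the pod, the cluster may be holding a stale VolumeAttachment object. Inspecting (and, as a last resort, deleting) it often clears the Multi-Attach error — but be careful: force-deleting an attachment for a volume that's genuinely in use risks data corruption:

bash
# Find the attachment still holding the volume
kubectl get volumeattachment | grep <pv-name>

# Last resort: remove the stale attachment so the volume can re-attach elsewhere
kubectl delete volumeattachment <attachment-name>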

Full Diagnostic Script

Run this to get the full picture in one shot:

bash
NS=<your-namespace>
SS=<statefulset-name>
 
echo "=== StatefulSet Status ==="
kubectl get statefulset $SS -n $NS
 
echo "=== Pods ==="
kubectl get pods -l app=$SS -n $NS -o wide
 
echo "=== PVCs ==="
kubectl get pvc -n $NS | grep $SS
 
echo "=== Recent Events ==="
kubectl get events -n $NS --sort-by='.lastTimestamp' | tail -15
 
echo "=== Pod-0 Description ==="
kubectl describe pod ${SS}-0 -n $NS | tail -30

Prevention Tips

  1. Always define resource requests and limits — prevents OOM killing during init
  2. Set generous initialDelaySeconds on readiness probes for stateful apps
  3. Use podManagementPolicy: Parallel if pod ordering doesn't matter for your app — faster scaling (see the sketch below)
  4. Use the Retain reclaim policy on PVs in production — prevents data loss on accidental PVC deletion (see the patch command below)
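
For tips 3 and 4, the changes are small. podManagementPolicy goes in the StatefulSet spec:

yaml
spec:
  podManagementPolicy: Parallel   # create/delete all pods at once instead of one ordinal at a time

And the reclaim policy can be patched on an existing PV:

bash
kubectl patch pv <pv-name> -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'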

If you want hands-on practice with StatefulSets and storage, KodeKloud's CKA course has dedicated labs where you debug exactly these scenarios on live clusters.
