
Kubernetes StatefulSet Pods Stuck on Init or Pending: Fix It Now

StatefulSet pods stuck in Init, Pending, or CrashLoopBackOff? Here is every common cause and the exact fix: PVC provisioning, the headless service, pod ordering, and init container issues.

DevOpsBoys · 4 min read

StatefulSets are harder to debug than Deployments. They have strict pod ordering, persistent volume claims per replica, and a headless service dependency. When a StatefulSet pod gets stuck, the cause is almost always one of five things.

Here's every scenario and the exact fix.


Why StatefulSets Are Different

Unlike Deployments, StatefulSets:

  • Create pods in order: pod-0 must be Running and Ready before pod-1 is created
  • Each pod gets its own PVC (data-myapp-0, data-myapp-1)
  • Require a headless service for stable network identity
  • Keep each pod bound to the same PVC across restarts, so a pod can only be rescheduled to a node where its volume can attach

This means that, with the default ordering, one stuck pod blocks every pod after it.
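
The naming rules above are what make a stuck pod traceable to a specific PVC and DNS record. A minimal sketch that prints them, assuming the default cluster domain cluster.local; the function name sts_identities and its argument order are illustrative, not part of kubectl:

```shell
# Print the stable pod, PVC, and DNS names a StatefulSet gives each replica.
# Usage: sts_identities <statefulset> <headless-service> <namespace> <claim-template> <replicas>
# Assumption: the cluster uses the default DNS domain "cluster.local".
sts_identities() {
  sts=$1; svc=$2; ns=$3; claim=$4; replicas=$5
  i=0
  while [ "$i" -lt "$replicas" ]; do
    pod="${sts}-${i}"
    # PVC name is <claim-template>-<pod>; DNS name is <pod>.<service>.<namespace>.svc.<domain>
    echo "pod=${pod} pvc=${claim}-${pod} dns=${pod}.${svc}.${ns}.svc.cluster.local"
    i=$((i + 1))
  done
}
```

For example, `sts_identities myapp myapp-headless default data 3` lists the three names to look for in `kubectl get pvc` output and in DNS lookups.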


Step 1 — Check Pod Status

bash
kubectl get pods -n <namespace>
kubectl describe pod <statefulset-name>-0 -n <namespace>
kubectl get events -n <namespace> --sort-by='.lastTimestamp' | tail -20

Cause 1: PVC Stuck in Pending

Symptom: Pod in Pending state. Events show "waiting for a volume to be created".

bash
kubectl get pvc -n <namespace>
# You'll see: data-myapp-0   Pending

Causes:

  • No default StorageClass in the cluster and none named in the claim
  • StorageClass provisioner not available
  • Wrong StorageClass name in the StatefulSet spec

Fix:

bash
# Check available StorageClasses
kubectl get storageclass
 
# Check if default StorageClass exists
kubectl get storageclass | grep default
 
# Fix: annotate a StorageClass as default
kubectl patch storageclass <name> -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

If you're using a specific StorageClass in the StatefulSet:

yaml
volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: "gp2"  # must match exactly
      resources:
        requests:
          storage: 10Gi
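
To see which claims are waiting and which StorageClass each one asks for, something like the following may help. It is a sketch: the function name pending_pvcs is made up for this example, and it relies only on kubectl's custom-columns output, which prints `<none>` for a field that is not set.

```shell
# List Pending PVCs in a namespace together with the StorageClass they request.
# Usage: pending_pvcs <namespace>
pending_pvcs() {
  kubectl get pvc -n "$1" --no-headers \
    -o custom-columns='NAME:.metadata.name,PHASE:.status.phase,CLASS:.spec.storageClassName' |
  awk '$2 == "Pending" {
    if ($3 == "" || $3 == "<none>") {
      # No class on the claim: most likely no default StorageClass existed
      print $1 ": no storageClassName set (likely no default StorageClass)"
    } else {
      print $1 ": waiting on StorageClass " $3
    }
  }'
}
```

Compare the reported class names against `kubectl get storageclass`. Note that a StorageClass with volumeBindingMode WaitForFirstConsumer keeps its PVCs Pending until the pod is scheduled, so Pending alone is not always a fault.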

Cause 2: Headless Service Missing or Misconfigured

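StatefulSets use a headless service (clusterIP: None) to give each pod a stable DNS name such as myapp-0.myapp-headless.

Symptom: Pods start, but DNS lookups for per-pod names fail inside pods. Kubernetes does not check that the service exists, so pods are not held in Pending for this reason; apps that must find their peers at startup may crash or never become Ready instead.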

bash
kubectl get svc -n <namespace>
# Check if headless service exists with clusterIP: None

Fix — Create the headless service:

yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp-headless
  namespace: <namespace>
spec:
  clusterIP: None          # This is what makes it headless
  selector:
    app: myapp
  ports:
    - port: 80
      name: web

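In your StatefulSet, reference it. spec.serviceName can't be changed on an existing StatefulSet, so fixing a wrong name means deleting and recreating the StatefulSet (by default the PVCs are kept):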

yaml
spec:
  serviceName: "myapp-headless"   # must match the service name
  replicas: 3
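
A quick way to confirm that the two objects line up is to read serviceName from the StatefulSet and then check that service's clusterIP. The sketch below does that with jsonpath queries; check_headless is a hypothetical helper name, not a kubectl command.

```shell
# Check that the Service named in spec.serviceName exists and is headless.
# Usage: check_headless <statefulset> <namespace>
check_headless() {
  svc=$(kubectl get statefulset "$1" -n "$2" -o jsonpath='{.spec.serviceName}') || return 1
  cluster_ip=$(kubectl get svc "$svc" -n "$2" -o jsonpath='{.spec.clusterIP}' 2>/dev/null) || {
    echo "Service $svc not found in namespace $2"
    return 1
  }
  if [ "$cluster_ip" = "None" ]; then
    echo "OK: $svc is headless"
  else
    echo "Service $svc has clusterIP $cluster_ip, expected None"
    return 1
  fi
}
```

Run it as `check_headless myapp <namespace>`. It does not verify that the service selector matches the pod labels, which is worth checking by hand if DNS still fails.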

Cause 3: Pod Ordering — Previous Pod Not Ready

Symptom: pod-1 never appears while pod-0 is Running but 0/1 Ready

With the default OrderedReady policy, the StatefulSet won't create pod-1 until pod-0 is Running and passes its readiness probe.

bash
kubectl describe pod myapp-0 -n <namespace>
# Look for: Readiness probe failed

Fix — Check and fix readiness probe:

yaml
readinessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 30   # Increase if app starts slowly
  periodSeconds: 10
  failureThreshold: 5

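Or temporarily patch the StatefulSet to use a more lenient probe. The controller usually won't replace a pod that is already stuck, so delete the pod afterwards to pick up the new template:

bash
# "add" sets the field whether or not it already exists; "replace" fails if it is missing
kubectl patch statefulset myapp -n <namespace> --type=json \
  -p='[{"op":"add","path":"/spec/template/spec/containers/0/readinessProbe/initialDelaySeconds","value":60}]'
kubectl delete pod myapp-0 -n <namespace>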
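
Because ordering means the lowest-ordinal unready pod is the one holding everything up, it can help to pick that pod out of a long listing. A small sketch, assuming the usual `kubectl get pods --no-headers` column layout (NAME, READY, STATUS first); first_unready is an illustrative name:

```shell
# Read `kubectl get pods --no-headers` output on stdin and print the
# lowest-ordinal pod of the given StatefulSet that is not fully Ready.
# Prints nothing if every pod is Ready.
# Usage: kubectl get pods -n <namespace> --no-headers | first_unready <statefulset>
first_unready() {
  awk -v sts="$1" '
    {
      prefix = sts "-"
      if (index($1, prefix) != 1) next          # not a pod of this StatefulSet
      ord = substr($1, length(prefix) + 1)
      if (ord !~ /^[0-9]+$/) next               # suffix is not a plain ordinal
      split($2, ready, "/")                     # READY column, e.g. "0/1"
      if (ready[1] + 0 == ready[2] + 0) next    # all containers ready
      if (best == "" || ord + 0 < best + 0) { best = ord; line = $1 " " $2 " " $3 }
    }
    END { if (best != "") print line }
  '
}
```

Ordinals are compared numerically, so myapp-2 is reported ahead of myapp-10 even though kubectl sorts them the other way round.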

Cause 4: Init Container Failing

Symptom: Pod stuck in Init:0/1 or Init:CrashLoopBackOff

bash
kubectl logs myapp-0 -c <init-container-name> -n <namespace>
kubectl describe pod myapp-0 -n <namespace> | grep -A 20 "Init Containers"

Common init container failures:

  • Database not ready yet (init waits for DB connection)
  • Wrong environment variable
  • ConfigMap or Secret not found

Fix — Add a proper wait loop in init container:

yaml
initContainers:
  - name: wait-for-db
    image: busybox:1.36
    command:
      - sh
      - -c
      - |
        until nc -z postgres-svc 5432; do
          echo "Waiting for postgres..."
          sleep 2
        done
        echo "Postgres ready!"
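
The loop above waits forever, which leaves the pod sitting in Init with little to go on. One possible variation is a bounded wait that fails after a deadline, so the init container exits with an error and the failure shows up in the pod's events and restart count. This is a sketch, not the only way to do it; wait_for_port and its default timeout are assumptions, and it relies on nc supporting -z as in the example above.

```shell
# Wait until host:port accepts TCP connections, or give up after a timeout.
# Usage: wait_for_port <host> <port> [timeout-seconds] [interval-seconds]
wait_for_port() {
  host=$1; port=$2; timeout=${3:-120}; interval=${4:-2}
  elapsed=0
  until nc -z "$host" "$port"; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "Timed out after ${elapsed}s waiting for $host:$port" >&2
      return 1
    fi
    echo "Waiting for $host:$port..."
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "$host:$port is reachable"
}
```

Inside the init container's `sh -c` script this would be called as `wait_for_port postgres-svc 5432 300`. The elapsed time counts only the sleeps, so the real deadline is slightly longer than the timeout value.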

Cause 5: Volume Still Attached to a Different Node

This is the trickiest one. When you delete and recreate a StatefulSet, the old PVCs are kept and stay bound to their PVs. If a ReadWriteOnce volume is still attached to the node where the old pod ran and the new pod is scheduled to a different node, the volume can't attach.

Symptom: Multi-Attach error for volume: the volume is already exclusively attached to one node

bash
kubectl describe pod myapp-0 -n <namespace> | grep -A 5 "Warning"
# Output: Multi-Attach error for volume "pvc-abc123"

Fix — Force pod to the correct node:

bash
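# Find the PV bound to the pod's PVC
kubectl get pvc data-myapp-0 -n <namespace> -o jsonpath='{.spec.volumeName}'
 
# See which node still holds the attachment (CSI volumes)
kubectl get volumeattachment | grep <pv-name>
 
# Check whether the PV is pinned to a node or zone
kubectl describe pv <pv-name> | grep -A 5 "Node Affinity"
 
# If it is pinned, add a matching nodeAffinity under
# spec.template.spec.affinity in the StatefulSet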

Or delete the stuck pod and let the StatefulSet controller recreate it. Once the old attachment has been released, the new pod can usually attach the volume:

bash
kubectl delete pod myapp-0 -n <namespace>
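
To confirm that this is really a node mismatch, it may help to put the pod's node next to the node that holds the volume attachment. The sketch below assumes a CSI-backed volume, since only those have VolumeAttachment objects; volume_location is a hypothetical helper name.

```shell
# Compare the node a PVC's volume is attached to with the node the pod landed on.
# Usage: volume_location <pvc> <pod> <namespace>
volume_location() {
  pv=$(kubectl get pvc "$1" -n "$3" -o jsonpath='{.spec.volumeName}') || return 1
  pod_node=$(kubectl get pod "$2" -n "$3" -o jsonpath='{.spec.nodeName}') || return 1
  # VolumeAttachment objects are cluster-scoped; join node names with commas
  attached=$(kubectl get volumeattachment --no-headers \
    -o custom-columns='PV:.spec.source.persistentVolumeName,NODE:.spec.nodeName' |
    awk -v pv="$pv" '$1 == pv { printf "%s%s", sep, $2; sep = "," }')
  echo "pv=$pv pod_node=${pod_node:-<unscheduled>} attached_to=${attached:-<none>}"
}
```

If `pod_node` and `attached_to` differ, the volume is still held by the old node; if `attached_to` is `<none>`, the attachment has already been released and the error should clear on the next retry.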

Full Diagnostic Script

Run this to get the full picture in one shot:

bash
# Quote the placeholders: an unquoted <...> is parsed by the shell as a redirection
NS="<your-namespace>"
SS="<statefulset-name>"
 
echo "=== StatefulSet Status ==="
kubectl get statefulset "$SS" -n "$NS"
 
echo "=== Pods ==="
# Assumes the pods carry the label app=<statefulset-name>; adjust the selector if not
kubectl get pods -l "app=$SS" -n "$NS" -o wide
 
echo "=== PVCs ==="
kubectl get pvc -n "$NS" | grep "$SS"
 
echo "=== Recent Events ==="
kubectl get events -n "$NS" --sort-by='.lastTimestamp' | tail -15
 
echo "=== Pod-0 Description ==="
kubectl describe pod "${SS}-0" -n "$NS" | tail -30

Prevention Tips

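  1. Always define resource requests and limits so pods land on nodes with enough capacity and are less likely to be OOM-killed during startup
  2. Set generous initialDelaySeconds on readiness probes for stateful apps
  3. Use podManagementPolicy: Parallel if pod ordering doesn't matter for your app. Pods are created without waiting for each other, so scaling is faster. The field can only be set when the StatefulSet is created
  4. Use the Retain reclaim policy on PVs in production so the data survives an accidental PVC delete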
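
For the last tip, dynamically provisioned PVs often default to the Delete policy, so it can be worth checking existing volumes. The sketch below only prints the patch commands rather than running them; retain_commands is an illustrative name, and the match on claim names assumes the standard `<template>-<statefulset>-<ordinal>` pattern.

```shell
# Print a patch command for every PV of a StatefulSet whose reclaim policy is not Retain.
# Usage: retain_commands <statefulset> <namespace>
retain_commands() {
  kubectl get pv --no-headers \
    -o custom-columns='NAME:.metadata.name,POLICY:.spec.persistentVolumeReclaimPolicy,NS:.spec.claimRef.namespace,CLAIM:.spec.claimRef.name' |
  awk -v sts="$1" -v ns="$2" '
    $2 != "Retain" && $3 == ns && $4 ~ ("-" sts "-[0-9]+$") {
      print "kubectl patch pv " $1 " -p '\''{\"spec\":{\"persistentVolumeReclaimPolicy\":\"Retain\"}}'\''"
    }'
}
```

Review the printed commands before running them. With Retain, a PV whose PVC is deleted moves to Released and has to be cleaned up or re-bound by hand.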

