Kubernetes StatefulSet Pods Stuck on Init or Pending: Fix It Now

StatefulSet pods stuck in Init, Pending, or CrashLoopBackOff? Here are the five most common causes and the exact fix for each: PVC provisioning, headless service, pod ordering, init container, and volume attachment issues.

StatefulSets are harder to debug than Deployments. They have strict pod ordering, persistent volume claims per replica, and a headless service dependency. When a StatefulSet pod gets stuck, the cause is almost always one of five things.

Here's every scenario and the exact fix.

Why StatefulSets Are Different

Unlike Deployments, StatefulSets:

Create pods in order: pod-0 must be Running and Ready before pod-1 is created
Give each pod its own PVC (data-myapp-0, data-myapp-1)
Require a headless service for stable network identity
Keep each pod bound to the same PVC across restarts, so a pod can only run on a node where its volume can be attached

With the default OrderedReady pod management policy, this means one stuck pod blocks every pod that comes after it.
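
The naming rules behind these points are worth having at hand, because every command later in this guide uses them. The following is a minimal sketch that prints the pod name, PVC name, and per-pod DNS name for each replica. The function name `statefulset_names` and the `CLUSTER_DOMAIN` variable are made up for illustration, and `cluster.local` is assumed as the cluster domain (the usual default, but configurable per cluster).

```shell
# Print the names Kubernetes derives for each replica of a StatefulSet.
# Args: StatefulSet name, volumeClaimTemplate name, headless service name,
#       namespace, replica count.
statefulset_names() {
  local sts="$1" claim="$2" svc="$3" ns="$4" replicas="$5"
  # Assumption: the cluster uses the default DNS domain unless overridden.
  local domain="${CLUSTER_DOMAIN:-cluster.local}"
  local i=0
  while [ "$i" -lt "$replicas" ]; do
    # pod: <statefulset>-<ordinal>
    # PVC: <claim-template>-<statefulset>-<ordinal>
    # DNS: <pod>.<service>.<namespace>.svc.<cluster-domain>
    printf '%s %s %s\n' \
      "${sts}-${i}" \
      "${claim}-${sts}-${i}" \
      "${sts}-${i}.${svc}.${ns}.svc.${domain}"
    i=$((i + 1))
  done
}
```

For example, `statefulset_names myapp data myapp-headless prod 3` lists myapp-0 through myapp-2 with their data-myapp-N claims, which are the names to look for in `kubectl get pods` and `kubectl get pvc`.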

Step 1 — Check Pod Status

bash

kubectl get pods -n <namespace>
kubectl describe pod <statefulset-name>-0 -n <namespace>
kubectl get events -n <namespace> --sort-by='.lastTimestamp' | tail -20
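
Because pods are created in ordinal order, the pod to investigate is usually the lowest-numbered one that is not fully ready. As a rough sketch, the helper below scans `kubectl get pods` output for that pod. The function name `first_unready_pod` is hypothetical, and it assumes the default column layout (NAME, READY, STATUS).

```shell
# Print "<pod> <status> <ready>" for the lowest-ordinal pod of a StatefulSet
# that is not Running with all containers ready. Prints nothing if all are fine.
# Args: StatefulSet name, namespace.
first_unready_pod() {
  local sts="$1" ns="$2"
  kubectl get pods -n "$ns" --no-headers \
    | awk -v sts="$sts" '
        # Only consider pods named <statefulset>-<ordinal>.
        $1 ~ ("^" sts "-[0-9]+$") {
          ord = $1; sub(/^.*-/, "", ord); ord += 0
          split($2, ready, "/")
          if (ready[1] != ready[2] || $3 != "Running") {
            if (best == "" || ord < best) {
              best = ord
              line = $1 " " $3 " " $2
            }
          }
        }
        END { if (line != "") print line }'
}
```

Running `first_unready_pod myapp <namespace>` and then `kubectl describe pod` on the pod it reports narrows the search to one of the causes below.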

Cause 1: PVC Stuck in Pending

Symptom: Pod in Pending state. Events show waiting for a volume to be created

bash

kubectl get pvc -n <namespace>
# You'll see: data-myapp-0   Pending

Causes:

No StorageClass defined
StorageClass provisioner not available
Wrong StorageClass name in the StatefulSet spec

Fix:

bash

# Check available StorageClasses
kubectl get storageclass
 
# Check if default StorageClass exists
kubectl get storageclass | grep default
 
# Fix: annotate a StorageClass as default
kubectl patch storageclass <name> -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

If you're using a specific StorageClass in the StatefulSet:

yaml

volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: "gp2"  # must match exactly
      resources:
        requests:
          storage: 10Gi
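
To confirm a name mismatch quickly, you can compare what the Pending claim asks for with what the cluster offers. This is a small sketch under the assumption that the claim follows the data-myapp-0 naming shown above. The function name `check_pvc_storageclass` is made up for illustration.

```shell
# Report whether the StorageClass requested by a PVC exists in the cluster.
# Args: namespace, PVC name. Returns 0 if it exists, 1 if not, 2 on lookup error.
check_pvc_storageclass() {
  local ns="$1" pvc="$2" sc
  sc=$(kubectl get pvc "$pvc" -n "$ns" -o jsonpath='{.spec.storageClassName}') || return 2
  if [ -z "$sc" ]; then
    echo "$pvc: no storageClassName set; it depends on a default StorageClass"
    return 1
  fi
  if kubectl get storageclass "$sc" >/dev/null 2>&1; then
    echo "$pvc: StorageClass '$sc' exists"
  else
    echo "$pvc: StorageClass '$sc' not found"
    return 1
  fi
}
```

Usage: `check_pvc_storageclass <namespace> data-myapp-0`. Note that a Pending PVC keeps the storageClassName it was created with, so after correcting the template you will likely need to delete that PVC (and the pod using it) so both are recreated.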

Cause 2: Headless Service Missing or Misconfigured

StatefulSets need a headless service (clusterIP: None) for DNS-based pod discovery.

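Symptom: Per-pod DNS names such as myapp-0.myapp-headless fail to resolve inside pods, so replicas can't find each other and the app or its init containers hang or crash. A missing service does not usually keep pods in Pending by itself.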

bash

kubectl get svc -n <namespace>
# Check if headless service exists with clusterIP: None

Fix — Create the headless service:

yaml

apiVersion: v1
kind: Service
metadata:
  name: myapp-headless
  namespace: <namespace>
spec:
  clusterIP: None          # This is what makes it headless
  selector:
    app: myapp
  ports:
    - port: 80
      name: web

In your StatefulSet, reference it:

yaml

spec:
  serviceName: "myapp-headless"   # must match the service name
  replicas: 3
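
Both halves of this check (the serviceName reference and the clusterIP: None setting) can be verified in one go. The following is a sketch, not a definitive tool. The function name `check_headless_service` is hypothetical.

```shell
# Verify that the service named in a StatefulSet's serviceName exists
# and is headless (clusterIP: None).
# Args: namespace, StatefulSet name.
check_headless_service() {
  local ns="$1" sts="$2" svc ip
  svc=$(kubectl get statefulset "$sts" -n "$ns" -o jsonpath='{.spec.serviceName}') || return 2
  if ! ip=$(kubectl get service "$svc" -n "$ns" -o jsonpath='{.spec.clusterIP}' 2>/dev/null); then
    echo "service '$svc' not found in namespace '$ns'"
    return 1
  fi
  if [ "$ip" = "None" ]; then
    echo "service '$svc' is headless"
  else
    echo "service '$svc' has clusterIP $ip, so it is not headless"
    return 1
  fi
}
```

Usage: `check_headless_service <namespace> myapp`. If the service exists and is headless but DNS still fails, the next thing to compare is the service selector against the pod labels.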

Cause 3: Pod Ordering — Previous Pod Not Ready

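Symptom: pod-0 is Running but shows 0/1 Ready, and pod-1 never appears in kubectl get pods

With the default OrderedReady policy, the StatefulSet controller won't create pod-1 until pod-0 is Running and Ready, which means it has to pass its readiness probe first.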

bash

kubectl describe pod myapp-0 -n <namespace>
# Look for: Readiness probe failed

Fix — Check and fix readiness probe:

yaml

readinessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 30   # Increase if app starts slowly
  periodSeconds: 10
  failureThreshold: 5

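Or temporarily patch the StatefulSet to use a more lenient probe. The replace operation below only works if initialDelaySeconds is already set in the template, and a pod that is already stuck may have to be deleted before it is recreated with the new settings: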

bash

kubectl patch statefulset myapp -n <namespace> --type=json \
  -p='[{"op":"replace","path":"/spec/template/spec/containers/0/readinessProbe/initialDelaySeconds","value":60}]'

Cause 4: Init Container Failing

Symptom: Pod stuck in Init:0/1 or Init:CrashLoopBackOff

bash

kubectl logs myapp-0 -c <init-container-name> -n <namespace>
kubectl describe pod myapp-0 -n <namespace> | grep -A 20 "Init Containers"

Common init container failures:

Database not ready yet (init waits for DB connection)
Wrong environment variable
ConfigMap or Secret not found

Fix — Add a proper wait loop in init container:

yaml

initContainers:
  - name: wait-for-db
    image: busybox:1.36
    command:
      - sh
      - -c
      - |
        until nc -z postgres-svc 5432; do
          echo "Waiting for postgres..."
          sleep 2
        done
        echo "Postgres ready!"
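
The loop above waits forever, which keeps the pod in Init with no clear failure if the database never comes up. One alternative is a bounded retry, so the init container exits with an error and the reason shows up in its logs. This is a minimal sketch. The function name `retry_until` and the attempt and delay values are illustrative choices, not something the original setup prescribes.

```shell
# Retry a command up to <attempts> times, sleeping <delay> seconds between tries.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
# Args: attempts, delay in seconds, then the command to run.
retry_until() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while ! "$@"; do
    if [ "$i" -ge "$attempts" ]; then
      echo "giving up after $i attempts: $*" >&2
      return 1
    fi
    echo "attempt $i failed, retrying in ${delay}s..." >&2
    i=$((i + 1))
    sleep "$delay"
  done
}

# Example: same check as the wait loop, but fails after roughly 5 minutes.
# retry_until 150 2 nc -z postgres-svc 5432
```

With a bounded wait, a missing database surfaces as a failed init container with a "giving up" message in `kubectl logs`, instead of a pod that sits in Init indefinitely.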

Cause 5: Volume Still Attached to a Different Node

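This is the trickiest one. When you delete and recreate a StatefulSet, the old PVCs and their PVs are kept, and a ReadWriteOnce volume may still be attached to the node where the previous pod ran. If the new pod is scheduled to a different node, the volume can't attach there until the old attachment is released.

Symptom: Multi-Attach error for volume, with the message that the volume is already exclusively attached to one node and can't be attached to another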

bash

kubectl describe pod myapp-0 -n <namespace> | grep -A 5 "Warning"
# Output: Multi-Attach error for volume "pvc-abc123"

Fix — Force pod to the correct node:

bash

# Find which node the PV is on
kubectl get pv | grep myapp
 
# Get node affinity of the PV
kubectl describe pv <pv-name> | grep -A 5 "Node Affinity"
 
# Add node affinity to the StatefulSet to match

Or delete the stuck pod and let K8s reschedule it (sometimes resolves automatically):

bash

kubectl delete pod myapp-0 -n <namespace>
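
Before deleting anything, it can help to see which node the cluster believes the volume is attached to. The sketch below looks up the PV behind a claim and prints its VolumeAttachment, if one exists. It assumes the volume is provisioned by a CSI driver (other volume types may not create VolumeAttachment objects), and the function name `show_volume_attachment` is made up for illustration.

```shell
# Print "<pv> <node> <attached>" for the PV bound to a PVC.
# Args: namespace, PVC name. Returns 1 if the PVC is unbound or no
# VolumeAttachment references its PV.
show_volume_attachment() {
  local ns="$1" pvc="$2" pv
  pv=$(kubectl get pvc "$pvc" -n "$ns" -o jsonpath='{.spec.volumeName}') || return 2
  if [ -z "$pv" ]; then
    echo "$pvc is not bound to a PV yet"
    return 1
  fi
  kubectl get volumeattachment --no-headers \
    -o custom-columns='PV:.spec.source.persistentVolumeName,NODE:.spec.nodeName,ATTACHED:.status.attached' \
    | awk -v pv="$pv" '$1 == pv { print; found = 1 } END { exit !found }'
}
```

Compare the node it reports with the NODE column of `kubectl get pod myapp-0 -o wide`. If they differ and no pod on the old node uses the volume any more, the attachment is usually released on its own after a few minutes.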

Full Diagnostic Script

Run this to get the full picture in one shot:

bash

NS="your-namespace"      # replace with your namespace
SS="your-statefulset"    # replace with your StatefulSet name

echo "=== StatefulSet Status ==="
kubectl get statefulset "$SS" -n "$NS"

echo "=== Pods ==="
# Assumes the pods carry the label app=<statefulset-name>; adjust the selector if not
kubectl get pods -l "app=$SS" -n "$NS" -o wide

echo "=== PVCs ==="
kubectl get pvc -n "$NS" | grep "$SS"

echo "=== Recent Events ==="
kubectl get events -n "$NS" --sort-by='.lastTimestamp' | tail -15

echo "=== Pod-0 Description ==="
kubectl describe pod "${SS}-0" -n "$NS" | tail -30

Prevention Tips

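Always define resource requests and limits: requests let the scheduler pick a node with enough capacity, and a memory limit sized above peak startup usage avoids OOM kills during init
Set generous initialDelaySeconds on readiness probes for stateful apps, or add a startupProbe for slow starters
Use podManagementPolicy: Parallel if pod ordering doesn't matter for your app, so pods start without waiting for each other and scaling is faster (set it when you create the StatefulSet)
Use the Retain reclaim policy on PVs in production to prevent data loss if a PVC is deleted by accident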
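
For the last tip, dynamically provisioned PVs often default to the Delete reclaim policy, so switching an existing volume to Retain takes a patch on the PV itself. A minimal sketch follows, with the hypothetical function name `retain_pv_for_pvc`. It assumes the claim is already bound.

```shell
# Set the reclaim policy of the PV behind a PVC to Retain.
# Args: namespace, PVC name.
retain_pv_for_pvc() {
  local ns="$1" pvc="$2" pv
  pv=$(kubectl get pvc "$pvc" -n "$ns" -o jsonpath='{.spec.volumeName}') || return 2
  [ -n "$pv" ] || { echo "$pvc is not bound to a PV" >&2; return 1; }
  kubectl patch pv "$pv" -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
}
```

Usage: `retain_pv_for_pvc <namespace> data-myapp-0`, repeated for each replica's claim. To make Retain the default for new volumes, set reclaimPolicy: Retain on the StorageClass instead.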

If you want hands-on practice with StatefulSets and storage, KodeKloud's CKA course has dedicated labs where you debug exactly these scenarios on live clusters.
