
kubectl drain Stuck Forever? Here's the Exact Fix (PDB + Non-Evictable Pods)

kubectl drain hanging with no output? PodDisruptionBudget or DaemonSet pods are blocking it. Here's how to diagnose and fix it without nuking your cluster.

DevOpsBoys · Jun 12, 2026 · 4 min read

You run kubectl drain node-01 --ignore-daemonsets before a maintenance window. It starts, prints one or two pod names, then just... sits there. No progress. No error. Just a blinking cursor mocking you.

Twenty minutes later your maintenance window is half gone and you're sweating.

I've been here more times than I can count. Here's exactly what's happening and how to fix it.

Why kubectl drain Hangs

kubectl drain does two things: it cordons the node (marks it unschedulable) and then evicts every pod on it. The eviction is the part that hangs.

Kubernetes won't evict a pod if doing so would violate a PodDisruptionBudget (PDB). This is a feature, not a bug — PDBs exist to protect your application's availability during disruptions. The problem is when PDBs are misconfigured, or when the budget can't be satisfied because of how your pods are currently distributed.

Other common blockers:

  • DaemonSet pods (requires --ignore-daemonsets)
  • Pods with emptyDir volumes (requires --delete-emptydir-data)
  • Pods in Terminating state that won't die
  • Pods owned by nothing (no controller): drain refuses to remove these without --force, because nothing will recreate them elsewhere
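
Whichever of these is the blocker, giving the drain a deadline makes it end with an error instead of sitting there. A minimal sketch, assuming a five-minute default suits your workloads; bounded_drain is a hypothetical helper name, while --timeout is a standard kubectl drain flag:

```shell
# bounded_drain NODE [TIMEOUT]
# Runs the drain with a deadline so a blocked eviction ends in an error
# instead of an indefinite wait. The 5m default is an assumption; tune it.
bounded_drain() {
  node=$1
  timeout=${2:-5m}
  status=0
  kubectl drain "$node" --ignore-daemonsets --timeout="$timeout" || status=$?
  if [ "$status" -ne 0 ]; then
    # drain cordons the node before it evicts, so a failed drain leaves it cordoned
    echo "drain of $node failed (exit $status); node is still cordoned" >&2
  fi
  return "$status"
}

# Example: bounded_drain node-01 10m
```

If the drain fails you can fix the blocker and rerun it, or run kubectl uncordon node-01 to put the node back into service.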

Step 1: See What's Actually Blocking

bash
kubectl drain node-01 --ignore-daemonsets --dry-run=client

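This lists what would be evicted without evicting anything. A client-side dry run only runs kubectl's local checks, so the error to look for here is:

error: cannot delete Pods not managed by ReplicationController, ReplicaSet, Job, DaemonSet or StatefulSet

The PDB check happens in the API server, so that blocker only shows up during a real drain, where kubectl retries the eviction and keeps reporting:

error: Cannot evict pod as it would violate the pod's disruption budget.

These are your two most common culprits.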

Step 2: Find the PDB That's Blocking

bash
kubectl get pdb -A

Look at the ALLOWED DISRUPTIONS column:

NAMESPACE   NAME            MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS
production  api-pdb         3               N/A               0
staging     worker-pdb      1               N/A               1

ALLOWED DISRUPTIONS: 0 means Kubernetes cannot evict a single pod from that group without violating the budget. This happens when:

  1. You have minAvailable: 3 but only 3 replicas running — any eviction would drop below minimum
  2. You have maxUnavailable: 0 (which means zero pods allowed to be down — why would you do this?)
  3. A pod is already in a non-ready state, consuming the disruption budget
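
The numbers behind that column are in each PDB's status, which makes it easier to tell these cases apart. A rough sketch, assuming the policy/v1 status fields currentHealthy, desiredHealthy and disruptionsAllowed; exhausted_pdbs is a hypothetical helper that reads tab-separated rows on stdin:

```shell
# exhausted_pdbs: reads rows of
#   namespace<TAB>name<TAB>currentHealthy<TAB>desiredHealthy<TAB>disruptionsAllowed
# on stdin and prints the budgets that currently allow no evictions.
exhausted_pdbs() {
  awk -F'\t' '$5 == "0" {
    printf "%s/%s: %s healthy, %s required, 0 disruptions allowed\n", $1, $2, $3, $4
  }'
}

# Feeding it from a live cluster (not executed here):
# kubectl get pdb -A -o jsonpath='{range .items[*]}{.metadata.namespace}{"\t"}{.metadata.name}{"\t"}{.status.currentHealthy}{"\t"}{.status.desiredHealthy}{"\t"}{.status.disruptionsAllowed}{"\n"}{end}' | exhausted_pdbs
```

When healthy equals required, the budget is simply too tight for the replica count. When healthy is below required, a pod is already down and has used up the budget.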

Fix: Check Current Pod Count vs PDB Min

bash
# Check how many replicas are actually running
kubectl get deployment -n production api-deployment
 
# Compare against the PDB
kubectl describe pdb api-pdb -n production

If minAvailable equals your current replica count, scale up first:

bash
kubectl scale deployment api-deployment -n production --replicas=4

Wait for the new pod to be Ready, then drain again.
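
Rather than watching for readiness by eye, kubectl rollout status can block until the extra replica is available. A sketch using the example names from this section; the 120s timeout is an arbitrary assumption and scale_then_drain is a hypothetical helper:

```shell
# scale_then_drain NAMESPACE DEPLOYMENT REPLICAS NODE
# Adds headroom for the PDB, waits for the rollout to finish, then retries
# the drain. Stops at the first step that fails.
scale_then_drain() {
  ns=$1; deploy=$2; replicas=$3; node=$4
  kubectl scale deployment "$deploy" -n "$ns" --replicas="$replicas" &&
    kubectl rollout status deployment "$deploy" -n "$ns" --timeout=120s &&
    kubectl drain "$node" --ignore-daemonsets
}

# Example: scale_then_drain production api-deployment 4 node-01
```

Remember to scale back down once the maintenance is done, unless the extra replica is worth keeping.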

Step 3: Find Non-Evictable Pods (No Controller)

bash
kubectl get pods --field-selector=spec.nodeName=node-01 -A -o wide

Then check each pod's owner:

bash
kubectl get pod <pod-name> -n <namespace> -o jsonpath='{.metadata.ownerReferences}'

If this prints nothing, the pod has no controller, and Kubernetes won't recreate it anywhere once it is gone. You need to decide: delete it manually, or find out why it exists without a controller (probably a one-off debugging pod someone forgot about).

bash
kubectl delete pod <orphan-pod-name> -n <namespace>
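
Checking owners one pod at a time gets slow on a busy node. A sketch that lists every pod on the node with no ownerReferences in one pass, worth running before deleting anything; unowned_pods is a hypothetical helper that reads tab-separated rows on stdin:

```shell
# unowned_pods: reads "namespace<TAB>name<TAB>ownerKinds" rows on stdin and
# prints the pods whose owner column is empty, i.e. pods with no controller.
unowned_pods() {
  awk -F'\t' '$3 == "" { print $1 "/" $2 }'
}

# Feeding it from a live cluster (not executed here):
# kubectl get pods -A --field-selector spec.nodeName=node-01 \
#   -o jsonpath='{range .items[*]}{.metadata.namespace}{"\t"}{.metadata.name}{"\t"}{.metadata.ownerReferences[*].kind}{"\n"}{end}' | unowned_pods
```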

Step 4: Stuck Terminating Pods

Sometimes pods get stuck in Terminating forever. This usually means a finalizer isn't being released, or the kubelet on the node can't confirm that the containers have stopped.

bash
kubectl get pods -n production | grep Terminating

Force delete (use with caution — only after confirming the pod process is actually dead):

bash
kubectl delete pod <pod-name> -n production --grace-period=0 --force

If that still doesn't work, the pod has a finalizer that's preventing deletion:

bash
kubectl patch pod <pod-name> -n production -p '{"metadata":{"finalizers":[]}}' --type=merge
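
Clearing finalizers skips whatever cleanup their owning controller was waiting to do, so it may help to see which ones are set first. A sketch, assuming the rows come from the jsonpath query shown in the comment; stuck_pods is a hypothetical helper:

```shell
# stuck_pods: reads rows of
#   namespace<TAB>name<TAB>deletionTimestamp<TAB>finalizers
# on stdin and prints pods already marked for deletion, with their finalizers.
stuck_pods() {
  awk -F'\t' '$3 != "" {
    f = ($4 == "") ? "(none)" : $4
    print $1 "/" $2 " deleting since " $3 " finalizers: " f
  }'
}

# Feeding it from a live cluster (not executed here):
# kubectl get pods -n production -o jsonpath='{range .items[*]}{.metadata.namespace}{"\t"}{.metadata.name}{"\t"}{.metadata.deletionTimestamp}{"\t"}{.metadata.finalizers[*]}{"\n"}{end}' | stuck_pods
```

A pod that shows (none) is not being held by a finalizer, so the patch will not change anything; the kubelet on that node is the more likely place to look.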

Step 5: The Nuclear Option (Don't Do This In Production Without Understanding Why)

If you've exhausted everything and truly need to proceed:

bash
kubectl drain node-01 \
  --ignore-daemonsets \
  --delete-emptydir-data \
  --disable-eviction \
  --force \
  --grace-period=30

--disable-eviction bypasses the PDB check entirely and directly deletes pods. --force handles pods with no controller. This will violate your PDBs — meaning your application may have reduced availability during the drain. Know what you're doing before using it.

Prevention: Write PDBs That Don't Block Drains

The ideal PDB allows at least one disruption at all times:

yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-pdb
  namespace: production
spec:
  maxUnavailable: 1          # At most 1 pod can be down
  selector:
    matchLabels:
      app: api

Using maxUnavailable: 1 instead of minAvailable: N is almost always the better choice. It needs no adjustment when your replica count changes, and it allows one eviction whenever all of the pods are healthy.

If you're running a 2-replica deployment, minAvailable: 2 will permanently block the drain of any node that runs one of its pods. Don't do it.
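
The arithmetic behind this is small enough to check by hand. A sketch of how allowed disruptions fall out of each style of budget for integer values (percentages are not handled); both function names are hypothetical:

```shell
# Allowed disruptions for an integer minAvailable budget:
#   healthy - minAvailable, never below zero.
allowed_with_min_available() {
  healthy=$1; min_available=$2
  n=$((healthy - min_available))
  if [ "$n" -lt 0 ]; then n=0; fi
  echo "$n"
}

# Allowed disruptions for an integer maxUnavailable budget:
#   healthy - (replicas - maxUnavailable), never below zero.
allowed_with_max_unavailable() {
  healthy=$1; replicas=$2; max_unavailable=$3
  n=$((healthy - (replicas - max_unavailable)))
  if [ "$n" -lt 0 ]; then n=0; fi
  echo "$n"
}

allowed_with_min_available 2 2      # 2 healthy, minAvailable: 2       -> prints 0
allowed_with_max_unavailable 2 2 1  # 2 of 2 healthy, maxUnavailable: 1 -> prints 1
allowed_with_max_unavailable 1 2 1  # 1 of 2 healthy, maxUnavailable: 1 -> prints 0
```

The last case is worth noting: even maxUnavailable: 1 blocks a drain while one pod is already unhealthy, which is the budget doing its job.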

Quick Diagnosis Checklist

When kubectl drain hangs:

  1. kubectl get pdb -A — check ALLOWED DISRUPTIONS column
  2. kubectl get pods --field-selector=spec.nodeName=<node> -A — look for pods in weird states
  3. kubectl describe node <node> — check conditions and events
  4. kubectl get pods -A | grep Terminating — find stuck pods
  5. Scale up deployments if PDB minAvailable equals current replicas

Most drain hangs are solved by step 1 alone. The PDB is almost always the culprit.


Draining nodes is routine until it isn't. Understanding PDBs properly is what separates engineers who panic during maintenance windows from those who fix it in 5 minutes and still make it to standup on time.

Recommended reading: Kubernetes Resource Calculator — size your replicas correctly before setting PDB minimums.
