
AWS EKS Pods Stuck in Pending State: Causes and Fixes

Pods stuck in Pending on EKS are caused by a handful of known issues — insufficient node capacity, taint mismatches, PVC problems, and more. Here's how to diagnose and fix each one.

DevOpsBoys · Mar 15, 2026 · 6 min read

Your pod is scheduled, but it just sits there. kubectl get pods shows Pending next to it. No container is starting. No error message in the app logs.

Pods stuck in Pending on EKS are one of the most common issues DevOps engineers hit — and the cause is almost always one of a small set of known problems.

This guide walks through every common cause and how to fix it.


Step 1: Always Start Here

When a pod is stuck in Pending, the first command is always:

bash
kubectl describe pod <pod-name> -n <namespace>

Look at the Events section at the bottom. It tells you exactly why Kubernetes isn't scheduling the pod:

Events:
  Type     Reason            Age   From                Message
  ----     ------            ----  ----                -------
  Warning  FailedScheduling  2m    default-scheduler   0/3 nodes are available:
                                                       3 Insufficient cpu.

Read that message carefully. It contains the fix.
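When the Events section is noisy, you can grep the describe output down to just the scheduler's verdict. The sketch below runs grep against a hardcoded sample of the output shown above (purely for illustration — in practice you'd pipe the real `kubectl describe pod` output into the same grep):

```shell
# Sample `kubectl describe pod` output, hardcoded for illustration:
describe_output='Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  2m    default-scheduler  0/3 nodes are available: 3 Insufficient cpu.'

# Keep only the scheduler's verdict line(s):
echo "$describe_output" | grep 'FailedScheduling'
```

In real use: `kubectl describe pod <pod-name> -n <namespace> | grep -A 2 FailedScheduling`.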


Cause 1: Insufficient Resources (CPU/Memory)

The most common cause. Your pods request more CPU or memory than any node has available.

0/3 nodes are available: 3 Insufficient cpu, 3 Insufficient memory.

Diagnose

bash
# Check node capacity and what's allocated
kubectl describe nodes | grep -A 5 "Allocated resources"
 
# Or use top
kubectl top nodes

Example output showing a node that's nearly full:

Resource          Requests      Limits
--------          --------      ------
cpu               1850m (92%)   0 (0%)
memory            1.8Gi (90%)   0 (0%)
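The arithmetic behind "Insufficient cpu" is simple: the scheduler compares each pod's request against every node's remaining allocatable capacity. With the numbers above (taken from the example output of a 2-vCPU node), a pod requesting 500m CPU can't fit anywhere:

```shell
# Values assumed from the example output above (a 2-vCPU node)
allocatable_mcpu=2000   # node allocatable: 2 CPUs = 2000m
requested_mcpu=1850     # already requested by scheduled pods

headroom=$(( allocatable_mcpu - requested_mcpu ))
echo "${headroom}m CPU free"   # 150m free

# A pod requesting 500m needs more than the 150m left, so it stays Pending.
```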

Fix options

Option A: Add more nodes (scale the node group)

bash
# Scale via eksctl
eksctl scale nodegroup \
  --cluster=my-cluster \
  --name=ng-standard \
  --nodes=5 \
  --region=us-east-1
 
# Or via AWS console: EKS → Clusters → Node Groups → Edit

Option B: Reduce pod resource requests

yaml
# Before — too greedy
resources:
  requests:
    cpu: "2"
    memory: "4Gi"
 
# After — right-sized
resources:
  requests:
    cpu: "500m"
    memory: "512Mi"
  limits:
    cpu: "1"
    memory: "1Gi"

Option C: Enable Cluster Autoscaler or Karpenter

Cluster Autoscaler automatically adds nodes when pending pods exist:

bash
helm repo add autoscaler https://kubernetes.github.io/autoscaler
helm install cluster-autoscaler autoscaler/cluster-autoscaler \
  --namespace kube-system \
  --set autoDiscovery.clusterName=my-cluster \
  --set awsRegion=us-east-1
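Auto-discovery only works if your node groups carry the tags Cluster Autoscaler looks for. A sketch of an eksctl config that sets them — cluster name, nodegroup name, and sizes are placeholders:

```yaml
# eksctl ClusterConfig sketch — names and sizes are placeholders
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: my-cluster
  region: us-east-1
managedNodeGroups:
  - name: ng-standard
    minSize: 2
    maxSize: 10
    tags:
      # tags Cluster Autoscaler's auto-discovery matches on
      k8s.io/cluster-autoscaler/enabled: "true"
      k8s.io/cluster-autoscaler/my-cluster: "owned"
```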

Cause 2: Node Selector or Affinity Mismatch

0/3 nodes are available: 3 node(s) didn't match Pod's node affinity/selector.

Your pod requires a node with specific labels, but no node has them.

Diagnose

bash
# Check what labels your nodes have
kubectl get nodes --show-labels
 
# Check what your pod requires
kubectl get pod <pod-name> -o yaml | grep -A 10 "affinity\|nodeSelector"

Fix

yaml
# Wrong — label doesn't exist on any node
nodeSelector:
  eks.amazonaws.com/nodegroup: gpu-nodes    # no gpu nodes in cluster
 
# Right — label that actually exists
nodeSelector:
  eks.amazonaws.com/nodegroup: standard-nodes

For node affinity:

yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/arch
              operator: In
              values:
                - amd64    # make sure your nodes are amd64
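If the constraint is a preference rather than a hard requirement, `preferredDuringSchedulingIgnoredDuringExecution` lets the scheduler fall back to other nodes instead of leaving the pod Pending:

```yaml
affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        preference:
          matchExpressions:
            - key: kubernetes.io/arch
              operator: In
              values:
                - amd64    # preferred, but not mandatory
```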

Cause 3: Taint and Toleration Mismatch

0/3 nodes are available: 3 node(s) had untolerated taint {dedicated: gpu}.

Your nodes have taints that your pod doesn't tolerate.

Diagnose

bash
# Check node taints
kubectl describe node <node-name> | grep Taints

Fix

If your pod should run on tainted nodes, add the toleration:

yaml
tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "gpu"
    effect: "NoSchedule"

If your pod should NOT run on those nodes, the scheduler is correctly blocking it — check if there are untainted nodes available.

To remove a taint from a node:

bash
kubectl taint node <node-name> dedicated:NoSchedule-

Cause 4: PersistentVolumeClaim Not Bound

0/3 nodes are available: pod has unbound immediate PersistentVolumeClaims.

The pod needs a PVC that doesn't exist or hasn't been provisioned.

Diagnose

bash
kubectl get pvc -n <namespace>

If the PVC shows Pending:

NAME        STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
my-data     Pending                                      gp2            5m

Then:

bash
kubectl describe pvc my-data -n <namespace>

Common PVC causes and fixes

StorageClass doesn't exist:

bash
# Check available storage classes
kubectl get storageclass

Fix — reference a storage class that actually exists:

yaml
# Wrong
storageClassName: fast-ssd    # doesn't exist

# Right
storageClassName: gp2         # exists on EKS by default

No available PersistentVolume (for static provisioning):

bash
# Check PVs
kubectl get pv
 
# Create a PV manually if using static provisioning

EBS volume in wrong availability zone:

EKS EBS volumes are AZ-specific. If your pod lands on a node in us-east-1b but the volume is in us-east-1a, it won't attach.

Fix: Use topology-aware volume binding:

yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp2-az-aware
provisioner: ebs.csi.aws.com
parameters:
  type: gp2
volumeBindingMode: WaitForFirstConsumer    # bind after pod is scheduled to a node
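A PVC referencing that class — with WaitForFirstConsumer, the PVC shows Pending (harmlessly) until a pod using it is scheduled, and the volume is then created in that pod's AZ:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: gp2-az-aware   # the AZ-aware class defined above
  resources:
    requests:
      storage: 10Gi
```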

Cause 5: No Nodes Available (All at Max Pods)

EKS limits the number of pods per node based on instance type and the VPC CNI plugin. A t3.medium can run a maximum of 17 pods.

0/3 nodes are available: 3 Too many pods.

Check current pod counts per node

bash
kubectl get pods -A -o wide | awk '{print $8}' | sort | uniq -c | sort -rn

Fix options

Option A: Use larger instance types (more ENIs = more IPs = more pods)

Instance     Max Pods
t3.small     11
t3.medium    17
t3.large     35
m5.xlarge    58
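These numbers come from the VPC CNI's ENI math: each instance type has a fixed number of ENIs and IPv4 addresses per ENI, and max pods = ENIs × (IPs per ENI − 1) + 2. A quick sanity check for t3.medium (ENI/IP counts taken from the EC2 instance type specs):

```shell
# max_pods = ENIs * (IPv4 addresses per ENI - 1) + 2
# t3.medium: 3 ENIs, 6 IPv4 addresses per ENI
enis=3
ips_per_eni=6
max_pods=$(( enis * (ips_per_eni - 1) + 2 ))
echo "t3.medium max pods: ${max_pods}"   # 17
```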

Option B: Enable prefix delegation (supports 110 pods/node)

bash
kubectl set env daemonset aws-node -n kube-system \
  ENABLE_PREFIX_DELEGATION=true \
  WARM_PREFIX_TARGET=1

Note that max pods is fixed when a node bootstraps, so existing nodes keep their old limit. Create a replacement node group with the higher value:

bash
eksctl create nodegroup \
  --cluster=my-cluster \
  --name=ng-standard-v2 \
  --max-pods-per-node=110 \
  --region=us-east-1

Cause 6: Image Pull Issue (Looks Like Pending But Isn't)

Sometimes pods sit in ErrImagePull or ImagePullBackOff rather than a scheduling failure — the scheduler placed the pod fine, but the node can't pull the container image, so no container starts.

bash
kubectl describe pod <pod-name> | grep -A 5 "Warning"
Warning  Failed     2m    kubelet  Failed to pull image "my-ecr-image:latest":
         rpc error: code = Unknown desc = failed to pull and unpack image:
         failed to resolve reference: unexpected status code 401 Unauthorized

Fix ECR authentication on EKS

bash
# Attach the ECR read policy to your node group IAM role
aws iam attach-role-policy \
  --role-name eks-node-role \
  --policy-arn arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly

Diagnosis Quick Reference

bash
# Step 1: See why the pod isn't scheduled
kubectl describe pod <pod-name> -n <namespace>
 
# Step 2: Check node capacity
kubectl describe nodes | grep -A 5 "Allocated resources"
 
# Step 3: Check for PVC issues
kubectl get pvc -n <namespace>
kubectl describe pvc <name> -n <namespace>
 
# Step 4: Check node taints
kubectl describe nodes | grep Taints
 
# Step 5: Check node labels
kubectl get nodes --show-labels
 
# Step 6: Check the most recent events
kubectl get events -n <namespace> --sort-by='.lastTimestamp' | tail -20

Summary Table

Message                    Cause                                  Fix
Insufficient CPU/Memory    Pod requests exceed node capacity      Scale nodes or reduce requests
Node affinity mismatch     nodeSelector label doesn't exist       Fix label or add it to a node
Untolerated taint          Node has taint, pod lacks toleration   Add toleration to pod
Unbound PVC                StorageClass missing or wrong AZ       Fix StorageClass or use WaitForFirstConsumer
Too many pods              Node reached ENI pod limit             Larger instances or prefix delegation
401 Unauthorized (image)   ECR permissions missing                Attach ECR read policy to node role

Almost every EKS Pending pod issue falls into one of these six categories. kubectl describe pod will point you to the right one within seconds.


Level Up Your EKS Skills

For a comprehensive path through EKS — networking, autoscaling, IAM, and production patterns — KodeKloud's Kubernetes and AWS courses give you hands-on labs with real clusters. No slides-only learning.
