AWS EKS Pods Stuck in Pending State: Causes and Fixes
Pods stuck in Pending on EKS are caused by a handful of known issues — insufficient node capacity, taint mismatches, PVC problems, and more. Here's how to diagnose and fix each one.
You've created your pod, but it just sits there. kubectl get pods shows Pending next to it. No container is starting. No error message in the app logs.
Pods stuck in Pending on EKS are one of the most common issues DevOps engineers hit — and the cause is almost always one of a small set of known problems.
This guide walks through every common cause and how to fix it.
Step 1: Always Start Here
When a pod is stuck in Pending, the first command is always:
```shell
kubectl describe pod <pod-name> -n <namespace>
```

Look at the Events section at the bottom. It tells you exactly why Kubernetes isn't scheduling the pod:

```
Events:
  Type     Reason            Age  From               Message
  ----     ------            ---  ----               -------
  Warning  FailedScheduling  2m   default-scheduler  0/3 nodes are available:
                                                     3 Insufficient cpu.
```
Read that message carefully. It contains the fix.
Cause 1: Insufficient Resources (CPU/Memory)
The most common cause. Your pods request more CPU or memory than any node has available.
0/3 nodes are available: 3 Insufficient cpu, 3 Insufficient memory.
Diagnose
```shell
# Check node capacity and what's allocated
kubectl describe nodes | grep -A 5 "Allocated resources"

# Or use top
kubectl top nodes
```

Example output showing a node that's nearly full:

```
  Resource  Requests      Limits
  --------  --------      ------
  cpu       1850m (92%)   0 (0%)
  memory    1.8Gi (90%)   0 (0%)
```
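To see whether a new pod will fit, compare its CPU request with what's left on the node. A quick sketch of the arithmetic, using the numbers from the example output above:

```shell
# Node: 2000m allocatable CPU, 1850m already requested (example above).
allocatable=2000
requested=1850
pod_request=500   # the pod asks for 500m

remaining=$(( allocatable - requested ))
echo "${remaining}m free"
if [ "$pod_request" -le "$remaining" ]; then
  echo "pod fits"
else
  echo "pod does not fit -- scale nodes or shrink the request"
fi
```

The scheduler does this comparison against requests (not actual usage), which is why a node can look idle in kubectl top and still refuse new pods.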
Fix options
Option A: Add more nodes (scale the node group)
```shell
# Scale via eksctl
eksctl scale nodegroup \
  --cluster=my-cluster \
  --name=ng-standard \
  --nodes=5 \
  --region=us-east-1

# Or via the AWS console: EKS → Clusters → Node Groups → Edit
```

Option B: Reduce pod resource requests
```yaml
# Before — too greedy
resources:
  requests:
    cpu: "2"
    memory: "4Gi"
```

```yaml
# After — right-sized
resources:
  requests:
    cpu: "500m"
    memory: "512Mi"
  limits:
    cpu: "1"
    memory: "1Gi"
```

Option C: Enable Cluster Autoscaler or Karpenter
Cluster Autoscaler automatically adds nodes when pending pods exist:
```shell
helm repo add autoscaler https://kubernetes.github.io/autoscaler
helm install cluster-autoscaler autoscaler/cluster-autoscaler \
  --namespace kube-system \
  --set autoDiscovery.clusterName=my-cluster \
  --set awsRegion=us-east-1
```

Cause 2: Node Selector or Affinity Mismatch
0/3 nodes are available: 3 node(s) didn't match Pod's node affinity/selector.
Your pod requires a node with specific labels, but no node has them.
Diagnose
```shell
# Check what labels your nodes have
kubectl get nodes --show-labels

# Check what your pod requires
kubectl get pod <pod-name> -o yaml | grep -A 10 "affinity\|nodeSelector"
```

Fix
```yaml
# Wrong — label doesn't exist on any node
nodeSelector:
  eks.amazonaws.com/nodegroup: gpu-nodes   # no gpu nodes in cluster
```

```yaml
# Right — label that actually exists
nodeSelector:
  eks.amazonaws.com/nodegroup: standard-nodes
```

For node affinity:
```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/arch
              operator: In
              values:
                - amd64   # make sure your nodes are amd64
```

Cause 3: Taint and Toleration Mismatch
0/3 nodes are available: 3 node(s) had untolerated taint {dedicated: gpu}.
Your nodes have taints that your pod doesn't tolerate.
Diagnose
```shell
# Check node taints
kubectl describe node <node-name> | grep Taints
```

Fix
If your pod should run on tainted nodes, add the toleration:
```yaml
tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "gpu"
    effect: "NoSchedule"
```

If your pod should NOT run on those nodes, the scheduler is correctly blocking it — check whether there are untainted nodes available.
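Note that a toleration only permits scheduling onto tainted nodes; it does not steer the pod toward them. If the pod must land on the dedicated nodes, pair the toleration with a nodeSelector. A sketch, assuming the tainted nodes also carry a dedicated=gpu label (adjust to whatever label your nodes actually have):

```yaml
spec:
  nodeSelector:
    dedicated: "gpu"          # assumed label on the tainted nodes
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "gpu"
      effect: "NoSchedule"
```

Without the nodeSelector, the pod may happily schedule onto any untainted node instead.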
To remove a taint from a node:
```shell
kubectl taint node <node-name> dedicated:NoSchedule-
```

Cause 4: PersistentVolumeClaim Not Bound
0/3 nodes are available: pod has unbound immediate PersistentVolumeClaims.
The pod needs a PVC that doesn't exist or hasn't been provisioned.
Diagnose
```shell
kubectl get pvc -n <namespace>
```

If the PVC shows Pending:

```
NAME      STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
my-data   Pending                                      gp2            5m
```

Then:

```shell
kubectl describe pvc my-data -n <namespace>
```

Common PVC causes and fixes
StorageClass doesn't exist:
```shell
# Check available storage classes
kubectl get storageclass
```

Fix — use an existing storage class:

```yaml
# Wrong
storageClassName: fast-ssd   # doesn't exist
# Right
storageClassName: gp2        # exists on EKS by default
```

No available PersistentVolume (for static provisioning):
```shell
# Check PVs
kubectl get pv
# Create a PV manually if using static provisioning
```

EBS volume in wrong availability zone:
EBS volumes are AZ-specific. If your pod lands on a node in us-east-1b but its volume lives in us-east-1a, the volume can't attach.
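A quick way to reason about it: compare the node's zone label with the volume's AZ. The values below are hard-coded as an illustration; in a real cluster you'd pull them from kubectl and the AWS CLI as shown in the comments:

```shell
# Illustration only -- real values would come from:
#   kubectl get node <node> -o jsonpath='{.metadata.labels.topology\.kubernetes\.io/zone}'
#   aws ec2 describe-volumes --volume-ids <vol-id> \
#     --query 'Volumes[0].AvailabilityZone' --output text
node_az="us-east-1b"
volume_az="us-east-1a"

if [ "$node_az" = "$volume_az" ]; then
  echo "volume can attach"
else
  echo "AZ mismatch: volume cannot attach to this node"
fi
```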
Fix: Use topology-aware volume binding:
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp2-az-aware
provisioner: ebs.csi.aws.com
parameters:
  type: gp2
volumeBindingMode: WaitForFirstConsumer   # bind after the pod is scheduled to a node
```

Cause 5: No Nodes Available (All at Max Pods)
With the VPC CNI in its default mode, EKS caps the number of pods per node based on the instance type's ENI and IP capacity. A t3.medium, for example, tops out at 17 pods.
0/3 nodes are available: 3 Too many pods.
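The limit comes from how many ENIs the instance type supports and how many IPv4 addresses each ENI can hold. A sketch of the formula the VPC CNI uses in its default (secondary-IP) mode; the per-instance ENI figures are from AWS's published limits:

```shell
# max_pods = ENIs * (IPv4 addresses per ENI - 1) + 2
# (the +2 accounts for host-network pods such as aws-node and kube-proxy)
max_pods() {
  local enis=$1 ips_per_eni=$2
  echo $(( enis * (ips_per_eni - 1) + 2 ))
}

max_pods 3 6     # t3.medium -> 17
max_pods 3 12    # t3.large  -> 35
max_pods 4 15    # m5.xlarge -> 58
```

With prefix delegation (Option B below), each secondary-IP slot holds a /28 prefix of 16 addresses, so the theoretical count jumps well past the kubelet's default ceiling of 110 pods per node, which is why 110 becomes the practical limit.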
Check current pod counts per node
```shell
kubectl get pods -A -o wide | awk 'NR>1 {print $8}' | sort | uniq -c | sort -rn
```

Fix options
Option A: Use larger instance types (more ENIs = more IPs = more pods)
| Instance | Max Pods |
|---|---|
| t3.small | 11 |
| t3.medium | 17 |
| t3.large | 35 |
| m5.xlarge | 58 |
Option B: Enable prefix delegation (supports 110 pods/node)
```shell
kubectl set env daemonset aws-node -n kube-system \
  ENABLE_PREFIX_DELEGATION=true \
  WARM_PREFIX_TARGET=1
```

Then update node groups to use the new pod limit:
The max-pods value is fixed when a node bootstraps, so existing nodes keep the old limit. With eksctl, set maxPodsPerNode on a new (replacement) node group and shift workloads onto it:

```yaml
# cluster-config.yaml excerpt
managedNodeGroups:
  - name: ng-standard-v2
    instanceType: t3.medium
    maxPodsPerNode: 110
```

```shell
eksctl create nodegroup --config-file=cluster-config.yaml
```
Sometimes pods get stuck in ContainerCreating (not Pending) due to image pull failures — but the root cause is similar: the pod can't start.
```shell
kubectl describe pod <pod-name> | grep -A 5 "Warning"
```

```
Warning  Failed  2m  kubelet  Failed to pull image "my-ecr-image:latest":
rpc error: code = Unknown desc = failed to pull and unpack image:
failed to resolve reference: unexpected status code 401 Unauthorized
```
Fix ECR authentication on EKS
```shell
# Attach the ECR read policy to your node group IAM role
aws iam attach-role-policy \
  --role-name eks-node-role \
  --policy-arn arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly
```

Diagnosis Quick Reference
```shell
# Step 1: See why the pod isn't scheduled
kubectl describe pod <pod-name> -n <namespace>

# Step 2: Check node capacity
kubectl describe nodes | grep -A 5 "Allocated resources"

# Step 3: Check for PVC issues
kubectl get pvc -n <namespace>
kubectl describe pvc <name> -n <namespace>

# Step 4: Check node taints
kubectl describe nodes | grep Taints

# Step 5: Check node labels
kubectl get nodes --show-labels

# Step 6: Check recent events
kubectl get events -n <namespace> --sort-by='.lastTimestamp' | tail -20
```

Summary Table
| Message | Cause | Fix |
|---|---|---|
| Insufficient CPU/Memory | Pod requests exceed node capacity | Scale nodes or reduce requests |
| Node affinity mismatch | nodeSelector label doesn't exist | Fix label or add to node |
| Untolerated taint | Node has taint, pod lacks toleration | Add toleration to pod |
| Unbound PVC | StorageClass missing or wrong AZ | Fix StorageClass or use WaitForFirstConsumer |
| Too many pods | Node reached ENI pod limit | Use larger instances or prefix delegation |
| 401 Unauthorized (image) | ECR permissions missing | Attach ECR read policy to node role |
Almost every EKS Pending pod issue falls into one of these six categories. kubectl describe pod will point you to the right one within seconds.
Level Up Your EKS Skills
For a comprehensive path through EKS — networking, autoscaling, IAM, and production patterns — KodeKloud's Kubernetes and AWS courses give you hands-on labs with real clusters. No slides-only learning.