Kubernetes HPA Not Scaling — How to Fix It (2026)
Your Horizontal Pod Autoscaler is configured but pods aren't scaling up or down. Here's every reason HPA fails and the exact fix for each.
You set up HPA expecting it to scale automatically. Load spikes. Nothing happens. Or it scales up but never scales down. Here's what's wrong and how to fix it.
Step 1: Check HPA Status
kubectl get hpa -n your-namespace
kubectl describe hpa your-hpa -n your-namespace
Look for the Conditions section and Events. The describe output tells you exactly why HPA isn't working.
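If the full describe output is too noisy, the conditions can be pulled out on their own. The following is a minimal sketch: the function name `hpa_conditions` is made up for illustration, and it assumes an autoscaling/v2 HPA, whose status reports the condition types AbleToScale, ScalingActive, and ScalingLimited.

```shell
# Print one line per HPA condition: type, status, reason, message.
# Usage: hpa_conditions <hpa-name> <namespace>
# (hpa_conditions is a hypothetical helper name, not a kubectl subcommand.)
hpa_conditions() {
  kubectl get hpa "$1" -n "$2" \
    -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.status}{"\t"}{.reason}{"\t"}{.message}{"\n"}{end}'
}
```

Call it as `hpa_conditions your-hpa your-namespace`. A condition whose status is False, or whose reason mentions a failure to fetch metrics, usually points at one of the causes below.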
Cause 1: Metrics Server Not Installed
HPA needs the Metrics Server to get CPU and memory metrics. Without it, HPA shows <unknown>:
NAME    REFERENCE          TARGETS         MINPODS   MAXPODS   REPLICAS
myapp   Deployment/myapp   <unknown>/50%   2         10        2
Check:
kubectl get deployment metrics-server -n kube-system
kubectl top pods -n your-namespace # If this fails, metrics-server is missing
Fix (install Metrics Server):
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
For clusters where nodes don't have proper TLS certificates (local clusters):
# Add --kubelet-insecure-tls to metrics-server deployment args
kubectl patch deployment metrics-server -n kube-system \
  --type='json' \
  -p='[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--kubelet-insecure-tls"}]'
Cause 2: No Resource Requests Set
HPA scales based on CPU/memory utilization percentage. If your pods don't have resource requests set, HPA can't calculate utilization — it has nothing to compare against.
kubectl describe hpa your-hpa -n your-namespace
# "failed to get cpu utilization: missing request for cpu"
Fix (add resource requests to every container in your Deployment):
resources:
  requests:
    cpu: "250m"
    memory: "256Mi"
  limits:
    cpu: "500m"
    memory: "512Mi"
Without requests, HPA cannot compute a utilization percentage, so a Utilization target never resolves.
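To make the dependency on requests concrete, here is a small arithmetic sketch of how a utilization percentage is derived from usage and request. The helper name `cpu_utilization_pct` is hypothetical, and the calculation is simplified to a single pod with values in millicores.

```shell
# Utilization (%) = usage / request * 100, both in millicores.
# Usage: cpu_utilization_pct <usage_millicores> <request_millicores>
# (Hypothetical helper for illustration; simplified to one pod.)
cpu_utilization_pct() {
  # With no request there is no denominator, which is the situation
  # behind the "missing request for cpu" error.
  if [ -z "$2" ] || [ "$2" -le 0 ]; then
    echo "missing request for cpu" >&2
    return 1
  fi
  awk -v u="$1" -v r="$2" 'BEGIN { printf "%.0f\n", u / r * 100 }'
}

cpu_utilization_pct 200 250   # prints 80
cpu_utilization_pct 400 250   # prints 160 (usage may exceed the request, up to the limit)
```

With the 250m request from the example above, a pod using 200m sits at 80%, which is already over a 50% target.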
Cause 3: HPA Configured Wrong
# Check the metric block: the name, target type, and value must match what you intend
spec:
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50 # Scale up when average CPU exceeds 50% of the request
Common mistakes:
- averageUtilization: 50 means 50% of the requested CPU, not the limit
- Using AverageValue instead of Utilization requires absolute values (e.g., 500m)
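For comparison against your own object, a complete autoscaling/v2 manifest looks roughly like the output of the sketch below. The helper name `hpa_manifest` and its argument order are assumptions made for illustration. The sketch also assumes the target is a Deployment and that the HPA shares its name.

```shell
# Print a complete autoscaling/v2 HPA manifest for a Deployment.
# Usage: hpa_manifest <deployment> <namespace> <min> <max> <cpu-target-percent>
# (Hypothetical helper; assumes the HPA is named after the Deployment.)
hpa_manifest() {
  cat <<EOF
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: $1
  namespace: $2
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: $1
  minReplicas: $3
  maxReplicas: $4
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: $5
EOF
}
```

One way to validate the result without changing the cluster is `hpa_manifest myapp your-namespace 2 10 50 | kubectl apply --dry-run=server -f -`. A scaleTargetRef that points at the wrong kind or name is a common reason for an HPA that never acts.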
Verify HPA is reading metrics:
kubectl get hpa your-hpa -n your-namespace
# TARGETS should show: 25%/50% (current/target) not <unknown>/50%
Cause 4: Not Scaling Up (Load Not High Enough)
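HPA only scales up when average utilization across all pods exceeds the target. One pod at 100% won't trigger scale-up if the other pods are idle enough to keep the average under the target: four pods at 100%, 10%, 10%, and 10% average 32.5%, which is below a 50% target.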
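The averaging can be checked by hand with the replica formula from the Kubernetes documentation, desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric). The sketch below is a simplified version: the helper name `desired_replicas` is hypothetical, it assumes the default 10% tolerance, and it ignores the minReplicas/maxReplicas clamp, missing metrics, and pods that are not yet ready.

```shell
# desiredReplicas = ceil(currentReplicas * avgUtilization / targetUtilization)
# Usage: desired_replicas <current_replicas> <avg_utilization_pct> <target_pct>
# (Hypothetical helper; no min/max clamp, default 10% tolerance assumed.)
desired_replicas() {
  awk -v n="$1" -v cur="$2" -v tgt="$3" 'BEGIN {
    ratio = cur / tgt
    diff = ratio - 1
    if (diff < 0) diff = -diff
    # Close enough to the target: keep the current replica count.
    if (diff <= 0.1) { print n; exit }
    want = n * ratio
    r = int(want)
    if (r < want) r++   # round up (ceil) for positive values
    print r
  }'
}

desired_replicas 4 90 50     # prints 8: average well above target, scale up
desired_replicas 4 32.5 50   # prints 3: one hot pod among four, no scale-up
desired_replicas 4 52 50     # prints 4: within tolerance, no change
```

The second call is the four-pod case where one pod is at 100% and three are at 10%: the recommendation actually drops, so nothing scales up.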
# Check per-pod CPU usage
kubectl top pods -n your-namespace -l app=myapp
Also check the HPA scale-up cooldown:
spec:
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 0 # Default 0: scales up immediately
      policies:
        - type: Percent
          value: 100
          periodSeconds: 15
Cause 5: Not Scaling Down (Stabilization Window)
HPA has a default 5-minute stabilization window for scale-down to prevent flapping. After load drops, it waits 5 minutes before reducing pods.
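The window works by remembering every replica recommendation computed during it and acting on the highest one. A minimal sketch of that selection follows. The helper name `stabilized_replicas` is hypothetical, and the inputs stand for the recommendations from the last five minutes.

```shell
# Scale-down stabilization: use the highest recommendation seen in the window.
# Usage: stabilized_replicas <recommendation>...
# (Hypothetical helper; pass every recommendation from inside the window.)
stabilized_replicas() {
  max=0
  for r in "$@"; do
    if [ "$r" -gt "$max" ]; then
      max=$r
    fi
  done
  echo "$max"
}

stabilized_replicas 8 6 4 3   # prints 8: the old peak is still inside the window
stabilized_replicas 3 3 3     # prints 3: the peak has aged out, scale-down proceeds
```

This is why a short spike keeps the replica count high for the full window even after load has dropped.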
kubectl describe hpa your-hpa -n your-namespace | grep "ScaleDown"
# A ScaleDownStabilized reason under Conditions means HPA is still waiting out the window
If you want faster scale-down:
spec:
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 60 # Reduce to 1 minute
      policies:
        - type: Pods
          value: 1 # Remove at most 1 pod per period; raise this to shrink faster
          periodSeconds: 60
Cause 6: Already at minReplicas or maxReplicas
HPA can't scale below minReplicas or above maxReplicas. Check:
kubectl get hpa your-hpa -n your-namespace
# If REPLICAS = MAXPODS, you've hit the ceiling
Fix: Increase maxReplicas, or check whether something keeps utilization permanently high (for example, a memory leak when you scale on memory).
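The same check can be scripted. In the sketch below the helper name `hpa_bounds_state` is hypothetical, and the commented kubectl call is one assumed way to feed it the three numbers from the HPA object.

```shell
# Report whether the current replica count is pinned at a bound.
# Usage: hpa_bounds_state <current> <min> <max>
# (Hypothetical helper for illustration.)
hpa_bounds_state() {
  if [ "$1" -ge "$3" ]; then
    echo "at-max"
  elif [ "$1" -le "$2" ]; then
    echo "at-min"
  else
    echo "within-bounds"
  fi
}

# Possible wiring (requires cluster access):
# hpa_bounds_state $(kubectl get hpa your-hpa -n your-namespace \
#   -o jsonpath='{.status.currentReplicas} {.spec.minReplicas} {.spec.maxReplicas}')
```

When the HPA is pinned at a bound, its ScalingLimited condition is typically True as well, which is visible in the describe output.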
Cause 7: Custom Metrics Not Working (KEDA/Prometheus Adapter)
If you're scaling on custom metrics (queue depth, request count), you need either Prometheus Adapter or KEDA.
# Check if custom metrics are available
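kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq .
If the API doesn't exist, install Prometheus Adapter or KEDA. Note that KEDA serves its metrics through external.metrics.k8s.io rather than custom.metrics.k8s.io, so check that path instead when you use KEDA.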
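A quick way to see which metrics APIs are registered at all is to list the aggregated API services. This is a sketch with a hypothetical helper name, `metrics_apis`. It assumes the usual group names: metrics.k8s.io for Metrics Server, custom.metrics.k8s.io for Prometheus Adapter, and external.metrics.k8s.io for KEDA.

```shell
# List registered metrics API services (resource, custom, external).
# Usage: metrics_apis
# (Hypothetical helper; prints nothing if no metrics API is registered.)
metrics_apis() {
  kubectl get apiservices -o name | grep -F 'metrics.k8s.io'
}
```

If the group you expect is missing from the output, the adapter behind it is not installed or its APIService was never registered. Running `kubectl get apiservices` without `-o name` also shows an AVAILABLE column for each entry.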
KEDA is much simpler for custom metrics:
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: myapp-scaler
spec:
  scaleTargetRef:
    name: myapp
  minReplicaCount: 0
  maxReplicaCount: 20
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus:9090
        metricName: http_requests_total
        threshold: "100"
        query: sum(rate(http_requests_total{app="myapp"}[1m]))
Quick Debug Commands
# HPA status + conditions
kubectl describe hpa myapp -n production
# Are metrics available?
kubectl top pods -n production
# Events explaining why HPA isn't acting
kubectl get events -n production --field-selector involvedObject.kind=HorizontalPodAutoscaler
# Current resource usage vs requests
kubectl describe pod <pod> -n production | grep -A 4 "Requests\|Limits"
HPA Best Practices
- Always set resources.requests: HPA can't work without them
- Set minReplicas >= 2 for production: a single pod is a single point of failure
- Don't set maxReplicas too low: it defeats the purpose
- Use scaleDown.stabilizationWindowSeconds to prevent flapping
- For queue-based scaling (SQS, Kafka), use KEDA instead of HPA