Kubernetes HPA Not Scaling: Why Your Pods Refuse to Scale and How to Fix It
Kubernetes HPA not scaling your pods? Walk through every root cause — missing metrics, wrong resource requests, cooldown periods — and fix each one systematically.
Your Horizontal Pod Autoscaler is configured. Your traffic is spiking. And your pods... just sit there. HPA not scaling is one of the most frustrating Kubernetes issues because there are at least 8 different root causes and each one looks the same from the outside.
This guide walks through every reason HPA fails to scale and exactly how to fix it.
The Fast Diagnostic Command
Before anything else, run this:
```
kubectl describe hpa <hpa-name> -n <namespace>
```
Look at the Conditions and Events sections. This alone tells you why HPA is stuck about 80% of the time.
Common condition messages:
- `unable to get metrics for resource cpu` → metrics-server missing or broken
- `invalid metrics (1 invalid out of 1)` → resource requests not set
- `DesiredReplicas below minimum` → already at min replicas, nothing to scale down
- `ScalingActive=false` → HPA is disabled or misconfigured
Root Cause 1: Metrics Server Not Installed
HPA needs metrics-server to read CPU and memory. If it's not installed, HPA can't do anything.
Check:
```
kubectl top pods -n <namespace>
```
If you get `error: Metrics API not available`, metrics-server is missing.
Fix:
```
helm repo add metrics-server https://kubernetes-sigs.github.io/metrics-server/
helm install metrics-server metrics-server/metrics-server \
  --namespace kube-system \
  --set args[0]="--kubelet-insecure-tls"
```
Wait 60 seconds, then verify:
```
kubectl top nodes
kubectl top pods -n <namespace>
```
Affiliate tip: KodeKloud has excellent Kubernetes labs where you practice HPA hands-on in real clusters.
Root Cause 2: Resource Requests Not Set
HPA calculates utilization as actual_usage / requested_amount. If you haven't set resource requests on your containers, the denominator is zero — HPA can't compute utilization and refuses to scale.
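The control loop behind this is straightforward. A minimal sketch of the desired-replica calculation (function and parameter names here are ours for illustration; the formula itself is the one documented for HPA, including the clamp to min/max covered in Root Cause 4):

```python
import math

def desired_replicas(current_replicas: int,
                     current_utilization: float,
                     target_utilization: float,
                     min_replicas: int,
                     max_replicas: int) -> int:
    """Sketch of the HPA formula:
    desired = ceil(currentReplicas * currentMetric / targetMetric),
    clamped to [minReplicas, maxReplicas]."""
    if target_utilization <= 0:
        # No resource requests means no target to divide by; HPA refuses to act.
        raise ValueError("target utilization must be positive")
    desired = math.ceil(current_replicas * current_utilization / target_utilization)
    return max(min_replicas, min(desired, max_replicas))

# 4 pods at 90% CPU against a 50% target: ceil(4 * 90 / 50) = 8
print(desired_replicas(4, 90, 50, 2, 20))  # → 8
```

Note the division by the target: with no requests set, there is no meaningful target utilization, which is exactly why Root Cause 2 stalls everything.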
Check:
```
kubectl get pod <pod-name> -o jsonpath='{.spec.containers[0].resources}'
```
If this returns `{}` or shows no requests, that's your problem.
Fix — add resource requests to your deployment:
```yaml
resources:
  requests:
    cpu: "100m"
    memory: "128Mi"
  limits:
    cpu: "500m"
    memory: "512Mi"
```
Rule of thumb: set requests to your average load, limits to your peak load.
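To audit an existing workload, here is a small sketch that walks a pod spec (the `.spec` object from `kubectl get pod -o json`) and flags containers with no CPU or memory requests; the function name is ours:

```python
def containers_missing_requests(pod_spec: dict) -> list:
    """Return names of containers with no cpu or memory request set.

    pod_spec is the parsed `.spec` object from `kubectl get pod -o json`.
    """
    missing = []
    for c in pod_spec.get("containers", []):
        requests = c.get("resources", {}).get("requests", {})
        if "cpu" not in requests or "memory" not in requests:
            missing.append(c["name"])
    return missing

spec = {"containers": [
    {"name": "web", "resources": {"requests": {"cpu": "100m", "memory": "128Mi"}}},
    {"name": "sidecar", "resources": {}},  # no requests: HPA can't compute utilization
]}
print(containers_missing_requests(spec))  # → ['sidecar']
```

Any name this prints is a container that silently breaks utilization math for the whole pod.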
Root Cause 3: HPA Cooldown Periods
HPA has built-in cooldown to prevent thrashing. By default:
- Scale-up cooldown: 0 seconds (scales up immediately)
- Scale-down cooldown: 5 minutes (waits 5 min before scaling down)
If you scaled up recently and are now expecting scale-down, wait 5 minutes. It's not broken.
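The scale-down delay is not a simple timer. HPA keeps its recent replica recommendations and applies the highest one inside the stabilization window, so a brief dip in load never shrinks the deployment. A rough model of that behavior (simplified; the real controller tracks timestamped recommendations internally):

```python
def stabilized_scale_down(history, window_seconds, now):
    """history: list of (timestamp, recommended_replicas) tuples.
    Scale-down uses the MAX recommendation within the window, so
    replicas only drop once every recommendation in the window
    agrees the deployment can be smaller."""
    recent = [r for (t, r) in history if now - t <= window_seconds]
    return max(recent)

# Recommendations over the last 5 minutes: load dipped to 3 replicas,
# but a spike to 8 earlier in the window pins us at 8.
history = [(0, 8), (100, 8), (200, 3), (290, 3)]
print(stabilized_scale_down(history, 300, 300))  # → 8
```

Once the spike ages out of the 300-second window, only the low recommendations remain and the deployment finally shrinks.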
Check the last scale time:
```
kubectl describe hpa <hpa-name> | grep "Last Scale Time"
```
Adjust cooldown behavior (Kubernetes 1.18+):
```yaml
spec:
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 60
      policies:
      - type: Percent
        value: 50
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
      - type: Pods
        value: 4
        periodSeconds: 15
```
Root Cause 4: Already at Min/Max Replicas
If HPA wants to scale down but you're at minReplicas, or wants to scale up but you're at maxReplicas, it simply won't do anything.
Check:
```
kubectl get hpa <hpa-name> -n <namespace>
```
Look at the MINPODS, MAXPODS, and REPLICAS columns.
Fix: Adjust min/max to match your actual needs:
```yaml
spec:
  minReplicas: 2
  maxReplicas: 20
```
Root Cause 5: CPU Usage Below Target Threshold
HPA only acts when the ratio currentUtilization / targetUtilization deviates from 1.0 by more than the default 10% tolerance. If your target is 50% and actual usage is 55%, the ratio is 1.1, which sits inside the tolerance band, so HPA won't scale.
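The tolerance check can be sketched in a few lines (the function name and the hard-coded 0.10 default are illustrative; the real controller reads the tolerance from a kube-controller-manager flag):

```python
def should_scale(current_utilization: float,
                 target_utilization: float,
                 tolerance: float = 0.10) -> bool:
    """HPA skips scaling when the usage ratio is within
    `tolerance` of 1.0 (10% by default)."""
    ratio = current_utilization / target_utilization
    return abs(1.0 - ratio) > tolerance

print(should_scale(54, 50))  # → False: ratio 1.08 is inside the 10% band
print(should_scale(65, 50))  # → True: ratio 1.3 clears the band
```

The check is symmetric: usage well below target triggers scale-down the same way usage well above it triggers scale-up.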
Check current utilization:
```
kubectl describe hpa <hpa-name> | grep -A5 "Metrics:"
```
If actual usage is close to target but within the tolerance band, that's by design. Lower your target:
```yaml
spec:
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 40  # was 50, now lower
```
Root Cause 6: Wrong API Version
The old autoscaling/v1 HPA only supports CPU. For memory or custom metrics, you need autoscaling/v2.
Check what you're using:
```
kubectl get hpa <hpa-name> -o yaml | grep "apiVersion"
```
Fix: upgrade to v2:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 70
```
Root Cause 7: Custom Metrics Not Available (Prometheus Adapter)
If you're scaling on custom metrics (requests-per-second, queue depth, etc.), you need Prometheus Adapter or KEDA. Without it, HPA shows no metrics found.
Check:
```
kubectl get apiservice v1beta1.custom.metrics.k8s.io
```
If it's missing, install KEDA instead; it's far easier to set up than Prometheus Adapter:
```
helm repo add kedacore https://kedacore.github.io/charts
helm install keda kedacore/keda --namespace keda --create-namespace
```
Then use a KEDA ScaledObject instead of a raw HPA for custom-metric scaling.
Root Cause 8: Deployment Selector Mismatch
HPA targets a deployment by name. If the deployment was renamed or the target reference is wrong, HPA shows ScaleTargetRef must not be empty.
Check:
```
kubectl describe hpa <hpa-name> | grep "Reference:"
```
Verify the deployment name matches exactly:
```
kubectl get deployments -n <namespace>
```
Complete HPA Troubleshooting Checklist
```
# 1. Check HPA status and events
kubectl describe hpa <name> -n <ns>

# 2. Verify metrics-server works
kubectl top pods -n <ns>

# 3. Check resource requests on pods
kubectl get pod <pod> -o jsonpath='{.spec.containers[*].resources}'

# 4. Check current vs desired replicas
kubectl get hpa <name> -n <ns>

# 5. Check pod events for resource issues
kubectl get events -n <ns> --sort-by='.lastTimestamp'
```
Production HPA Template
A solid HPA config for most web services:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 3
  maxReplicas: 50
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 75
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 25
        periodSeconds: 60
```
Summary
| Problem | Fix |
|---|---|
| HPA shows `unable to get metrics` | Install metrics-server |
| `invalid metrics` error | Set resource requests on containers |
| HPA not scaling down | Wait for 5-min cooldown |
| Stuck at min/max replicas | Adjust minReplicas/maxReplicas |
| CPU near target but no scale | Lower averageUtilization target |
| Memory scaling not working | Use autoscaling/v2 API |
| Custom metrics not found | Install KEDA or Prometheus Adapter |
Fix the metrics-server and resource requests first — they cover 90% of HPA failures. Everything else is tuning.
Building a Kubernetes-heavy platform? DigitalOcean Kubernetes offers managed clusters with autoscaling built-in — great for production workloads.