Kubernetes Readiness Probe Failing — How to Debug and Fix It
Kubernetes readiness probe keeps failing? Learn the exact commands to diagnose misconfigured HTTP, TCP, and exec probes and fix them permanently.
Your pod is running but traffic never reaches it. `kubectl get pods` shows 0/1 READY. The readiness probe is failing — and every request hits a different pod while yours sits idle.
This is one of the most common Kubernetes issues that catches engineers off guard. Here's how to find the exact cause and fix it.
What Readiness Probes Do
Kubernetes sends readiness probe requests to your pod. If the probe fails, the pod is removed from the Service's endpoint list — no traffic is routed to it. Unlike liveness probes (which restart the container), a failing readiness probe just quietly stops traffic.
```bash
kubectl get endpoints my-service
# NAME         ENDPOINTS   AGE
# my-service   <none>      5m   ← pod is running but not in endpoints
```
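You can also read the pod's Ready condition directly; `False` on a Running pod points straight at the readiness probe. A quick check using kubectl's built-in jsonpath output:

```bash
# Prints "True" when the pod passes its readiness probe, "False" otherwise
kubectl get pod <pod-name> \
  -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}{"\n"}'
```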
Step 1: See Why the Probe Is Failing
```bash
kubectl describe pod <pod-name>
```

Look for the Events section at the bottom:

```
Warning  Unhealthy  2m  kubelet  Readiness probe failed:
Get "http://10.244.1.5:8080/health": dial tcp 10.244.1.5:8080: connect: connection refused
```
This tells you exactly what's happening. Common errors:
| Error | Meaning |
|---|---|
| `connection refused` | App not listening on that port |
| `HTTP 404` | Health check path doesn't exist |
| `context deadline exceeded` | App too slow to respond |
| `exec probe failed` | Script exiting non-zero |
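To watch probe failures as they happen, you can filter cluster events by reason (a standard kubectl field selector):

```bash
# Streams every probe-failure event across the cluster
kubectl get events --all-namespaces --field-selector reason=Unhealthy --watch
```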
Problem 1: Wrong Port in Probe
Your app listens on 3000 but the probe checks 8080.
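One quick way to compare the port the manifest declares with the port the probe targets — a jsonpath sketch that assumes an HTTP readiness probe:

```bash
# For each container: declared containerPort vs. the port the probe hits
kubectl get pod <pod-name> -o jsonpath='{range .spec.containers[*]}{.name}: containerPort={.ports[*].containerPort} probePort={.readinessProbe.httpGet.port}{"\n"}{end}'
```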
```yaml
# Wrong
readinessProbe:
  httpGet:
    path: /health
    port: 8080   # ← app is actually on 3000
```

```yaml
# Fixed
readinessProbe:
  httpGet:
    path: /health
    port: 3000
```

Verify what port your app is actually on:
```bash
kubectl exec -it <pod-name> -- ss -tlnp
# or
kubectl exec -it <pod-name> -- netstat -tlnp
```

Problem 2: Health Endpoint Doesn't Exist
Your probe hits /health but your app has no such route.
```bash
# Test from inside the pod
kubectl exec -it <pod-name> -- curl -v http://localhost:8080/health

# If you get 404, you need to either:
# 1. Add the /health endpoint to your app
# 2. Change the probe path to one that exists
```

Fix: Add a health endpoint to your application, or change the probe to check / or another path that returns 200:
```yaml
readinessProbe:
  httpGet:
    path: /   # use a path that definitely returns 200
    port: 8080
```

Problem 3: App Takes Too Long to Start
Your app needs 30 seconds to warm up, but Kubernetes starts probing immediately and marks it unhealthy before it's ready.
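The arithmetic is worth a quick sanity check: the first probe fires after initialDelaySeconds, then repeats every periodSeconds, and a Ready pod flips to NotReady after failureThreshold consecutive failures (at startup, the pod simply stays NotReady until one probe succeeds). A minimal sketch using the Kubernetes defaults for omitted fields:

```bash
# Defaults when the fields are omitted: initialDelaySeconds=0,
# periodSeconds=10, failureThreshold=3
initial_delay=0; period=10; failure_threshold=3
echo "first probe at ${initial_delay}s"
echo "marked NotReady after ~$(( initial_delay + period * failure_threshold ))s of failures"
```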
```yaml
# Wrong — starts probing at 0 seconds
readinessProbe:
  httpGet:
    path: /health
    port: 8080
  periodSeconds: 10
  failureThreshold: 3
```

```yaml
# Fixed — give it time to start
readinessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 30   # wait 30s before first probe
  periodSeconds: 10
  failureThreshold: 3
  timeoutSeconds: 5
```

Use a startupProbe for apps with variable startup times; it holds off the other probes (including readiness) until it succeeds:
```yaml
startupProbe:
  httpGet:
    path: /health
    port: 8080
  failureThreshold: 30   # try for up to 5 minutes (30 × 10s)
  periodSeconds: 10
```

Problem 4: Exec Probe Script Failing
An exec probe runs a command inside the container and judges health by its exit code:

```yaml
readinessProbe:
  exec:
    command:
      - /bin/sh
      - -c
      - "pg_isready -U postgres"
```

Debug by running the command manually:

```bash
kubectl exec -it <pod-name> -- /bin/sh -c "pg_isready -U postgres"
echo $?   # must be 0 for healthy
```

If it exits non-zero, fix the script. Common issue: the tool isn't installed in the container image.
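The kubelet only looks at the exit code of an exec probe: zero means ready, anything else means not ready. A local sketch of that contract (no cluster needed):

```bash
# Simulate how an exec probe result is interpreted: exit code 0 => ready
probe() { sh -c "$1" >/dev/null 2>&1 && echo ready || echo "not ready"; }
probe "exit 0"   # prints: ready
probe "exit 1"   # prints: not ready
```

A missing binary behaves exactly like `exit 1` here, which is why an image without `pg_isready` fails the probe forever.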
Problem 5: Resource Constraints Causing Slowness
If your pod is CPU-throttled, the health endpoint may time out under load.
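One way to confirm throttling is to read the container's CPU stats directly — this assumes cgroup v2 and a shell with `cat` available in the image:

```bash
# nr_throttled / throttled_usec increase when the container hits its CPU limit
kubectl exec -it <pod-name> -- cat /sys/fs/cgroup/cpu.stat
```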
```bash
kubectl top pod <pod-name>
```

If CPU usage is pinned at the limit, probes may time out. Either increase the timeoutSeconds or raise the CPU limit:
```yaml
resources:
  requests:
    cpu: "250m"
  limits:
    cpu: "500m"
readinessProbe:
  httpGet:
    path: /health
    port: 8080
  timeoutSeconds: 10   # give it more time
  failureThreshold: 5
```

Problem 6: HTTPS Probe on HTTP App
```yaml
# Wrong — using HTTPS for an HTTP app
readinessProbe:
  httpGet:
    path: /health
    port: 8443
    scheme: HTTPS   # ← app doesn't serve HTTPS
```

```yaml
# Fixed
readinessProbe:
  httpGet:
    path: /health
    port: 8080
    scheme: HTTP
```

The Right Probe Configuration
A solid production readiness probe setup:
```yaml
readinessProbe:
  httpGet:
    path: /health/ready
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 10
  timeoutSeconds: 5
  successThreshold: 1
  failureThreshold: 3
livenessProbe:
  httpGet:
    path: /health/live
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 15
  timeoutSeconds: 5
  failureThreshold: 3
startupProbe:
  httpGet:
    path: /health/live
    port: 8080
  failureThreshold: 20
  periodSeconds: 10
```

Have separate /health/ready (checks DB connections, dependencies) and /health/live (just checks if the process is alive) endpoints in your app.
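Once both endpoints exist, you can sanity-check them from inside the pod — paths and port here are taken from the example configuration above:

```bash
# -f makes curl exit non-zero on HTTP errors; -w prints the status code
kubectl exec -it <pod-name> -- sh -c '
  curl -fsS -o /dev/null -w "ready: %{http_code}\n" http://localhost:8080/health/ready
  curl -fsS -o /dev/null -w "live:  %{http_code}\n" http://localhost:8080/health/live
'
```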
Quick Diagnostic Checklist
```bash
# 1. Check pod status and events
kubectl describe pod <pod-name> | grep -A 20 "Events:"

# 2. Check if the app is listening on the right port
kubectl exec -it <pod-name> -- ss -tlnp

# 3. Manually hit the health endpoint
kubectl exec -it <pod-name> -- curl -sv http://localhost:<port><path>

# 4. Check Service endpoints
kubectl get endpoints <service-name>

# 5. Check resource usage
kubectl top pod <pod-name>
```

Learn More
If you want to master Kubernetes troubleshooting end to end, the Certified Kubernetes Administrator (CKA) course on Udemy covers probe debugging in depth with hands-on labs. Pair it with Kubernetes in Action by Marko Lukša for the theory behind how probes work.
Readiness probes are a 2-minute fix once you know where to look. The key is always running `kubectl describe pod` first — Kubernetes tells you exactly what failed.