Nginx Ingress 'upstream connect error' Fix in Kubernetes
Debug and fix Nginx Ingress 502/504 errors caused by 'upstream connect error or disconnect/reset before headers' in Kubernetes. Step-by-step with real commands.
You deployed your app, set up Nginx Ingress, and then your users hit a wall: 502 Bad Gateway or 504 Gateway Timeout with the cryptic message upstream connect error or disconnect/reset before headers. Here is exactly how to debug and fix it.
What This Error Actually Means
Nginx Ingress is a reverse proxy. When it shows this error, Nginx tried to forward the request to a backend pod, but one of the following happened:
- The pod was not ready to accept connections
- The pod closed the connection before responding
- The service was pointing to the wrong port
- A health check was failing and the pod was being killed
- Nginx hit a timeout waiting for the backend
The message reset before headers means TCP connected but the app never sent an HTTP response.
Step 1: Check the Ingress and Service Mapping
kubectl describe ingress my-app-ingress -n production
Look at the Rules section. Verify the service name and port match exactly. A common mistake is using the container port in the Ingress instead of the Service port.
kubectl get svc my-app-service -n production -o yaml
The port field is what Ingress should reference. The targetPort is what the pod actually listens on.
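The port comparison described above can be scripted. This is a minimal sketch rather than part of the original walkthrough: `ingress_port_in_service` is a hypothetical helper name, and the commented wiring assumes the `my-app-ingress` and `my-app-service` names used in this article, a single rule and path, and an Ingress that references the port by number rather than by name.

```shell
# Hypothetical helper: succeeds when the port referenced by the Ingress
# is one of the ports exposed by the Service.
#   $1 = port number from the Ingress backend
#   $2 = space-separated list of Service ports
ingress_port_in_service() {
  for svc_port in $2; do
    if [ "$svc_port" = "$1" ]; then
      return 0
    fi
  done
  return 1
}

# Example wiring (requires cluster access, so it is left commented out):
# ing_port=$(kubectl get ingress my-app-ingress -n production \
#   -o jsonpath='{.spec.rules[0].http.paths[0].backend.service.port.number}')
# svc_ports=$(kubectl get svc my-app-service -n production \
#   -o jsonpath='{.spec.ports[*].port}')
# ingress_port_in_service "$ing_port" "$svc_ports" \
#   || echo "Ingress port $ing_port is not exposed by the Service"
```

If the Ingress references the Service port by name, the first jsonpath query returns an empty string and the check fails, so compare port names instead in that case.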
Step 2: Check Endpoints — Is Traffic Actually Reaching Pods?
kubectl get endpoints my-app-service -n production
Expected output:
NAME ENDPOINTS AGE
my-app-service 10.0.1.5:8080,10.0.1.6:8080 12m
If you see <none> — your Service has no healthy pods. The pod selector in the Service does not match pod labels, or pods are not passing readiness checks.
kubectl get pods -n production -l app=my-app --show-labels
kubectl describe pod <pod-name> -n production | grep -A 10 "Readiness"
Fix the label mismatch or fix the readiness probe.
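The endpoint check in this step can be turned into a pass/fail test for scripts or CI. The following is a rough sketch: `has_endpoints` is a hypothetical helper name, and it assumes the default tabular output of `kubectl get endpoints` shown above, where the second column is ENDPOINTS.

```shell
# Hypothetical helper: reads `kubectl get endpoints <service>` output on stdin
# and fails (exit 1) if the ENDPOINTS column of any data row is <none>.
has_endpoints() {
  awk 'NR > 1 && $2 == "<none>" { empty = 1 } END { exit empty + 0 }'
}

# Example wiring (requires cluster access, so it is left commented out):
# kubectl get endpoints my-app-service -n production | has_endpoints \
#   || echo "Service has no ready pods: check selector labels and readiness"
```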
Step 3: Check Nginx Ingress Controller Logs
kubectl logs -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx --tail=100 | grep -i "error\|upstream\|connect"
Real log lines you might see:
2026/06/28 10:23:45 [error] 1234#1234: *5678 connect() failed (111: Connection refused) while connecting to upstream
2026/06/28 10:23:46 [warn] upstream server temporarily disabled while reading response header from upstream
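Connection refused (111) means nothing is listening on the pod IP and port that Nginx dialed: usually the Service targetPort is wrong, or the app is bound to 127.0.0.1 instead of 0.0.0.0. temporarily disabled means Nginx has marked the upstream as failed after repeated errors and is skipping it for a while.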
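When the controller log is long, it can help to bucket lines by failure type before reading them. This is a minimal sketch: `classify_upstream_error` is a hypothetical helper name, the first two patterns come from the log lines above, and the "timed out" pattern is an assumption about how Nginx words upstream timeouts in your version.

```shell
# Hypothetical helper: reads controller log lines on stdin and prints one
# coarse label per line.
classify_upstream_error() {
  while IFS= read -r line; do
    case "$line" in
      *"Connection refused"*)   echo "refused" ;;   # nothing listening on IP:port
      *"temporarily disabled"*) echo "disabled" ;;  # upstream marked as failed
      *"timed out"*)            echo "timeout" ;;   # assumed timeout wording
      *)                        echo "other" ;;
    esac
  done
}

# Example wiring (requires cluster access, so it is left commented out):
# kubectl logs -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx --tail=500 \
#   | classify_upstream_error | sort | uniq -c
```

Piping the labels through `sort | uniq -c` gives a count per category, which shows at a glance whether you are chasing a port problem or a timeout problem.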
Step 4: Fix Connection Timeouts with Annotations
If the backend is slow (cold start, heavy computation), Nginx times out before the app responds. Add these annotations to your Ingress:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app-ingress
  namespace: production
  annotations:
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "60"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "120"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "120"
    nginx.ingress.kubernetes.io/proxy-body-size: "10m"
spec:
  ingressClassName: nginx
  rules:
    - host: api.myapp.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app-service
                port:
                  number: 8080
Default timeout is 60s for read. If your backend does DB queries or heavy processing, bump proxy-read-timeout to 120–300s.
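To try a timeout value on a live Ingress before committing it to the manifest, the annotations can also be set from the command line. This is a sketch under assumptions: `set_ingress_timeouts` is a hypothetical wrapper name, and it applies one value to both the read and send timeouts.

```shell
# Hypothetical helper: sets read and send timeouts on an existing Ingress.
#   $1 = Ingress name, $2 = namespace, $3 = timeout in seconds
set_ingress_timeouts() {
  kubectl annotate ingress "$1" -n "$2" --overwrite \
    "nginx.ingress.kubernetes.io/proxy-read-timeout=$3" \
    "nginx.ingress.kubernetes.io/proxy-send-timeout=$3"
}

# Example call (requires cluster access, so it is left commented out):
# set_ingress_timeouts my-app-ingress production 120
```

Annotations set this way are overwritten the next time the manifest is applied, so copy a value that works back into the YAML.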
Step 5: Fix Keepalive / Connection Reset Issues
If you see reset before headers specifically on the second request to the same upstream, it is a keepalive mismatch. The upstream closed a reused connection while Nginx was still sending a request.
Upstream keepalive is controlled through the ingress-nginx controller's ConfigMap rather than through Ingress annotations. Set these keys under data: in that ConfigMap, and keep the keepalive timeout below your backend's own idle timeout so that Nginx closes idle connections first:
upstream-keepalive-connections: "32"
upstream-keepalive-timeout: "60"
Or if the backend does not support keepalive at all (some older Go/Python apps), disable it:
upstream-keepalive-connections: "0"
Step 6: Fix Session Affinity / Upstream Hashing
If you have a stateful app where a user must always hit the same pod (websockets, session store), and pods are load-balanced randomly, you will see intermittent resets. Fix with upstream hashing:
nginx.ingress.kubernetes.io/upstream-hash-by: "$remote_addr"
Or use cookie-based affinity:
nginx.ingress.kubernetes.io/affinity: "cookie"
nginx.ingress.kubernetes.io/session-cookie-name: "INGRESSCOOKIE"
nginx.ingress.kubernetes.io/session-cookie-expires: "172800"
nginx.ingress.kubernetes.io/session-cookie-max-age: "172800"
Step 7: Verify Pod Readiness Probe Is Correct
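A pod that fails its readiness probe is removed from the Service endpoints. If every pod fails, Nginx is left with an empty upstream pool and returns an error (ingress-nginx typically answers 503 in this case rather than 502).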
kubectl describe pod <pod-name> -n production | grep -A 15 "Readiness:"
If the readiness probe is hitting the wrong path or port, fix it:
readinessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 5
  failureThreshold: 3
Set initialDelaySeconds high enough for your app to fully start. A Spring Boot app can take 30–60 seconds.
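The probe settings also determine how long an unhealthy pod keeps receiving traffic. As a rough sketch (it ignores the probe's own timeoutSeconds and scheduling jitter), the detection window is about periodSeconds multiplied by failureThreshold. The helper name below is hypothetical.

```shell
# Hypothetical helper: approximate seconds before a pod that stops
# answering its readiness probe is removed from the Service endpoints.
#   $1 = periodSeconds, $2 = failureThreshold
readiness_detection_window() {
  echo $(( $1 * $2 ))
}

readiness_detection_window 5 3   # values from the probe above; prints 15
```

If intermittent 502s line up with pod restarts or deploys, a window of this size is worth comparing against how long those errors last.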
Quick Diagnostic Checklist
Run these in order to isolate the cause in under 5 minutes:
# 1. Check endpoints exist
kubectl get endpoints my-app-service -n production
# 2. Test pod directly (bypass ingress)
kubectl port-forward pod/<pod-name> 9090:8080 -n production
curl http://localhost:9090/healthz
# 3. Check ingress controller logs
kubectl logs -n ingress-nginx deploy/ingress-nginx-controller --tail=50
# 4. Check ingress events
kubectl describe ingress my-app-ingress -n production | grep -A 5 "Events"
If port-forward works but Ingress does not, the issue is in the Ingress config or Nginx configuration — not your app. If port-forward also fails, the bug is inside the application itself.
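The decision rule above can be captured in a small function that compares two HTTP status codes. This is a minimal sketch: `triage` is a hypothetical helper name, it treats any 2xx or 3xx status as success, and the commented wiring assumes the port-forward from step 2 of the checklist is running and that /healthz is also reachable through the api.myapp.com host.

```shell
# Hypothetical helper: interprets two HTTP status codes.
#   $1 = status when calling the pod directly (via port-forward)
#   $2 = status when calling through the Ingress
triage() {
  case "$1" in
    2??|3??) direct_ok=1 ;;
    *)       direct_ok=0 ;;
  esac
  case "$2" in
    2??|3??) ingress_ok=1 ;;
    *)       ingress_ok=0 ;;
  esac
  if [ "$direct_ok" -eq 0 ]; then
    echo "application"   # the pod itself is not answering
  elif [ "$ingress_ok" -eq 0 ]; then
    echo "ingress"       # pod is fine, the path through Nginx is not
  else
    echo "healthy"
  fi
}

# Example wiring (requires cluster access, so it is left commented out):
# direct=$(curl -s -o /dev/null -w '%{http_code}' http://localhost:9090/healthz)
# via_ingress=$(curl -s -o /dev/null -w '%{http_code}' http://api.myapp.com/healthz)
# triage "$direct" "$via_ingress"
```

curl reports 000 when it cannot connect at all, which this sketch counts as a failure.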
Summary
| Symptom | Root Cause | Fix |
|---|---|---|
| <none> endpoints | Label mismatch or readiness failing | Fix selectors or probe |
| Connection refused in logs | Wrong service port | Match service port in Ingress |
| Timeout on slow requests | Default 60s too short | Add proxy-read-timeout annotation |
| Reset on 2nd request | Keepalive mismatch | Set upstream-keepalive-connections |
| Intermittent 502 on stateful app | Load balancing across pods | Add upstream-hash-by or affinity cookie |
Most upstream connect error issues are either a wrong port or a failing readiness probe. Start with kubectl get endpoints — if that returns <none>, everything else is secondary.