
Nginx Ingress 'upstream connect error' Fix in Kubernetes

Debug and fix Nginx Ingress 502/504 errors caused by 'upstream connect error or disconnect/reset before headers' in Kubernetes. Step-by-step with real commands.

DevOpsBoys

You deployed your app, set up Nginx Ingress, and then your users hit a wall: 502 Bad Gateway or 504 Gateway Timeout with the cryptic message upstream connect error or disconnect/reset before headers. Here is exactly how to debug and fix it.

What This Error Actually Means

Nginx Ingress is a reverse proxy. This error means Nginx could not get a complete HTTP response from the backend pod. The usual causes are:

  • The pod was not ready to accept connections
  • The pod closed the connection before responding
  • The Service was pointing to the wrong port
  • A liveness probe was failing and the container was being restarted
  • Nginx hit a timeout waiting for the backend

The phrase reset before headers means the connection was opened but then closed before the app sent any HTTP response headers.
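As a rough first triage, the status code alone already separates two of the causes above. The helper below is a hypothetical sketch (the function name and hint strings are illustrative, not part of any tool), and the mapping is a rule of thumb rather than a guarantee:

```bash
# Map the HTTP status seen through the Ingress to the most likely failure class.
# Hypothetical helper; the hints are rules of thumb only.
classify_status() {
  case "$1" in
    502) echo "connection refused or reset by the backend" ;;
    504) echo "backend too slow: proxy timeout reached" ;;
    *)   echo "not an upstream proxy error" ;;
  esac
}

# Example against a live cluster (host name taken from the Ingress used in this article):
#   classify_status "$(curl -s -o /dev/null -w '%{http_code}' https://api.myapp.com/)"
```

A 502 points toward Steps 1 to 3 and 5; a 504 points toward Step 4.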

Step 1: Check the Ingress and Service Mapping

bash
kubectl describe ingress my-app-ingress -n production

Look at the Rules section. Verify the service name and port match exactly. A common mistake is using the container port in the Ingress instead of the Service port.

bash
kubectl get svc my-app-service -n production -o yaml

The port field is what Ingress should reference. The targetPort is what the pod actually listens on.
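The comparison can be scripted. This is a minimal sketch, assuming the Ingress references the Service port by number and that the first rule and first path are the ones of interest; `port_in_list` and `check_ingress_port` are hypothetical helper names:

```bash
# Succeed if the port number ($1) appears in a space-separated list of ports ($2).
port_in_list() {
  for p in $2; do
    [ "$p" = "$1" ] && return 0
  done
  return 1
}

# $1 = Ingress name, $2 = Service name, $3 = namespace
# Assumption: only the first rule/path of the Ingress is inspected.
check_ingress_port() {
  ingress_port=$(kubectl get ingress "$1" -n "$3" \
    -o jsonpath='{.spec.rules[0].http.paths[0].backend.service.port.number}')
  service_ports=$(kubectl get svc "$2" -n "$3" -o jsonpath='{.spec.ports[*].port}')
  if port_in_list "$ingress_port" "$service_ports"; then
    echo "OK: Ingress port $ingress_port is a Service port"
  else
    echo "MISMATCH: Ingress uses '$ingress_port', Service exposes: $service_ports"
  fi
}

# Usage:
#   check_ingress_port my-app-ingress my-app-service production
```

If the Ingress refers to the port by name instead of number, the first jsonpath query returns an empty string and the function reports a mismatch, so check the name by hand in that case.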

Step 2: Check Endpoints — Is Traffic Actually Reaching Pods?

bash
kubectl get endpoints my-app-service -n production

Expected output:

NAME             ENDPOINTS                         AGE
my-app-service   10.0.1.5:8080,10.0.1.6:8080      12m

If you see <none>, your Service has no ready endpoints. Either the pod selector in the Service does not match the pod labels, or the pods are not passing their readiness checks.

bash
kubectl get pods -n production -l app=my-app --show-labels
kubectl describe pod <pod-name> -n production | grep -A 10 "Readiness"

Fix the label mismatch or fix the readiness probe.
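To spot a label mismatch without eyeballing long label strings, something like the following can help. It is a sketch: `selector_matches` is a hypothetical helper, and it expects both arguments in the comma-separated key=value form that --show-labels prints:

```bash
# Succeed only if every key=value pair of the selector ($1) is present in the pod labels ($2).
# Prints the first missing pair otherwise.
selector_matches() {
  old_ifs=$IFS
  IFS=,
  for pair in $1; do
    case ",$2," in
      *",$pair,"*) ;;   # this selector pair is present on the pod
      *) IFS=$old_ifs; echo "missing label: $pair"; return 1 ;;
    esac
  done
  IFS=$old_ifs
  return 0
}

# Usage (LABELS is the last column of the --show-labels output):
#   labels=$(kubectl get pod <pod-name> -n production --no-headers --show-labels | awk '{print $NF}')
#   selector_matches "app=my-app" "$labels"
```

The selector string is typed by hand here; read it from the selector block in the kubectl get svc output from Step 1.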

Step 3: Check Nginx Ingress Controller Logs

bash
kubectl logs -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx --tail=100 | grep -i "error\|upstream\|connect"

Real log lines you might see:

2026/06/28 10:23:45 [error] 1234#1234: *5678 connect() failed (111: Connection refused) while connecting to upstream
2026/06/28 10:23:46 [warn] upstream server temporarily disabled while reading response header from upstream

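Connection refused (111) means nothing is listening on the pod IP and port that Nginx dialed: usually the Service targetPort is wrong, or the app has not started listening yet or is bound to localhost only. temporarily disabled means Nginx has stopped sending traffic to that upstream for a while after repeated failed requests.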
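When the log is noisy, a small filter can turn the raw lines into hints. This is an illustrative sketch: `classify_upstream_log` is a hypothetical helper, and it only recognises a handful of common Nginx upstream messages:

```bash
# Read controller log lines on stdin and print one hint per recognised upstream error.
classify_upstream_log() {
  while IFS= read -r line; do
    case "$line" in
      *"Connection refused"*)
        echo "refused: nothing is listening on the pod IP and port Nginx dialed" ;;
      *"upstream timed out"*)
        echo "timeout: backend did not answer within the proxy timeout" ;;
      *"Connection reset by peer"*)
        echo "reset: backend closed the connection mid-request" ;;
      *"temporarily disabled"*)
        echo "disabled: Nginx paused this upstream after repeated failures" ;;
    esac
  done
}

# Usage:
#   kubectl logs -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx --tail=100 \
#     | classify_upstream_log | sort | uniq -c
```

Piping through sort and uniq -c shows which failure class dominates.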

Step 4: Fix Connection Timeouts with Annotations

If the backend is slow (cold start, heavy computation), Nginx times out before the app responds. Add these annotations to your Ingress:

yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app-ingress
  namespace: production
  annotations:
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "60"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "120"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "120"
    nginx.ingress.kubernetes.io/proxy-body-size: "10m"
spec:
  ingressClassName: nginx
  rules:
    - host: api.myapp.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app-service
                port:
                  number: 8080

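In ingress-nginx the defaults are 60s for proxy-read-timeout and proxy-send-timeout, and 5s for proxy-connect-timeout. If your backend runs long DB queries or heavy processing, raise proxy-read-timeout to 120–300s. A request that hits one of these timeouts shows up as 504 rather than 502.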
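Before raising a timeout, it helps to measure how long the backend really takes. The sketch below compares a measured latency with a timeout value; `exceeds_timeout` is a hypothetical helper and `/slow-endpoint` is a placeholder path. Note that proxy-read-timeout applies to the gap between two reads rather than to the whole response, so total request time is only a conservative approximation:

```bash
# Succeed if the measured latency ($1, seconds, may be fractional) exceeds the timeout ($2).
exceeds_timeout() {
  awk -v t="$1" -v limit="$2" 'BEGIN { exit !(t + 0 > limit + 0) }'
}

# Usage, with a port-forward to the pod already running on local port 9090:
#   elapsed=$(curl -s -o /dev/null -w '%{time_total}' http://localhost:9090/slow-endpoint)
#   exceeds_timeout "$elapsed" 60 && echo "backend took ${elapsed}s: raise proxy-read-timeout"
```

Measuring against the pod directly keeps Nginx out of the picture, so the number reflects the application alone.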

Step 5: Fix Keepalive / Connection Reset Issues

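If you see reset before headers specifically on a later request to the same upstream, suspect a keepalive mismatch: the backend closed an idle connection that Nginx then tried to reuse for a new request.

In ingress-nginx, upstream keepalive is a controller-wide setting. It is configured with keys in the controller ConfigMap (typically ingress-nginx-controller in the ingress-nginx namespace), not with per-Ingress annotations. Keep the Nginx idle timeout below the backend's own keepalive timeout so that Nginx drops idle connections first:

yaml
# data section of the ingress-nginx controller ConfigMap
upstream-keepalive-connections: "32"
upstream-keepalive-timeout: "60"

Or if the backend does not support keepalive at all (some older Go/Python apps), disable it:

yaml
# data section of the ingress-nginx controller ConfigMap
upstream-keepalive-connections: "0"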
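Whichever way keepalive is configured, a common rule of thumb is that the proxy should give up idle connections before the backend does. A tiny sketch of that check, with `keepalive_order_ok` as a hypothetical helper and the backend value something you read from your app server's own settings:

```bash
# Succeed if the proxy's idle keepalive timeout ($1) is shorter than the backend's ($2).
# Both values are whole seconds.
keepalive_order_ok() {
  [ "$1" -lt "$2" ]
}

# Usage with illustrative numbers: Nginx at 60s, backend at 75s
#   keepalive_order_ok 60 75 && echo "ok" || echo "backend may close connections Nginx still reuses"
```

If the check fails, either lower the Nginx keepalive timeout or raise the backend's.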

Step 6: Fix Session Affinity / Upstream Hashing

If you have a stateful app where a user must always hit the same pod (websockets, in-memory session store), and requests are spread across pods (round-robin by default), you may see intermittent resets or lost sessions. Fix with upstream hashing:

yaml
nginx.ingress.kubernetes.io/upstream-hash-by: "$remote_addr"

Or use cookie-based affinity:

yaml
nginx.ingress.kubernetes.io/affinity: "cookie"
nginx.ingress.kubernetes.io/session-cookie-name: "INGRESSCOOKIE"
nginx.ingress.kubernetes.io/session-cookie-expires: "172800"
nginx.ingress.kubernetes.io/session-cookie-max-age: "172800"
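To confirm that cookie affinity took effect, look for the cookie in the response headers. A minimal sketch, where `has_affinity_cookie` is a hypothetical helper and the match is case-insensitive:

```bash
# Read HTTP response headers on stdin; succeed if a Set-Cookie header sets the named cookie ($1).
has_affinity_cookie() {
  grep -qi "^set-cookie: *$1="
}

# Usage:
#   curl -sI https://api.myapp.com/ | has_affinity_cookie INGRESSCOOKIE \
#     && echo "affinity cookie is set" || echo "no affinity cookie in the response"
```

Sending the cookie back on later requests (for example with curl's -b option) should then land on the same pod.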

Step 7: Verify Pod Readiness Probe Is Correct

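A pod that fails its readiness probe is removed from the Service endpoints. If every pod fails, the upstream pool is empty and the controller answers 503 Service Temporarily Unavailable; pods that flap in and out of readiness tend to produce intermittent 502s.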

bash
kubectl describe pod <pod-name> -n production | grep -A 15 "Readiness:"

If the readiness probe is hitting the wrong path or port, fix it:

yaml
readinessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 5
  failureThreshold: 3

Set initialDelaySeconds high enough for your app to fully start. A Spring Boot app can take 30–60 seconds.
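The probe numbers also determine how quickly a broken pod stops receiving traffic. As an approximation that ignores probe timeouts, a pod is marked unready after periodSeconds × failureThreshold seconds of continuous failure. `unready_after` is a hypothetical helper for that arithmetic:

```bash
# Approximate seconds of continuous failure before a pod is marked unready:
# periodSeconds ($1) multiplied by failureThreshold ($2).
unready_after() {
  echo $(( $1 * $2 ))
}

unready_after 5 3   # prints 15 for the probe shown above
```

During that window Nginx may still route requests to the failing pod, so a lower periodSeconds shortens the time users can see errors.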

Quick Diagnostic Checklist

Run these in order to isolate the cause in under 5 minutes:

bash
# 1. Check endpoints exist
kubectl get endpoints my-app-service -n production

# 2. Test pod directly (bypass Service and Ingress)
# port-forward blocks, so run it in the background and stop it afterwards
kubectl port-forward pod/<pod-name> 9090:8080 -n production &
PF_PID=$!
sleep 2   # give the tunnel a moment to come up
curl http://localhost:9090/healthz
kill "$PF_PID"

# 3. Check ingress controller logs
kubectl logs -n ingress-nginx deploy/ingress-nginx-controller --tail=50

# 4. Check ingress events
kubectl describe ingress my-app-ingress -n production | grep -A 5 "Events"

If port-forward works but Ingress does not, the issue is in the Service, the Ingress config, or the Nginx configuration, not in your app. If port-forward also fails, the bug is inside the application itself.
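That decision can be written down as a small function. This is only a sketch of the reasoning above: `locate_fault` is a hypothetical helper, and the two yes/no answers come from checks 1 and 2 of the checklist:

```bash
# Narrow down the faulty layer from two observations, each "yes" or "no":
#   $1 = does the Service list at least one endpoint?
#   $2 = does the pod answer over port-forward?
locate_fault() {
  if [ "$2" = "no" ]; then
    echo "application: the pod itself is not answering"
  elif [ "$1" = "no" ]; then
    echo "service: fix the selector or the readiness probe"
  else
    echo "ingress: check the backend port, annotations, and controller logs"
  fi
}

# Usage:
#   locate_fault yes yes
```

The application check comes first because nothing upstream of a pod that does not answer can be verified.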

Summary

| Symptom | Root Cause | Fix |
|---|---|---|
| <none> endpoints | Label mismatch or readiness failing | Fix selectors or probe |
| Connection refused in logs | Wrong service port | Match service port in Ingress |
| Timeout on slow requests | Default 60s too short | Add proxy-read-timeout annotation |
| Reset on 2nd request | Keepalive mismatch | Set upstream-keepalive-connections |
| Intermittent 502 on stateful app | Load balancing across pods | Add upstream-hash-by or affinity cookie |

Most upstream connect error issues are either a wrong port or a failing readiness probe. Start with kubectl get endpoints — if that returns <none>, everything else is secondary.
