Kubernetes DNS Lookup Failing Inside Pods — Fix Guide (2026)
Pod can't resolve service names or external domains? DNS failures inside Kubernetes pods are usually caused by CoreDNS issues, ndots configuration, search domains, or network policies. Here's how to debug and fix each.
DNS failures inside pods are one of the most frustrating Kubernetes issues because the pod is running fine but can't connect to anything. Here's the systematic fix.
Quick Diagnosis
First, confirm it's a DNS issue:
# Run a debug pod
kubectl run dns-test --image=busybox:1.28 --restart=Never -- sleep 3600
# Test DNS resolution
kubectl exec dns-test -- nslookup kubernetes.default
kubectl exec dns-test -- nslookup my-service.my-namespace.svc.cluster.local
kubectl exec dns-test -- nslookup google.com
# Check which DNS server the pod is using
kubectl exec dns-test -- cat /etc/resolv.conf
Expected /etc/resolv.conf:
nameserver 10.96.0.10 # kube-dns Service ClusterIP (varies by cluster)
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
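One quick cross-check: the nameserver in the pod should be the ClusterIP of the kube-dns Service (CoreDNS keeps the kube-dns Service name for backwards compatibility):
# Compare this to the nameserver line above
kubectl get svc kube-dns -n kube-system -o jsonpath='{.spec.clusterIP}'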
If nameserver is wrong, or if nslookup hangs/fails, keep reading.
Fix 1: CoreDNS Is Down or Crashing
# Check CoreDNS pods
kubectl get pods -n kube-system -l k8s-app=kube-dns
# Check CoreDNS logs
kubectl logs -n kube-system -l k8s-app=kube-dns --tail=50
# Restart CoreDNS if it's crashing
kubectl rollout restart deployment/coredns -n kube-system
Common CoreDNS error: "no route to host". This usually means the CoreDNS pods are unhealthy. Check node resources: if nodes are under memory pressure, CoreDNS gets evicted.
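A quick way to spot memory pressure across all nodes (the exact describe output varies slightly by Kubernetes version):
# Print each node name with its MemoryPressure condition
kubectl describe nodes | grep -E "^Name:|MemoryPressure"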
# Check CoreDNS resource usage
kubectl top pods -n kube-system -l k8s-app=kube-dns
# Check if CoreDNS has enough resources
kubectl describe deployment coredns -n kube-system | grep -A10 "Resources:"
If CoreDNS is OOMKilled, increase its memory limit:
kubectl edit deployment coredns -n kube-system
# Increase memory limit from 170Mi to 300Mi or more
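If you prefer a non-interactive change, a JSON patch works too. This is a sketch that assumes coredns is the first container in the Deployment's pod spec, which it is in a stock install:
# Bump the CoreDNS memory limit without opening an editor
kubectl -n kube-system patch deployment coredns --type=json \
  -p='[{"op":"replace","path":"/spec/template/spec/containers/0/resources/limits/memory","value":"300Mi"}]'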
Fix 2: Network Policy Blocking DNS
If you have NetworkPolicies, they might be blocking UDP/TCP port 53 to CoreDNS.
# Check if NetworkPolicies exist in your namespace
kubectl get networkpolicy -n your-namespace
Add a DNS egress rule:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
  namespace: your-namespace
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    # Allow DNS
    - ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
    # Allow communication to the kube-dns service
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
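After applying the policy, verify DNS recovers from a throwaway pod in the same namespace (np-test is just an illustrative name):
# Re-test DNS now that egress on port 53 is allowed
kubectl run np-test --image=busybox:1.28 --restart=Never -n your-namespace -- sleep 3600
kubectl exec -n your-namespace np-test -- nslookup kubernetes.default
kubectl delete pod np-test -n your-namespace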
Fix 3: ndots Setting Causing Slow DNS
options ndots:5 means any name with fewer than 5 dots is tried against each search domain before being queried as-is. For an external name like google.com, that's three wasted lookups with the default search path before the real query, multiplying latency on every external request.
Symptom: Internal service DNS works fine, external DNS is very slow.
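To see where the time goes, trace what the resolver actually attempts. With the default search path shown earlier, a lookup of google.com from the default namespace expands roughly like this:
# Queries attempted for "google.com" with ndots:5 (1 dot < 5 dots):
#   google.com.default.svc.cluster.local  -> NXDOMAIN
#   google.com.svc.cluster.local          -> NXDOMAIN
#   google.com.cluster.local              -> NXDOMAIN
#   google.com                            -> answer, at last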
Fix — set ndots per pod:
spec:
  dnsConfig:
    options:
      - name: ndots
        value: "2"   # reduce from 5 to 2
      - name: single-request-reopen
There is no global ndots setting in the CoreDNS ConfigMap (ndots is a client-side resolver option), but you can mitigate the search-path overhead cluster-wide in the Corefile:
kubectl edit configmap coredns -n kube-system
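The usual cluster-wide mitigation is the CoreDNS autopath plugin, which answers search-path expansions server-side so clients skip the wasted round trips. A minimal sketch of the relevant Corefile pieces; note that autopath requires pods verified, which makes CoreDNS watch all pods and use noticeably more memory:
.:53 {
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods verified            # required by autopath (replaces pods insecure)
        fallthrough in-addr.arpa ip6.arpa
    }
    autopath @kubernetes         # short-circuit search-path expansions
    forward . /etc/resolv.conf
    cache 30
}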
Fix 4: Wrong Service Name Format
Kubernetes has a specific DNS format for services. If you're using the wrong format, DNS will fail.
# Within the same namespace: just the service name
curl http://my-service
# Cross-namespace: service.namespace
curl http://my-service.other-namespace
# Full FQDN (works from anywhere)
curl http://my-service.my-namespace.svc.cluster.local
# ExternalName service (maps to external DNS)
curl http://my-external-service # resolves to external.example.com
# Verify the service exists and has the right name
kubectl get svc -n my-namespace
kubectl get endpoints -n my-namespace my-service
# If endpoints are empty, the service selector doesn't match the pods
kubectl describe svc my-service -n my-namespace | grep Selector
kubectl get pods -n my-namespace --show-labels
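For reference, the selector wiring looks like this. A minimal sketch with illustrative names; the Service's spec.selector must match the pod's labels exactly:
apiVersion: v1
kind: Service
metadata:
  name: my-service
  namespace: my-namespace
spec:
  selector:
    app: my-app        # must equal the pod label below
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: v1
kind: Pod
metadata:
  name: my-app-pod
  namespace: my-namespace
  labels:
    app: my-app        # matched by the Service selector above
spec:
  containers:
    - name: app
      image: nginx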
Fix 5: DNS Caching Issues (ndots + search domain loops)
# Check if CoreDNS has too many cache misses
kubectl exec -n kube-system <coredns-pod> -- \
  wget -qO- localhost:9153/metrics | grep coredns_cache
If you see a high coredns_cache_misses_total, CoreDNS is being hammered. Add a node-local DNS cache:
# NodeLocal DNSCache reduces DNS load significantly; download the manifest first,
# because it contains __PILLAR__ placeholders that must be substituted before applying
wget https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/dns/nodelocaldns/nodelocaldns.yaml
NodeLocal DNSCache runs on every node and caches DNS responses locally, reducing load on CoreDNS by 80-90%.
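A sketch of that substitution for kube-proxy in iptables mode, assuming the conventional link-local cache address 169.254.20.10 and a cluster.local domain; adjust both for your cluster:
# Fill in the __PILLAR__ placeholders, then apply
kubedns=$(kubectl get svc kube-dns -n kube-system -o jsonpath='{.spec.clusterIP}')
domain=cluster.local
localdns=169.254.20.10
sed -i "s/__PILLAR__LOCAL__DNS__/$localdns/g; s/__PILLAR__DNS__DOMAIN__/$domain/g; s/__PILLAR__DNS__SERVER__/$kubedns/g" nodelocaldns.yaml
kubectl apply -f nodelocaldns.yaml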
Fix 6: CoreDNS ConfigMap Misconfiguration
kubectl get configmap coredns -n kube-system -o yaml
A correct Corefile looks like this:
.:53 {
    errors
    health {
        lameduck 5s
    }
    ready
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
        ttl 30
    }
    prometheus :9153
    forward . /etc/resolv.conf {
        max_concurrent 1000
    }
    cache 30
    loop
    reload
    loadbalance
}
If forward points to a broken upstream DNS server, external resolution will fail. Check that /etc/resolv.conf on the nodes lists working DNS servers.
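To rule the upstream in or out, query it directly from a node (or any hostNetwork pod); 8.8.8.8 below is a stand-in for whatever the node's resolv.conf actually lists:
# Run on the node itself
cat /etc/resolv.conf
nslookup google.com 8.8.8.8   # replace with the node's real upstream IP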
Full DNS Debug Script
#!/bin/bash
# Save as dns-debug.sh
NAMESPACE=${1:-default}
echo "=== CoreDNS Status ==="
kubectl get pods -n kube-system -l k8s-app=kube-dns
echo -e "\n=== CoreDNS Logs (last 20 lines) ==="
kubectl logs -n kube-system -l k8s-app=kube-dns --tail=20
echo -e "\n=== DNS Test from debug pod ==="
kubectl run dns-debug-$$ --image=busybox:1.28 --restart=Never \
  -n "$NAMESPACE" -- sh -c "
    echo '--- /etc/resolv.conf ---'
    cat /etc/resolv.conf
    echo '--- Internal DNS ---'
    nslookup kubernetes.default
    echo '--- External DNS ---'
    nslookup google.com
  " 2>/dev/null
sleep 5
kubectl logs dns-debug-$$ -n "$NAMESPACE"
kubectl delete pod dns-debug-$$ -n "$NAMESPACE" --ignore-not-found
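Usage: the script takes an optional namespace argument and defaults to default:
chmod +x dns-debug.sh
./dns-debug.sh                # test the default namespace
./dns-debug.sh my-namespace   # test a specific namespace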
Quick summary:
- CoreDNS crashing → restart it, check resources
- NetworkPolicy → allow egress on port 53
- Slow external DNS → reduce ndots from 5 to 2
- Wrong service name → use service.namespace.svc.cluster.local
- High DNS load → deploy NodeLocal DNSCache