Datadog Agent Not Sending Metrics — Diagnosis and Fix Guide
Your Datadog dashboards show "No Data", hosts appear offline even though the agent is running, or custom metrics never arrive. Here's how to systematically diagnose and fix Datadog agent issues on Kubernetes and VMs, and find the actual problem fast.
Start With Agent Status
The agent status command shows everything that's working and broken:
On Kubernetes:
# Get the agent pod name
kubectl get pods -n datadog -l app=datadog
# Run status check
kubectl exec -n datadog <datadog-agent-pod> -- agent status
# Short version (health summary only)
kubectl exec -n datadog <datadog-agent-pod> -- agent health
On Linux (VM/bare metal):
sudo datadog-agent status
# Or the older format:
sudo service datadog-agent status
Look for:
- API Keys status: API key ending with XXXX: API key valid ✅
- Collector: Running ✅
- Any section showing ERRORS or CRITICAL ❌
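When you run this across many nodes, scanning the full status output by eye gets tedious. Here is a minimal sketch of a filter for the markers listed above. `summarize_status` is a hypothetical helper name, and it assumes the status text contains the `API key valid` / `API key invalid` strings quoted in this guide. The problem-line count is a rough heuristic, not an official health signal.

```shell
#!/bin/sh
# Summarize `agent status` output read from stdin.
# Assumption: the text contains "API key valid" or "API key invalid".
summarize_status() {
    awk '
        /API key invalid/ { key = "invalid" }
        /API key valid/   { if (key == "") key = "valid" }
        /ERROR|CRITICAL/  { problems++ }
        END {
            if (key == "") key = "unknown"
            printf "api_key=%s problem_lines=%d\n", key, problems
        }
    '
}
```

Pipe the status output into it, for example `kubectl exec -n datadog <datadog-agent-pod> -- agent status | summarize_status`.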
Fix 1 — API Key Invalid or Missing
The most common cause of no data.
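Before touching the secret, it can be worth ruling out copy-paste damage such as a stray quote, a space, or a truncated value. A minimal sketch, assuming API keys are 32 hexadecimal characters (treat that length as an assumption and adjust it if your keys differ). `looks_like_api_key` is a hypothetical helper name.

```shell
#!/bin/sh
# Return 0 if the argument has the shape of an API key, 1 otherwise.
# Assumption: a key is exactly 32 hexadecimal characters.
looks_like_api_key() {
    case "$1" in
        *[!0-9a-fA-F]*) return 1 ;;   # contains a non-hex character (space, quote, ...)
    esac
    [ "${#1}" -eq 32 ]
}

# Usage (not run here):
#   looks_like_api_key "$(grep '^api_key:' /etc/datadog-agent/datadog.yaml | cut -d' ' -f2)" \
#     && echo "format OK" || echo "key looks malformed"
```

A key that passes this check can still be revoked or belong to another account, so the agent's own validation result remains the final word.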
# Check agent status for API key validation
kubectl exec -n datadog <pod> -- agent status | grep -A3 "API Keys"
# If invalid:
# API key ending with XXXX: API key invalid
Fix on Kubernetes:
# Verify the secret exists and has the right key
kubectl get secret datadog-secret -n datadog -o yaml
# The secret should have key 'api-key'
kubectl describe secret datadog-secret -n datadog
# Re-create if wrong
kubectl create secret generic datadog-secret \
--from-literal api-key=<YOUR_ACTUAL_API_KEY> \
-n datadog \
--dry-run=client -o yaml | kubectl apply -f -
# Restart the agent pods
kubectl rollout restart daemonset/datadog -n datadog
Fix on Linux:
grep api_key /etc/datadog-agent/datadog.yaml
# If wrong, update it
sudo sed -i 's/api_key: .*/api_key: YOUR_CORRECT_KEY/' /etc/datadog-agent/datadog.yaml
sudo systemctl restart datadog-agent
Fix 2 — Agent Can't Reach Datadog Servers
Network connectivity issues prevent metrics from being sent.
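The validation endpoint lives under your account's Datadog site, so a key from an EU account is rejected by `api.datadoghq.com` even when the network path is fine. Here is a small sketch that builds the URL from the site value. `dd_validate_url` is a hypothetical helper name, and it assumes the `api.<site>` host pattern.

```shell
#!/bin/sh
# Print the API key validation URL for a Datadog site.
# Assumption: the API host is "api." followed by the site name.
dd_validate_url() {
    site=${1:-datadoghq.com}   # default site when no argument is given
    printf 'https://api.%s/api/v1/validate\n' "$site"
}
```

For example, `dd_validate_url datadoghq.eu` gives the URL to use in the connectivity test when your account is on the EU site.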
# Test connectivity from agent pod
# (sh -c makes $DD_API_KEY expand inside the pod rather than in your local shell)
kubectl exec -n datadog <pod> -- sh -c \
  'curl -v https://api.datadoghq.com/api/v1/validate -H "DD-API-KEY: $DD_API_KEY"'
# Test DNS resolution
kubectl exec -n datadog <pod> -- nslookup api.datadoghq.com
# Check agent logs for connection errors
kubectl logs -n datadog <pod> | grep -E "error|failed|timeout|connection"
Common connectivity fixes:
# If using Datadog EU site, set the site in values.yaml:
datadog:
  site: datadoghq.eu # Default is datadoghq.com
# If behind a proxy:
datadog:
  env:
    - name: DD_PROXY_HTTPS
      value: "http://proxy.company.com:3128"
    - name: DD_PROXY_HTTP
      value: "http://proxy.company.com:3128"
Fix 3 — Agent Running but Metrics Missing in Dashboard
Agent appears online but specific metrics don't show up.
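A quick first pass is to list the check instances that the agent itself marks as failing. The sketch below assumes the `Instance ID: <name> [OK]` style lines (with `[ERROR]` or `[WARNING]` for problems) in the Collector section of `agent status`. The exact layout can vary between agent versions, and `failing_checks` is a hypothetical helper name.

```shell
#!/bin/sh
# Read `agent status` output on stdin and print check instances
# that are flagged [ERROR] or [WARNING].
# Assumption: each instance is reported as "Instance ID: <name> [STATE]".
failing_checks() {
    awk '/Instance ID:/ && /\[(ERROR|WARNING)\]/ {
        sub(/^[ \t]*Instance ID:[ \t]*/, "")   # keep only "<name> [STATE]"
        print
    }'
}
```

Any name it prints is a good candidate for running individually with `agent check <check-name>`.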
Run the suspect check once and inspect its output:
kubectl exec -n datadog <pod> -- agent check <check-name>
# Example:
kubectl exec -n datadog <pod> -- agent check kubelet
kubectl exec -n datadog <pod> -- agent check docker
For Kubernetes metrics specifically:
# kube-state-metrics must be deployed
kubectl get pods -n datadog | grep kube-state-metrics
kubectl get pods -A | grep kube-state-metrics
# If missing, deploy it (--reuse-values keeps the settings from your existing release):
helm upgrade datadog datadog/datadog -n datadog \
  --reuse-values \
  --set datadog.kubeStateMetricsEnabled=true
Fix 4 — Custom Metrics Not Appearing
If your application is sending metrics via DogStatsD but they don't show up:
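DogStatsD accepts plain-text datagrams of the form `<name>:<value>|<type>|#<tags>`, so it helps to build test packets the same way every time. A minimal sketch follows; `dogstatsd_datagram` is a hypothetical helper name, not part of any Datadog tooling.

```shell
#!/bin/sh
# Build a DogStatsD datagram.
#   $1 metric name, $2 value, $3 type (c = count, g = gauge, ...),
#   $4 optional comma-separated tags such as "env:test,team:web"
dogstatsd_datagram() {
    if [ -n "${4:-}" ]; then
        printf '%s:%s|%s|#%s' "$1" "$2" "$3" "$4"
    else
        printf '%s:%s|%s' "$1" "$2" "$3"
    fi
}

# Usage (not run here), assuming nc is available and DD_AGENT_HOST is set:
#   dogstatsd_datagram test.metric 1 c env:test | nc -u -w1 "$DD_AGENT_HOST" 8125
```

Because UDP gives no delivery feedback, confirm receipt on the agent side by checking whether the packet counters in the DogStatsD section of `agent status` increase.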
# Verify DogStatsD is enabled
kubectl exec -n datadog <pod> -- agent status | grep -A5 "DogStatsD"
# Test DogStatsD from your app pod (the pipe has to run inside the pod, so wrap it in sh -c)
kubectl exec -n <app-namespace> <app-pod> -- \
  sh -c 'echo "test.metric:1|c|#env:test" | nc -u -w1 <datadog-agent-service-ip> 8125'
Common DogStatsD configuration issues:
# In datadog helm values — ensure DogStatsD is configured correctly
datadog:
  dogstatsd:
    useHostPort: true # Exposes 8125/UDP on each node so app pods can reach the agent at the node IP
    nonLocalTraffic: true
# In your app deployment — point to Datadog agent
env:
  - name: DD_AGENT_HOST
    valueFrom:
      fieldRef:
        fieldPath: status.hostIP # Use node IP, not cluster IP
  - name: DD_DOGSTATSD_PORT
    value: "8125"
Fix 5 — Cluster Agent Not Connecting
For Kubernetes, the Cluster Agent collects cluster-level metrics. If it's not running, you lose cluster metrics.
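The node agents and the Cluster Agent authenticate to each other with a shared token, and a mismatched or too-short value can be the reason the connection fails. Here is a sketch for generating and checking one, assuming the 32-character minimum that Datadog's documentation describes. Both function names are hypothetical.

```shell
#!/bin/sh
# Generate a random token: 32 random bytes rendered as 64 hex characters.
new_cluster_agent_token() {
    head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
    echo
}

# Return 0 if the token meets the assumed 32-character minimum.
token_long_enough() {
    [ "${#1}" -ge 32 ]
}
```

Whatever value you use, the same token has to be supplied to both the Cluster Agent and the node agents.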
# Check cluster agent status
kubectl get pods -n datadog | grep cluster-agent
kubectl logs -n datadog <cluster-agent-pod>
# Check the agent DaemonSet can reach the cluster agent
kubectl exec -n datadog <daemonset-pod> -- agent status | grep -A5 "Cluster Agent"
Fix communication between agent and cluster agent:
# In helm values, ensure cluster agent is enabled
clusterAgent:
  enabled: true
  token: <same-token-as-node-agent>
datadog:
  clusterAgent:
    enabled: true
Fix 6 — Logs Collection Not Working
# Check if logs are collected
kubectl exec -n datadog <pod> -- agent status | grep -A10 "Logs Agent"
# Look for errors
kubectl logs -n datadog <pod> | grep "logs"
Enable log collection:
datadog:
  logs:
    enabled: true
    containerCollectAll: true # Collect all container logs
# Or use pod annotations for specific pods:
# ad.datadoghq.com/app.logs: '[{"source":"python","service":"my-app"}]'
Quick Diagnosis Flowchart
Agent running? (kubectl get pods -n datadog)
│
No → Check DaemonSet tolerations, node selector
│
Yes → agent status
│
API key invalid? → Fix secret, restart agent
│
API key valid, no metrics → Check connectivity (curl api.datadoghq.com)
│
Connectivity OK → Check specific check (agent check <name>)
│
Check failing → Missing dependency (kube-state-metrics, etc.)
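The flowchart above can be sketched as a small decision function for a runbook or triage script. It only encodes the order of the questions; you still answer each one yourself with the commands from this guide. `next_step` is a hypothetical helper name.

```shell
#!/bin/sh
# Suggest the next troubleshooting step.
# Arguments, each "yes" or "no", in flowchart order:
#   $1 agent pod running, $2 API key valid,
#   $3 connectivity OK,   $4 specific check passing
next_step() {
    if [ "$1" != yes ]; then
        echo "Check DaemonSet tolerations and node selector"
    elif [ "$2" != yes ]; then
        echo "Fix the API key secret and restart the agent"
    elif [ "$3" != yes ]; then
        echo "Check connectivity to the Datadog site"
    elif [ "$4" != yes ]; then
        echo "Look for a missing dependency such as kube-state-metrics"
    else
        echo "No problem found by these checks"
    fi
}
```

For example, `next_step yes yes no no` points at connectivity, because a later question is only worth asking once the earlier ones pass.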
Useful Datadog Agent Commands
# Full status (most useful)
kubectl exec -n datadog <pod> -- agent status
# Test a specific integration check
kubectl exec -n datadog <pod> -- agent check nginx
# Collect logs and configuration into a flare for Datadog Support
kubectl exec -n datadog <pod> -- agent flare
# Show configured checks
kubectl exec -n datadog <pod> -- agent configcheck
# Diagnose connectivity
kubectl exec -n datadog <pod> -- agent diagnose