Grafana Tempo vs Jaeger vs Zipkin — Distributed Tracing Comparison (2026)

Choosing a distributed tracing backend in 2026. Grafana Tempo, Jaeger, and Zipkin all store and query traces — but they differ significantly in storage, scalability, and cost.

DevOpsBoys · May 26, 2026 · 4 min read

Distributed tracing shows you what happens inside a request as it flows through multiple services. In 2026, three open-source options dominate: Grafana Tempo, Jaeger, and Zipkin. They all store traces — but they're architected very differently.


Quick Comparison

| Feature | Grafana Tempo | Jaeger | Zipkin |
|---|---|---|---|
| Backed by | Grafana Labs | CNCF (originally Uber) | OpenZipkin / Twitter origin |
| Storage | Object storage (S3/GCS) | Cassandra, Elasticsearch, Badger | MySQL, Elasticsearch, Cassandra |
| Cost at scale | Very low (S3 pricing) | Higher (Cassandra/ES infra) | Higher (ES infra) |
| UI | Grafana (via plugin) | Built-in | Built-in |
| Alerting | Grafana alerts | Limited | None |
| Protocol support | OTLP, Jaeger, Zipkin, OpenCensus | OTLP, Jaeger, Zipkin | Zipkin, OTLP (newer) |
| Grafana integration | Native | Plugin | Plugin |
| TraceQL | Yes (powerful query language) | Basic filtering | Basic |
| Kubernetes complexity | Low | Medium | Low |

Grafana Tempo

Tempo is the newest of the three, built specifically as a cost-efficient trace backend that stores traces in object storage (S3, GCS, Azure Blob).

Architecture

Applications → Tempo Distributor → Tempo Ingester → Object Storage (S3)
                                                          ↓
                               Grafana (TraceQL) → Tempo Querier

Tempo doesn't use Elasticsearch or Cassandra. Ingesters batch incoming spans into blocks and flush them to object storage in an Apache Parquet-based columnar format.

Key Strengths

Cost efficiency. At 10M traces/day, S3 storage costs roughly $5–15/month, depending on average trace size and retention. Equivalent Elasticsearch storage typically runs $100–500+/month.
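As a rough sanity check on the storage figure above, the arithmetic can be sketched in a few lines of Python. The 2 KB average compressed trace size, the 14-day retention, and the $0.023 per GB-month price are illustrative assumptions rather than measurements, and request or query charges are ignored.

```python
# Back-of-the-envelope object storage cost for traces.
# Every input here is an assumption for illustration; plug in your own numbers.


def monthly_storage_cost_usd(
    traces_per_day: int,
    avg_trace_kb: float,        # assumed average compressed size per trace
    retention_days: int,
    price_per_gb_month: float,  # assumed object storage list price
) -> float:
    """Steady-state storage cost once a full retention window is stored."""
    stored_gb = traces_per_day * avg_trace_kb * retention_days / 1_000_000
    return stored_gb * price_per_gb_month


if __name__ == "__main__":
    cost = monthly_storage_cost_usd(
        traces_per_day=10_000_000,
        avg_trace_kb=2.0,
        retention_days=14,
        price_per_gb_month=0.023,
    )
    print(f"~${cost:.2f}/month")
```

With these inputs the estimate comes out at about $6.44/month, inside the range quoted above. The result scales linearly, so doubling the assumed trace size or the retention doubles the cost.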

TraceQL. Tempo's query language lets you search across all traces:

# Find spans where the checkout service took > 2 seconds
{ resource.service.name = "checkout" && duration > 2s }

# Find error spans on a specific endpoint
{ span.http.method = "POST" && span.http.url =~ ".*checkout.*" && status = error }

# Find traces where the payment service made more than 10 database calls
{ resource.service.name = "payment" && span.db.system != nil } | count() > 10
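Queries like these can also be run outside Grafana through Tempo's HTTP search endpoint (`/api/search` with the TraceQL expression in the `q` parameter). The sketch below only builds the request URL; the host name and port are hypothetical and depend on how Tempo is deployed.

```python
from urllib.parse import urlencode


def tempo_search_url(base_url: str, traceql: str, limit: int = 20) -> str:
    """Build a GET URL for Tempo's /api/search endpoint with a TraceQL query."""
    query = urlencode({"q": traceql, "limit": limit})
    return f"{base_url.rstrip('/')}/api/search?{query}"


if __name__ == "__main__":
    # Hypothetical in-cluster address; replace with your query frontend.
    url = tempo_search_url(
        "http://tempo-query-frontend:3200",
        "{ status = error && duration > 2s }",
    )
    print(url)
```

URL-encoding the expression matters because TraceQL uses braces, quotes, and `&&`, all of which would otherwise be misread as part of the query string.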

Grafana native integration. Tempo was built by Grafana Labs, so integration with Grafana is first-class:

  • Traces in Grafana Explore
  • Trace-to-logs correlation (click a trace → see Loki logs for that trace)
  • Trace-to-metrics correlation (click a trace → see related Prometheus metrics)
  • Alerts based on metrics derived from trace data
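Trace-to-logs correlation boils down to running a log query keyed on the trace ID. A minimal sketch of building such a LogQL query follows; it assumes log lines contain the trace ID as plain text, and the `service_name` label is a hypothetical label name that may differ in your Loki setup.

```python
HEX_DIGITS = set("0123456789abcdef")


def logql_for_trace(trace_id: str, service: str, label: str = "service_name") -> str:
    """Return a LogQL query for one service's log lines that contain trace_id."""
    normalized = trace_id.lower()
    if not normalized or any(c not in HEX_DIGITS for c in normalized):
        raise ValueError("trace_id must be a non-empty hex string")
    # Stream selector in braces, then a line filter (|=) on the trace ID.
    return f'{{{label}="{service}"}} |= "{normalized}"'


if __name__ == "__main__":
    print(logql_for_trace("4bf92f3577b34da6a3ce929d0e0e4736", "checkout"))
```

Validating the ID as hex before interpolating it keeps stray quotes or braces out of the generated query.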

Low operational overhead. Minimal configuration: no Cassandra cluster and no Elasticsearch cluster to maintain.

Weaknesses

  • No standalone UI — requires Grafana
  • Search reads block data, bloom filters, and indexes from object storage, which adds some storage overhead and can be slow over long time ranges
  • Less mature than Jaeger for very complex deployments

Jaeger

Jaeger was built by Uber and donated to the CNCF. It's the most mature option and battle-tested at massive scale.

Architecture

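Applications (OTel SDK) → Jaeger Collector → Storage (Cassandra/ES)
                                                 ↓
                                       Jaeger Query → Jaeger UI

The standalone Jaeger Agent that older diagrams place between applications and the collector is deprecated; current SDKs send OTLP directly to the collector.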

Key Strengths

Maturity and stability. Jaeger has been running in production at Uber, Netflix, and others for years. Its edge-case behavior is well understood.

Built-in UI. Jaeger has its own web UI, so no Grafana is required. Good for teams that don't want to set up Grafana.

Flexible storage. Supports Cassandra (designed for scale) and Elasticsearch for production, plus in-memory storage and the embedded Badger key-value store for testing or single-node setups.

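Service dependency graph. Jaeger builds a service dependency graph showing which services call which. With Cassandra or Elasticsearch storage, the graph is computed by a separate batch job rather than in real time, and error-rate and latency views come from Jaeger's separate monitoring feature.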
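The idea behind a dependency graph can be sketched independently of any backend: whenever a span's parent belongs to a different service, that is a caller → callee edge. The `Span` type below is a simplified, hypothetical shape for illustration and not Jaeger's data model or implementation.

```python
from collections import Counter
from typing import NamedTuple, Optional


class Span(NamedTuple):
    span_id: str
    parent_id: Optional[str]
    service: str


def dependency_edges(spans: list[Span]) -> Counter:
    """Count caller -> callee edges where a span's parent is in another service."""
    service_of = {s.span_id: s.service for s in spans}
    edges: Counter = Counter()
    for s in spans:
        parent_service = service_of.get(s.parent_id) if s.parent_id else None
        # Skip root spans, orphaned spans, and calls within the same service.
        if parent_service and parent_service != s.service:
            edges[(parent_service, s.service)] += 1
    return edges


if __name__ == "__main__":
    trace = [
        Span("a", None, "frontend"),
        Span("b", "a", "checkout"),
        Span("c", "b", "checkout"),  # internal span, no edge
        Span("d", "b", "payment"),
    ]
    for (caller, callee), calls in dependency_edges(trace).items():
        print(f"{caller} -> {callee}: {calls}")
```

Aggregating these counts over many traces gives the edge weights a dependency view typically displays.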

Weaknesses

  • Cassandra/Elasticsearch are expensive to operate
  • Storage costs scale poorly — 1TB of traces in ES is expensive
  • Less powerful query language than Tempo's TraceQL
  • Alert integration requires external tooling

Zipkin

Zipkin is the oldest of the three — it originated at Twitter. It's simpler but less feature-rich.

Architecture

Applications → Zipkin Collector → Storage (MySQL/ES/Cassandra)
                                         ↓
                              Zipkin UI → Zipkin API

Key Strengths

Simplicity. Zipkin is the easiest to set up. A single Docker container runs the entire stack (with in-memory storage).

bash
docker run -d -p 9411:9411 openzipkin/zipkin
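Once the container above is running, a span can be pushed to it by hand through Zipkin's v2 HTTP API (`POST /api/v2/spans` with a JSON array of spans). The sketch below uses only the standard library; the service name, span name, and tag are made-up example values.

```python
import json
import secrets
import time
import urllib.request


def make_zipkin_span(service: str, name: str, duration_ms: float) -> dict:
    """Build one span in Zipkin's v2 JSON shape (times are in microseconds)."""
    return {
        "traceId": secrets.token_hex(16),  # 128-bit trace ID, 32 hex chars
        "id": secrets.token_hex(8),        # 64-bit span ID, 16 hex chars
        "name": name,
        "kind": "SERVER",
        "timestamp": int(time.time() * 1_000_000),
        "duration": int(duration_ms * 1000),
        "localEndpoint": {"serviceName": service},
        "tags": {"http.method": "GET"},
    }


if __name__ == "__main__":
    body = json.dumps([make_zipkin_span("demo-service", "get /hello", 12.5)])
    request = urllib.request.Request(
        "http://localhost:9411/api/v2/spans",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        print(response.status)  # a 2xx status means the span was accepted
```

After the request succeeds, the span should be searchable under `demo-service` in the Zipkin UI on port 9411.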

Broad language support. Extensive libraries for Java, Python, Ruby, Node.js, and Go; many frameworks auto-instrument for Zipkin.

Established format. Zipkin's B3 propagation headers and its JSON span format are widely supported. Many tools emit Zipkin-format traces natively.
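B3 propagation comes in two forms: several `X-B3-*` headers, or a single `b3` header that joins the same fields with dashes. A minimal sketch of both encodings follows; the IDs in the usage example are arbitrary sample values.

```python
from typing import Optional


def b3_multi_headers(
    trace_id: str,
    span_id: str,
    sampled: bool,
    parent_span_id: Optional[str] = None,
) -> dict:
    """Encode trace context as the multi-header B3 format."""
    headers = {
        "X-B3-TraceId": trace_id,
        "X-B3-SpanId": span_id,
        "X-B3-Sampled": "1" if sampled else "0",
    }
    if parent_span_id:
        headers["X-B3-ParentSpanId"] = parent_span_id
    return headers


def b3_single_header(
    trace_id: str,
    span_id: str,
    sampled: bool,
    parent_span_id: Optional[str] = None,
) -> str:
    """Encode the same context as the value of the single `b3` header."""
    parts = [trace_id, span_id, "1" if sampled else "0"]
    if parent_span_id:
        parts.append(parent_span_id)
    return "-".join(parts)


if __name__ == "__main__":
    trace_id = "80f198ee56343ba864fe8b2a57d3eff7"
    span_id = "e457b5a2e4d86bd1"
    print(b3_multi_headers(trace_id, span_id, sampled=True))
    print({"b3": b3_single_header(trace_id, span_id, sampled=True)})
```

A service that receives either form can continue the trace by reusing the trace ID and recording the incoming span ID as its parent.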

Weaknesses

  • Least powerful query capabilities
  • No built-in alerting
  • Falling behind Jaeger and Tempo in feature development
  • Limited Grafana integration compared to Tempo

Deployment Comparison

Tempo on Kubernetes (simplest)

yaml
# values.yaml for the grafana/tempo-distributed chart
storage:
  trace:
    backend: s3
    s3:
      bucket: my-traces-bucket
      endpoint: s3.us-east-1.amazonaws.com
      region: us-east-1
compactor:
  config:
    compaction:
      block_retention: 336h  # 14 days

bash
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
helm install tempo grafana/tempo-distributed \
  -n monitoring --create-namespace \
  -f values.yaml

Jaeger on Kubernetes

bash
# Install the Jaeger Operator (requires cert-manager in the cluster)
kubectl create namespace observability
kubectl apply -n observability -f https://github.com/jaegertracing/jaeger-operator/releases/latest/download/jaeger-operator.yaml

yaml
# Jaeger instance backed by an existing Elasticsearch cluster
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: jaeger
  namespace: observability
spec:
  strategy: production
  storage:
    type: elasticsearch
    options:
      es:
        server-urls: https://my-es-cluster:9200

Which to Choose

Choose Grafana Tempo if:

  • You already use Grafana for dashboards/alerts
  • Cost at scale matters (object storage is often around 10x cheaper than running Elasticsearch for the same trace volume)
  • You want trace-to-logs and trace-to-metrics correlation
  • You're starting fresh in 2026

Choose Jaeger if:

  • You need a built-in UI without Grafana
  • You're running Cassandra already
  • You want the most battle-tested option
  • Your team knows Jaeger well

Choose Zipkin if:

  • You need the simplest possible setup for development
  • Your existing app stack already emits B3/Zipkin format traces
  • You don't need alerting or advanced queries

For new Kubernetes-native setups in 2026: Grafana Tempo is the default recommendation. The S3-native storage model, TraceQL, and Grafana integration make it the most modern and cost-efficient choice.
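The selection criteria above can be condensed into a small decision helper. This is only a sketch of the article's rules of thumb: the three input flags and the order in which they are checked are simplifying assumptions, and real decisions involve more factors (existing Cassandra clusters, team experience, B3 instrumentation).

```python
def recommend_backend(
    *,
    uses_grafana: bool,
    needs_standalone_ui: bool,
    dev_only: bool,
) -> str:
    """Return a tracing backend suggestion based on three coarse criteria."""
    if dev_only:
        # Simplest possible setup: one container, in-memory storage.
        return "Zipkin"
    if needs_standalone_ui and not uses_grafana:
        # Built-in UI without having to run Grafana.
        return "Jaeger"
    # Default for new Kubernetes-native setups.
    return "Grafana Tempo"


if __name__ == "__main__":
    print(recommend_backend(uses_grafana=True, needs_standalone_ui=False, dev_only=False))
```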


Sending Traces to Any Backend

All three support OpenTelemetry Protocol (OTLP). Instrument once, switch backends:

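yaml
# OpenTelemetry Collector config
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch: {}

exporters:
  otlp/tempo:
    endpoint: tempo-distributor:4317
    tls:
      insecure: true
  # Or switch to Jaeger, which accepts OTLP natively
  # (the dedicated jaeger exporter was removed from the Collector):
  otlp/jaeger:
    endpoint: jaeger-collector:4317
    tls:
      insecure: true

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/tempo]  # change to [otlp/jaeger] to switch backends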
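When no Collector sits between the application and the backend, the same switch can be made through the standard OpenTelemetry SDK environment variables. The sketch below only generates those variables; the in-cluster addresses are hypothetical, and port 4317 for Jaeger assumes its OTLP gRPC receiver is enabled.

```python
# Hypothetical in-cluster OTLP gRPC addresses; adjust to your service names.
OTLP_ENDPOINTS = {
    "tempo": "http://tempo-distributor:4317",
    "jaeger": "http://jaeger-collector:4317",
}


def otlp_env(backend: str, service_name: str) -> dict:
    """Return OpenTelemetry SDK environment variables targeting one backend."""
    try:
        endpoint = OTLP_ENDPOINTS[backend]
    except KeyError:
        raise ValueError(f"unknown backend: {backend!r}") from None
    return {
        "OTEL_SERVICE_NAME": service_name,
        "OTEL_TRACES_EXPORTER": "otlp",
        "OTEL_EXPORTER_OTLP_PROTOCOL": "grpc",
        "OTEL_EXPORTER_OTLP_ENDPOINT": endpoint,
    }


if __name__ == "__main__":
    for key, value in otlp_env("tempo", "checkout").items():
        print(f"{key}={value}")
```

Because only the endpoint changes between backends, the application code and its instrumentation stay untouched.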

Related: OpenTelemetry Complete Guide | Prometheus vs Datadog vs New Relic | Grafana Loki Log Aggregation Guide

Affiliate note: Grafana Cloud includes hosted Tempo, Loki, and Prometheus with a generous free tier (50GB traces/month free). Best way to try all three integrations without running infrastructure.
