
What Is OpenTelemetry? Observability Standard Explained Simply

OpenTelemetry (OTel) is the open standard for collecting traces, metrics, and logs. Learn what it is, why it matters, and how to start using it.

DevOpsBoys | May 27, 2026 | 4 min read

Every time you look at a Grafana dashboard or chase a slow API request across 10 microservices, you're depending on observability data. OpenTelemetry is the standard that collects that data — without locking you into any vendor.


The Problem Before OpenTelemetry

Before OTel, each signal had its own tooling: the Jaeger SDK for traces, a Prometheus client library for metrics, and perhaps Fluentd for logs. Each service needed a different SDK, a different config, and a different agent.

If you switched from Jaeger to Zipkin, you rewrote all your instrumentation code.

OpenTelemetry solves this by being a single, vendor-neutral SDK that collects traces, metrics, and logs, and exports to any backend you choose.


What Is OpenTelemetry?

OpenTelemetry (OTel) is:

  1. A standard — defines how observability data is collected and formatted
  2. An SDK — libraries for every major language to instrument your code
  3. A Collector — an agent/proxy that receives, processes, and exports telemetry
  4. A protocol — OTLP (OpenTelemetry Protocol) for sending data

It's a CNCF project (the same foundation that hosts Kubernetes, Prometheus, and Helm).


The Three Pillars

Traces

A trace follows a request as it moves through your system.

User request → API Gateway → Auth Service → User Service → DB
     |               |              |             |          |
  Span 1         Span 2         Span 3        Span 4    Span 5
  50ms total     5ms           15ms           20ms       8ms

Each step is a span. All spans with the same trace ID form a trace. This is how you find which service is slow.
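
To make the "same trace ID" idea concrete, here is a toy model in plain Python. It is not the OTel API: the `Span` class and `slowest_span` helper are hypothetical names, and the durations are taken from the diagram above.

```python
from dataclasses import dataclass


@dataclass
class Span:
    trace_id: str       # shared by every span in one request
    name: str           # the step this span measured
    duration_ms: float


def slowest_span(spans, trace_id):
    """Return the longest span that belongs to the given trace."""
    in_trace = [s for s in spans if s.trace_id == trace_id]
    return max(in_trace, key=lambda s: s.duration_ms)


spans = [
    Span("trace-abc", "api-gateway", 5),
    Span("trace-abc", "auth-service", 15),
    Span("trace-abc", "user-service", 20),
    Span("trace-abc", "db", 8),
    Span("trace-xyz", "api-gateway", 300),  # another request, filtered out
]

culprit = slowest_span(spans, "trace-abc")
print(culprit.name, culprit.duration_ms)
```

A real tracing backend does the same grouping by trace ID, then renders the spans as a timeline so the slow step stands out.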

Metrics

Numerical measurements over time: CPU usage, request count, error rate, latency histogram.

http_requests_total{method="GET", status="200"} 1523
http_request_duration_seconds{quantile="0.99"} 0.234
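
As a rough sketch of where a line like `http_requests_total{...}` comes from, the snippet below counts requests by label and prints them in the Prometheus text format. It uses only the standard library; `record_request` and `render` are hypothetical helpers, not part of any OTel or Prometheus client.

```python
from collections import Counter

# One counter value per unique (method, status) label combination.
requests_total = Counter()


def record_request(method, status):
    requests_total[(method, status)] += 1


def render(name, counter):
    """Format the counter as Prometheus-style text lines."""
    lines = []
    for (method, status), value in sorted(counter.items()):
        lines.append(f'{name}{{method="{method}",status="{status}"}} {value}')
    return "\n".join(lines)


for _ in range(3):
    record_request("GET", "200")
record_request("POST", "500")

output = render("http_requests_total", requests_total)
print(output)
```

In practice the OTel SDK keeps these counters for you and the exporter handles the formatting.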

Logs

Text records of events. OTel can attach trace IDs to logs so you can correlate "this log line came from this trace."
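
The correlation idea can be sketched with the standard `logging` module alone. This is a simplified model of what a tracing library does for you: the `current_trace_id` variable, the `TraceIdFilter` class, and the hard-coded trace ID are all assumptions for illustration.

```python
import contextvars
import io
import logging

# Holds the trace ID of the request currently being handled.
current_trace_id = contextvars.ContextVar("current_trace_id", default="-")


class TraceIdFilter(logging.Filter):
    """Copy the active trace ID onto every log record."""

    def filter(self, record):
        record.trace_id = current_trace_id.get()
        return True


stream = io.StringIO()  # captured here so the result is easy to inspect
handler = logging.StreamHandler(stream)
handler.setFormatter(
    logging.Formatter("%(levelname)s trace_id=%(trace_id)s %(message)s")
)
handler.addFilter(TraceIdFilter())

logger = logging.getLogger("checkout")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

current_trace_id.set("4bf92f3577b34da6a3ce929d0e0e4736")
logger.info("payment authorized")

log_line = stream.getvalue().strip()
print(log_line)
```

With the trace ID in every line, you can jump from a log entry to the full trace in your backend, or search logs by the ID of a slow trace.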


How It Works

Your App (with OTel SDK)
        ↓
OpenTelemetry Collector
        ↓
Backends:
  - Traces → Jaeger / Tempo / Datadog
  - Metrics → Prometheus / Datadog / New Relic
  - Logs → Loki / Elasticsearch / Datadog

The Collector is the key piece — it decouples your app from backends. Change backend? Update Collector config, not app code.


Quick Start: Instrument a Python App

bash
pip install fastapi uvicorn \
            opentelemetry-sdk \
            opentelemetry-instrumentation-fastapi \
            opentelemetry-exporter-otlp
python
# main.py
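from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor
from fastapi import FastAPI
 
# Set up tracing. service.name labels this app's spans in the backend;
# "user-api" is a placeholder, and without it the SDK reports "unknown_service".
provider = TracerProvider(resource=Resource.create({"service.name": "user-api"}))
# Batch spans and send them to the Collector over OTLP/gRPC
processor = BatchSpanProcessor(
    OTLPSpanExporter(endpoint="http://otel-collector:4317")
)
provider.add_span_processor(processor)
trace.set_tracer_provider(provider)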
 
app = FastAPI()
FastAPIInstrumentor.instrument_app(app)  # Auto-instruments all routes
 
tracer = trace.get_tracer(__name__)
 
@app.get("/users/{user_id}")
def get_user(user_id: str):
    with tracer.start_as_current_span("get-user-from-db") as span:
        span.set_attribute("user.id", user_id)
        # ... your code
        return {"id": user_id}

FastAPIInstrumentor automatically creates spans for every request — you don't have to manually wrap each endpoint.


The OpenTelemetry Collector

The Collector runs as a sidecar or standalone deployment:

yaml
# otel-collector-config.yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
 
processors:
  batch:
    timeout: 1s
  resource:
    attributes:
    - key: service.environment
      value: production
      action: insert
 
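exporters:
  # The dedicated jaeger exporter was removed from the Collector;
  # Jaeger accepts OTLP directly on port 4317.
  otlp/jaeger:
    endpoint: jaeger:4317
    tls:
      insecure: true  # plaintext, assuming in-cluster traffic
  prometheus:
    endpoint: "0.0.0.0:8889"
  # Loki 3.x ingests OTLP over HTTP, which replaces the deprecated loki exporter.
  otlphttp/loki:
    endpoint: http://loki:3100/otlp
 
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [resource, batch]
      exporters: [otlp/jaeger]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [prometheus]
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/loki]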

One Collector config routes all telemetry to the right backends.
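
A common mistake in this file is a pipeline that references a component which is never defined, and the Collector then refuses to start. The sketch below shows that check in plain Python against a config already loaded into a dict. The `undefined_components` helper is hypothetical and the sample config contains a deliberate typo.

```python
def undefined_components(config):
    """List pipeline references that have no matching top-level definition."""
    problems = []
    for pipeline, spec in config["service"]["pipelines"].items():
        for kind in ("receivers", "processors", "exporters"):
            defined = config.get(kind, {})
            for name in spec.get(kind, []):
                if name not in defined:
                    problems.append(
                        f"{pipeline}: {kind[:-1]} '{name}' is not defined"
                    )
    return problems


config = {
    "receivers": {"otlp": {}},
    "processors": {"batch": {}},
    "exporters": {"prometheus": {"endpoint": "0.0.0.0:8889"}},
    "service": {
        "pipelines": {
            "metrics": {
                "receivers": ["otlp"],
                "processors": ["batch"],
                "exporters": ["promethues"],  # deliberate typo
            },
        },
    },
}

problems = undefined_components(config)
print(problems)
```

A check like this could run in CI before the ConfigMap is applied, though it only catches naming mismatches, not invalid settings inside a component.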


Deploy Collector on Kubernetes

yaml
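apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: otel-collector
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: otel-collector
  template:
    metadata:
      labels:
        app: otel-collector  # must match spec.selector.matchLabels
    spec:
      containers:
      - name: collector
        # Pin a specific version in production instead of :latest
        image: otel/opentelemetry-collector-contrib:latest
        # Point the Collector at the mounted file; assumes the ConfigMap
        # key is named otel-collector-config.yaml
        args: ["--config=/etc/otelcol/otel-collector-config.yaml"]
        ports:
        - containerPort: 4317  # OTLP gRPC
          hostPort: 4317       # reachable at <node IP>:4317
        - containerPort: 4318  # OTLP HTTP
          hostPort: 4318
        volumeMounts:
        - name: config
          mountPath: /etc/otelcol
      volumes:
      - name: config
        configMap:
          name: otel-collector-config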

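DaemonSet = one collector per node. Pods can send data to the collector on their own node (for example via the node IP and a hostPort, or a Service with internalTrafficPolicy: Local), so telemetry avoids a cross-node hop.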
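
On the app side, one way to target the node-local collector is to build the endpoint from the node's IP. The sketch below assumes a `NODE_IP` environment variable, a hypothetical name that you would populate yourself from the Kubernetes downward API field `status.hostIP`. It also assumes the collector is reachable on the node at port 4317.

```python
import os


def local_collector_endpoint(env=None, port=4317):
    """Prefer the collector on this pod's node, else fall back to a Service."""
    env = os.environ if env is None else env
    node_ip = env.get("NODE_IP")  # hypothetical variable, set via status.hostIP
    if node_ip:
        return f"http://{node_ip}:{port}"
    # Fallback: a cluster-wide Service name, as used in the Python example
    return f"http://otel-collector:{port}"


print(local_collector_endpoint(env={"NODE_IP": "10.0.0.7"}))
print(local_collector_endpoint(env={}))
```

The returned string can be passed as the `endpoint` argument of the OTLP exporter.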


Auto-Instrumentation on Kubernetes

The OTel Operator can auto-instrument pods without code changes:

bash
# Install the operator (cert-manager must already be installed in the cluster)
kubectl apply -f https://github.com/open-telemetry/opentelemetry-operator/releases/latest/download/opentelemetry-operator.yaml
yaml
# Annotate the pod template (not the Deployment's own metadata) to enable auto-instrumentation
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  template:
    metadata:
      annotations:
        instrumentation.opentelemetry.io/inject-python: "true"  # or inject-java, inject-nodejs, inject-dotnet

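The operator injects the auto-instrumentation libraries through an init container, so your app gets tracing without any code changes. It also needs an Instrumentation resource in the namespace, which tells it where to export data and which settings to apply.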


OTel vs Prometheus

                 OpenTelemetry             Prometheus
Data types       Traces + Metrics + Logs   Metrics only
Collection       Push (OTLP)               Pull (scrape)
Language SDKs    All major languages       Client libraries
Backend          Any (via exporters)       Prometheus + Grafana
Adoption         Growing fast              Established

They're complementary. Many teams use Prometheus for metrics but OTel for traces and logs. OTel can also export metrics in Prometheus format.


Summary

OTel SDK       → instruments your code (traces, metrics, logs)
OTel Collector → receives, processes, exports
OTLP           → the wire protocol between them
Backends       → where data lives (Jaeger, Tempo, Prometheus, Loki)

OpenTelemetry is becoming the standard way to instrument cloud-native apps. If you're setting up a new service in 2026, start with OTel — you'll be able to switch backends without touching your app code.


KodeKloud Observability Course — covers Prometheus, Grafana, OpenTelemetry, and distributed tracing with hands-on labs.

Grafana Cloud — free tier that accepts OTLP data natively. The fastest way to see your traces and metrics without running your own backends.
