Run Langfuse on Kubernetes for LLM Observability (2026 Guide)

Deploy Langfuse on Kubernetes to get complete tracing, cost tracking, and evaluation for your LLM applications. Step-by-step guide with Helm charts, Postgres, ClickHouse, and production configuration.

DevOpsBoys · Apr 22, 2026 · 5 min read

You're running LLMs in production — but when a response is wrong, slow, or expensive, you have no visibility into why. Langfuse is the open-source LLM observability platform that gives you traces, cost tracking, evaluation, and prompt versioning. This guide deploys it on Kubernetes.


What Langfuse Tracks

  • Traces: Every LLM call with inputs, outputs, latency, token counts
  • Cost: Per-call and aggregate spend by model
  • Evaluations: Run automated or human scoring on outputs
  • Prompts: Version-controlled prompt management with A/B testing
  • Sessions: Full conversation traces for chatbots

Think of it as Datadog but built specifically for LLM applications.


Architecture

Your App → Langfuse SDK → Langfuse Server → PostgreSQL (metadata)
                                          → ClickHouse (events/traces)
                                          → Redis (queue)
                          Langfuse Worker processes events asynchronously

Langfuse has two main components:

  • Server: Web UI + API (ingests traces, serves dashboard)
  • Worker: Background processor (computes costs, runs evaluations)

Prerequisites

  • Kubernetes cluster (any: EKS, GKE, k3s, minikube)
  • Helm 3+
  • At least 2 CPU / 4GB RAM available (more for production)
  • A domain or LoadBalancer IP for the UI

Step 1: Create Namespace and Secrets

bash
kubectl create namespace langfuse
 
# Generate secrets
NEXTAUTH_SECRET=$(openssl rand -base64 32)
SALT=$(openssl rand -base64 32)
 
kubectl create secret generic langfuse-secrets \
  --namespace langfuse \
  --from-literal=nextauth-secret="$NEXTAUTH_SECRET" \
  --from-literal=salt="$SALT" \
  --from-literal=database-url="postgresql://langfuse:password@langfuse-postgres:5432/langfuse" \
  --from-literal=clickhouse-password="clickhousepassword" \
  --from-literal=redis-password="redispassword"

The passwords above are placeholders; generate unique values (e.g. with openssl rand -hex 16) before deploying anywhere shared.

Step 2: Deploy PostgreSQL

yaml
# postgres.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: langfuse-postgres
  namespace: langfuse
spec:
  replicas: 1
  selector:
    matchLabels:
      app: langfuse-postgres
  template:
    metadata:
      labels:
        app: langfuse-postgres
    spec:
      containers:
      - name: postgres
        image: postgres:16-alpine
        env:
        - name: POSTGRES_DB
          value: langfuse
        - name: POSTGRES_USER
          value: langfuse
        - name: POSTGRES_PASSWORD
          value: password
        ports:
        - containerPort: 5432
        volumeMounts:
        - name: pgdata
          mountPath: /var/lib/postgresql/data
        resources:
          requests:
            memory: "512Mi"
            cpu: "250m"
          limits:
            memory: "1Gi"
            cpu: "500m"
      volumes:
      - name: pgdata
        persistentVolumeClaim:
          claimName: langfuse-postgres-pvc
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: langfuse-postgres-pvc
  namespace: langfuse
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
---
apiVersion: v1
kind: Service
metadata:
  name: langfuse-postgres
  namespace: langfuse
spec:
  selector:
    app: langfuse-postgres
  ports:
  - port: 5432
    targetPort: 5432
bash
kubectl apply -f postgres.yaml

Step 3: Deploy ClickHouse

Langfuse v3+ uses ClickHouse for high-performance event storage.

yaml
# clickhouse.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: langfuse-clickhouse
  namespace: langfuse
spec:
  replicas: 1
  selector:
    matchLabels:
      app: langfuse-clickhouse
  template:
    metadata:
      labels:
        app: langfuse-clickhouse
    spec:
      containers:
      - name: clickhouse
        image: clickhouse/clickhouse-server:24.3-alpine
        env:
        - name: CLICKHOUSE_DB
          value: langfuse
        - name: CLICKHOUSE_USER
          value: langfuse
        - name: CLICKHOUSE_PASSWORD
          value: clickhousepassword
        ports:
        - containerPort: 8123
          name: http
        - containerPort: 9000
          name: native
        volumeMounts:
        - name: chdata
          mountPath: /var/lib/clickhouse
        resources:
          requests:
            memory: "1Gi"
            cpu: "500m"
          limits:
            memory: "2Gi"
            cpu: "1000m"
      volumes:
      - name: chdata
        persistentVolumeClaim:
          claimName: langfuse-clickhouse-pvc
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: langfuse-clickhouse-pvc
  namespace: langfuse
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi
---
apiVersion: v1
kind: Service
metadata:
  name: langfuse-clickhouse
  namespace: langfuse
spec:
  selector:
    app: langfuse-clickhouse
  ports:
  - name: http
    port: 8123
    targetPort: 8123
  - name: native
    port: 9000
    targetPort: 9000
bash
kubectl apply -f clickhouse.yaml

Step 4: Deploy Redis

yaml
# redis.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: langfuse-redis
  namespace: langfuse
spec:
  replicas: 1
  selector:
    matchLabels:
      app: langfuse-redis
  template:
    metadata:
      labels:
        app: langfuse-redis
    spec:
      containers:
      - name: redis
        image: redis:7-alpine
        args: ["--requirepass", "redispassword"]
        ports:
        - containerPort: 6379
        resources:
          requests:
            memory: "128Mi"
            cpu: "100m"
          limits:
            memory: "256Mi"
            cpu: "200m"
---
apiVersion: v1
kind: Service
metadata:
  name: langfuse-redis
  namespace: langfuse
spec:
  selector:
    app: langfuse-redis
  ports:
  - port: 6379
    targetPort: 6379

Step 5: Deploy Langfuse Server + Worker

yaml
# langfuse.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: langfuse-server
  namespace: langfuse
spec:
  replicas: 2
  selector:
    matchLabels:
      app: langfuse-server
  template:
    metadata:
      labels:
        app: langfuse-server
    spec:
      containers:
      - name: langfuse
        image: langfuse/langfuse:latest  # pin a specific version in production
        ports:
        - containerPort: 3000
        env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: langfuse-secrets
              key: database-url
        - name: NEXTAUTH_SECRET
          valueFrom:
            secretKeyRef:
              name: langfuse-secrets
              key: nextauth-secret
        - name: SALT
          valueFrom:
            secretKeyRef:
              name: langfuse-secrets
              key: salt
        - name: NEXTAUTH_URL
          value: "https://langfuse.yourdomain.com"
        - name: CLICKHOUSE_URL
          value: "http://langfuse-clickhouse:8123"
        - name: CLICKHOUSE_MIGRATION_URL
          value: "clickhouse://langfuse-clickhouse:9000"
        - name: CLICKHOUSE_USER
          value: langfuse
        - name: CLICKHOUSE_PASSWORD
          valueFrom:
            secretKeyRef:
              name: langfuse-secrets
              key: clickhouse-password
        - name: REDIS_HOST
          value: langfuse-redis
        - name: REDIS_PORT
          value: "6379"
        - name: REDIS_AUTH
          valueFrom:
            secretKeyRef:
              name: langfuse-secrets
              key: redis-password
        - name: LANGFUSE_INIT_ORG_NAME
          value: "My Team"
        - name: LANGFUSE_INIT_USER_EMAIL
          value: "admin@yourdomain.com"
        - name: LANGFUSE_INIT_USER_PASSWORD
          value: "changeme123"
        resources:
          requests:
            memory: "512Mi"
            cpu: "250m"
          limits:
            memory: "1Gi"
            cpu: "500m"
        readinessProbe:
          httpGet:
            path: /api/public/health
            port: 3000
          initialDelaySeconds: 30
          periodSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
  name: langfuse-server
  namespace: langfuse
spec:
  selector:
    app: langfuse-server
  ports:
  - port: 80
    targetPort: 3000
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: langfuse-ingress
  namespace: langfuse
  annotations:
    nginx.ingress.kubernetes.io/proxy-body-size: "50m"
spec:
  ingressClassName: nginx
  rules:
  - host: langfuse.yourdomain.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: langfuse-server
            port:
              number: 80
bash
kubectl apply -f langfuse.yaml
 
# Check all pods are running
kubectl get pods -n langfuse
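
Step 5's manifest covers only the server; asynchronous event processing needs the separate worker. Below is a minimal worker Deployment sketch, assuming the official langfuse/langfuse-worker image and the secret names created in Step 1; verify the exact environment variables against the Langfuse self-hosting docs:

```yaml
# langfuse-worker.yaml (sketch; verify env vars against the Langfuse docs)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: langfuse-worker
  namespace: langfuse
spec:
  replicas: 1
  selector:
    matchLabels:
      app: langfuse-worker
  template:
    metadata:
      labels:
        app: langfuse-worker
    spec:
      containers:
      - name: worker
        image: langfuse/langfuse-worker:latest
        env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: langfuse-secrets
              key: database-url
        - name: SALT
          valueFrom:
            secretKeyRef:
              name: langfuse-secrets
              key: salt
        - name: CLICKHOUSE_URL
          value: "http://langfuse-clickhouse:8123"
        - name: CLICKHOUSE_USER
          value: langfuse
        - name: CLICKHOUSE_PASSWORD
          valueFrom:
            secretKeyRef:
              name: langfuse-secrets
              key: clickhouse-password
        - name: REDIS_HOST
          value: langfuse-redis
        - name: REDIS_PORT
          value: "6379"
        - name: REDIS_AUTH
          valueFrom:
            secretKeyRef:
              name: langfuse-secrets
              key: redis-password
        resources:
          requests:
            memory: "512Mi"
            cpu: "250m"
```

Note that Langfuse v3 also expects S3-compatible blob storage for raw event uploads (the LANGFUSE_S3_EVENT_UPLOAD_* settings); add MinIO or a cloud bucket per the self-hosting docs before relying on this in production.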

Step 6: Instrument Your LLM App

Install the Langfuse SDK in your application:

bash
pip install langfuse openai
python
from langfuse import Langfuse
from langfuse.openai import openai  # Drop-in replacement for openai client
 
langfuse = Langfuse(
    public_key="pk-lf-...",      # From Langfuse dashboard
    secret_key="sk-lf-...",      # From Langfuse dashboard
    host="https://langfuse.yourdomain.com"
)
 
# This OpenAI client automatically traces all calls
client = openai.OpenAI(api_key="sk-...")
 
def answer_devops_question(question: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You are a DevOps expert."},
            {"role": "user", "content": question}
        ],
        name="devops-qa",  # Shows up in Langfuse traces
    )
    return response.choices[0].message.content
 
result = answer_devops_question("What is a Kubernetes PodDisruptionBudget?")

Every call is now automatically traced in Langfuse with:

  • Input/output text
  • Token counts
  • Latency
  • Cost (auto-calculated per model pricing)
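
As a rough sketch of how per-call cost falls out of token counts, here is the arithmetic with made-up placeholder prices (Langfuse maintains its own model pricing table; these numbers are illustrative only):

```python
# Hypothetical per-1M-token USD prices: placeholders, NOT real pricing data
PRICES = {"gpt-4o": {"input": 2.50, "output": 10.00}}

def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """USD cost of one call: token counts times the per-token price."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

print(f"${call_cost('gpt-4o', input_tokens=1200, output_tokens=300):.6f}")  # $0.006000
```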

Step 7: Create Evaluations

python
# Score your outputs manually or with an LLM judge
from langfuse import Langfuse
 
langfuse = Langfuse(...)
 
# After getting a trace ID, add a score
langfuse.score(
    trace_id="trace-id-from-response",
    name="quality",
    value=0.9,  # 0-1 scale
    comment="Good explanation, could have included examples"
)

For automated evaluation, set up LLM-as-judge:

python
from langfuse.decorators import observe
 
@observe()
def evaluate_response(response: str, criteria: str) -> float:
    judge_response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": f"Rate this response from 0 to 1 for {criteria}. "
                       f"Reply with only the number:\n{response}"
        }]
    )
    return float(judge_response.choices[0].message.content.strip())

What to Monitor in Production

Once running, watch these in the Langfuse dashboard:

  • Latency p95: spikes indicate a slow model or network issues
  • Error rate: API errors, context-length-exceeded failures, rate limits
  • Cost per trace: sudden spikes usually mean someone is passing huge contexts
  • Token usage: watch the input-to-output token ratio
  • Quality scores: a downward trend signals a prompt or model regression
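
For intuition on the p95 figure, here is a quick nearest-rank percentile sketch over per-trace latencies (purely illustrative; the Langfuse dashboard computes this for you):

```python
import math

def p95(latencies_ms: list) -> float:
    """Nearest-rank 95th percentile: the value 95% of traces fall at or below."""
    ranked = sorted(latencies_ms)
    idx = math.ceil(0.95 * len(ranked)) - 1
    return ranked[idx]

# A single slow outlier drags p95 far above the median
print(p95([120, 150, 180, 200, 220, 250, 300, 400, 800, 2500]))  # 2500
```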

Learn More

Langfuse is one of the most widely adopted open-source platforms for LLM observability in production. Check the Langfuse documentation for advanced features like prompt management and dataset evaluation. For building production LLM applications end-to-end, LLM Engineering on Udemy covers observability patterns in depth.

With Langfuse on Kubernetes, you finally have production-grade visibility into what your LLM applications are actually doing.
