Run Langfuse on Kubernetes for LLM Observability (2026 Guide)
Deploy Langfuse on Kubernetes to get complete tracing, cost tracking, and evaluation for your LLM applications. Step-by-step guide with Helm charts, Postgres, ClickHouse, and production configuration.
You're running LLMs in production — but when a response is wrong, slow, or expensive, you have no visibility into why. Langfuse is the open-source LLM observability platform that gives you traces, cost tracking, evaluation, and prompt versioning. This guide deploys it on Kubernetes.
What Langfuse Tracks
- Traces: Every LLM call with inputs, outputs, latency, token counts
- Cost: Per-call and aggregate spend by model
- Evaluations: Run automated or human scoring on outputs
- Prompts: Version-controlled prompt management with A/B testing
- Sessions: Full conversation traces for chatbots
Think of it as Datadog but built specifically for LLM applications.
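Concretely, a single trace boils down to a structured record. Here is an illustrative shape (not Langfuse's exact schema — field names and values are made up for the example):

```python
# Illustrative shape of one trace event -- not Langfuse's actual schema.
trace = {
    "name": "devops-qa",
    "input": "What is a PodDisruptionBudget?",
    "output": "A PDB limits how many pods of a workload can be down at once...",
    "model": "gpt-4o",
    "usage": {"input_tokens": 42, "output_tokens": 180},
    "latency_ms": 850,
}

# Aggregations like total tokens per trace are what the dashboard rolls up
total_tokens = sum(trace["usage"].values())
print(total_tokens)  # 222
```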
Architecture
Your App → Langfuse SDK → Langfuse Server → PostgreSQL (metadata)
                                          → ClickHouse (events/traces)
                                          → Redis (queue)

Langfuse Worker processes events asynchronously
Langfuse has two main components:
- Server: Web UI + API (ingests traces, serves dashboard)
- Worker: Background processor (computes costs, runs evaluations)
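The server/worker split is a standard producer-consumer design: the API acknowledges events immediately and defers enrichment. A toy Python sketch of that flow (conceptual only, not Langfuse internals — the per-token price here is invented):

```python
import queue
import threading

# Toy model of the ingest path: the "server" enqueues raw events, the
# "worker" dequeues and enriches them (e.g. computing cost) asynchronously.
events = queue.Queue()
processed = []

def server_ingest(event: dict) -> None:
    # The API can return immediately; heavy work happens in the worker.
    events.put(event)

def worker_loop() -> None:
    while True:
        event = events.get()
        if event is None:  # shutdown sentinel
            break
        event["cost_usd"] = event["tokens"] * 0.000002  # invented price
        processed.append(event)
        events.task_done()

t = threading.Thread(target=worker_loop)
t.start()
server_ingest({"trace_id": "t1", "tokens": 1000})
server_ingest({"trace_id": "t2", "tokens": 500})
events.put(None)
t.join()
print(len(processed))  # 2
```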
Prerequisites
- Kubernetes cluster (any: EKS, GKE, k3s, minikube)
- Helm 3+
- At least 2 CPU / 4GB RAM available (more for production)
- A domain or LoadBalancer IP for the UI
Step 1: Create Namespace and Secrets
kubectl create namespace langfuse
# Generate secrets
NEXTAUTH_SECRET=$(openssl rand -base64 32)
SALT=$(openssl rand -base64 32)
kubectl create secret generic langfuse-secrets \
--namespace langfuse \
--from-literal=nextauth-secret="$NEXTAUTH_SECRET" \
--from-literal=salt="$SALT" \
--from-literal=database-url="postgresql://langfuse:password@langfuse-postgres:5432/langfuse" \
--from-literal=clickhouse-password="clickhousepassword" \
--from-literal=redis-password="redispassword"
Step 2: Deploy PostgreSQL
# postgres.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: langfuse-postgres
  namespace: langfuse
spec:
  replicas: 1
  selector:
    matchLabels:
      app: langfuse-postgres
  template:
    metadata:
      labels:
        app: langfuse-postgres
    spec:
      containers:
        - name: postgres
          image: postgres:16-alpine
          env:
            - name: POSTGRES_DB
              value: langfuse
            - name: POSTGRES_USER
              value: langfuse
            - name: POSTGRES_PASSWORD
              value: password
          ports:
            - containerPort: 5432
          volumeMounts:
            - name: pgdata
              mountPath: /var/lib/postgresql/data
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "1Gi"
              cpu: "500m"
      volumes:
        - name: pgdata
          persistentVolumeClaim:
            claimName: langfuse-postgres-pvc
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: langfuse-postgres-pvc
  namespace: langfuse
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
---
apiVersion: v1
kind: Service
metadata:
  name: langfuse-postgres
  namespace: langfuse
spec:
  selector:
    app: langfuse-postgres
  ports:
    - port: 5432
      targetPort: 5432
kubectl apply -f postgres.yaml
Step 3: Deploy ClickHouse
Langfuse v3+ uses ClickHouse for high-performance event storage.
# clickhouse.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: langfuse-clickhouse
  namespace: langfuse
spec:
  replicas: 1
  selector:
    matchLabels:
      app: langfuse-clickhouse
  template:
    metadata:
      labels:
        app: langfuse-clickhouse
    spec:
      containers:
        - name: clickhouse
          image: clickhouse/clickhouse-server:24.3-alpine
          env:
            - name: CLICKHOUSE_DB
              value: langfuse
            - name: CLICKHOUSE_USER
              value: langfuse
            - name: CLICKHOUSE_PASSWORD
              value: clickhousepassword
          ports:
            - containerPort: 8123
              name: http
            - containerPort: 9000
              name: native
          volumeMounts:
            - name: chdata
              mountPath: /var/lib/clickhouse
          resources:
            requests:
              memory: "1Gi"
              cpu: "500m"
            limits:
              memory: "2Gi"
              cpu: "1000m"
      volumes:
        - name: chdata
          persistentVolumeClaim:
            claimName: langfuse-clickhouse-pvc
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: langfuse-clickhouse-pvc
  namespace: langfuse
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi
---
apiVersion: v1
kind: Service
metadata:
  name: langfuse-clickhouse
  namespace: langfuse
spec:
  selector:
    app: langfuse-clickhouse
  ports:
    - name: http
      port: 8123
      targetPort: 8123
    - name: native
      port: 9000
      targetPort: 9000
kubectl apply -f clickhouse.yaml
Step 4: Deploy Redis
# redis.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: langfuse-redis
  namespace: langfuse
spec:
  replicas: 1
  selector:
    matchLabels:
      app: langfuse-redis
  template:
    metadata:
      labels:
        app: langfuse-redis
    spec:
      containers:
        - name: redis
          image: redis:7-alpine
          args: ["--requirepass", "redispassword"]
          ports:
            - containerPort: 6379
          resources:
            requests:
              memory: "128Mi"
              cpu: "100m"
            limits:
              memory: "256Mi"
              cpu: "200m"
---
apiVersion: v1
kind: Service
metadata:
  name: langfuse-redis
  namespace: langfuse
spec:
  selector:
    app: langfuse-redis
  ports:
    - port: 6379
      targetPort: 6379
kubectl apply -f redis.yaml
Step 5: Deploy Langfuse Server + Worker
# langfuse.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: langfuse-server
  namespace: langfuse
spec:
  replicas: 2
  selector:
    matchLabels:
      app: langfuse-server
  template:
    metadata:
      labels:
        app: langfuse-server
    spec:
      containers:
        - name: langfuse
          image: langfuse/langfuse:latest
          ports:
            - containerPort: 3000
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: langfuse-secrets
                  key: database-url
            - name: NEXTAUTH_SECRET
              valueFrom:
                secretKeyRef:
                  name: langfuse-secrets
                  key: nextauth-secret
            - name: SALT
              valueFrom:
                secretKeyRef:
                  name: langfuse-secrets
                  key: salt
            - name: NEXTAUTH_URL
              value: "https://langfuse.yourdomain.com"
            - name: CLICKHOUSE_URL
              value: "http://langfuse-clickhouse:8123"
            - name: CLICKHOUSE_USER
              value: langfuse
            - name: CLICKHOUSE_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: langfuse-secrets
                  key: clickhouse-password
            - name: REDIS_HOST
              value: langfuse-redis
            - name: REDIS_PORT
              value: "6379"
            - name: REDIS_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: langfuse-secrets
                  key: redis-password
            - name: LANGFUSE_INIT_ORG_NAME
              value: "My Team"
            - name: LANGFUSE_INIT_USER_EMAIL
              value: "admin@yourdomain.com"
            - name: LANGFUSE_INIT_USER_PASSWORD
              value: "changeme123"
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "1Gi"
              cpu: "500m"
          readinessProbe:
            httpGet:
              path: /api/public/health
              port: 3000
            initialDelaySeconds: 30
            periodSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
  name: langfuse-server
  namespace: langfuse
spec:
  selector:
    app: langfuse-server
  ports:
    - port: 80
      targetPort: 3000
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: langfuse-ingress
  namespace: langfuse
  annotations:
    nginx.ingress.kubernetes.io/proxy-body-size: "50m"
spec:
  ingressClassName: nginx
  rules:
    - host: langfuse.yourdomain.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: langfuse-server
                port:
                  number: 80
kubectl apply -f langfuse.yaml
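The manifest above only defines the server. A sketch of the worker Deployment follows, assuming the langfuse/langfuse-worker image and the same connection settings as the server; check the Langfuse self-hosting docs for the full set of required env vars (newer Langfuse versions also expect blob storage and ClickHouse migration settings not shown here).

```yaml
# langfuse-worker.yaml -- sketch; verify env vars against the Langfuse docs
apiVersion: apps/v1
kind: Deployment
metadata:
  name: langfuse-worker
  namespace: langfuse
spec:
  replicas: 1
  selector:
    matchLabels:
      app: langfuse-worker
  template:
    metadata:
      labels:
        app: langfuse-worker
    spec:
      containers:
        - name: worker
          image: langfuse/langfuse-worker:latest
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: langfuse-secrets
                  key: database-url
            - name: CLICKHOUSE_URL
              value: "http://langfuse-clickhouse:8123"
            - name: CLICKHOUSE_USER
              value: langfuse
            - name: CLICKHOUSE_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: langfuse-secrets
                  key: clickhouse-password
            - name: REDIS_HOST
              value: langfuse-redis
            - name: REDIS_PORT
              value: "6379"
            - name: REDIS_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: langfuse-secrets
                  key: redis-password
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
```

Apply it alongside the server manifest with kubectl apply -f langfuse-worker.yaml.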
# Check all pods are running
kubectl get pods -n langfuse
Step 6: Instrument Your LLM App
Install the Langfuse SDK in your application:
pip install langfuse openai
from langfuse import Langfuse
from langfuse.openai import openai  # Drop-in replacement for the openai client

langfuse = Langfuse(
    public_key="pk-lf-...",  # From Langfuse dashboard
    secret_key="sk-lf-...",  # From Langfuse dashboard
    host="https://langfuse.yourdomain.com"
)

# This OpenAI client automatically traces all calls
client = openai.OpenAI(api_key="sk-...")

def answer_devops_question(question: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You are a DevOps expert."},
            {"role": "user", "content": question}
        ],
        name="devops-qa",  # Shows up in Langfuse traces
    )
    return response.choices[0].message.content

result = answer_devops_question("What is a Kubernetes PodDisruptionBudget?")
Every call is now automatically traced in Langfuse with:
- Input/output text
- Token counts
- Latency
- Cost (auto-calculated per model pricing)
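Langfuse resolves cost from its built-in model price list; the arithmetic itself is simple. Here is a sketch with hypothetical per-million-token prices (real prices live in Langfuse's model registry and change over time):

```python
# Hypothetical per-1M-token prices in USD -- illustrative only; Langfuse
# maintains its own model price registry, which is what the dashboard uses.
PRICES = {
    "gpt-4o": {"input": 2.50, "output": 10.00},
    "gpt-4o-mini": {"input": 0.15, "output": 0.60},
}

def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one LLM call in USD, given token counts."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

print(round(call_cost("gpt-4o", 1200, 300), 6))  # 0.006
```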
Step 7: Create Evaluations
# Score your outputs manually or with an LLM judge
from langfuse import Langfuse

langfuse = Langfuse(...)

# After getting a trace ID, add a score
langfuse.score(
    trace_id="trace-id-from-response",
    name="quality",
    value=0.9,  # 0-1 scale
    comment="Good explanation, could have included examples"
)
For automated evaluation, set up LLM-as-judge:
from langfuse.decorators import observe

@observe()
def evaluate_response(response: str, criteria: str) -> float:
    # Reuses the traced OpenAI client from Step 6
    judge_response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": f"Rate this response 0-1 for {criteria}. Reply with only the number:\n{response}"
        }]
    )
    # float() raises ValueError on free-form replies; clamp to the 0-1 scale
    score = float(judge_response.choices[0].message.content.strip())
    return min(1.0, max(0.0, score))
What to Monitor in Production
Once running, watch these in the Langfuse dashboard:
| Metric | What to Watch For |
|---|---|
| Latency p95 | Spikes indicate slow model or network issues |
| Error rate | API errors, context length exceeded, rate limits |
| Cost per trace | Sudden spikes = someone passing huge contexts |
| Token usage | Input vs output token ratio |
| Quality scores | Trend down = prompt or model regression |
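The dashboard computes these for you, but it helps to know what p95 actually measures. A nearest-rank percentile sketch over raw latencies shows how a single slow outlier dominates p95 while leaving the median untouched:

```python
import math

def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile: the value at or below which pct% of samples fall."""
    ranked = sorted(values)
    rank = math.ceil(pct / 100 * len(ranked))
    return ranked[max(0, rank - 1)]

# Nine healthy calls plus one pathological 4.2s outlier
latencies_ms = [120, 135, 140, 150, 155, 160, 170, 180, 900, 4200]
print(percentile(latencies_ms, 50))  # 155 -- median looks fine
print(percentile(latencies_ms, 95))  # 4200 -- p95 catches the outlier
```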
Learn More
Langfuse is rapidly becoming the standard for LLM observability in production. Check the Langfuse documentation for advanced features like prompt management and dataset evaluation. For building production LLM applications end-to-end, LLM Engineering on Udemy covers observability patterns in depth.
With Langfuse on Kubernetes, you finally have production-grade visibility into what your LLM applications are actually doing.