KEDA: The Complete Guide to Kubernetes Event-Driven Autoscaling (2026)
KEDA lets Kubernetes scale workloads based on any external event source — Kafka, RabbitMQ, SQS, Redis, HTTP, and 60+ more. This guide covers architecture, installation, and real-world ScaledObject examples.
Kubernetes has built-in autoscaling — the Horizontal Pod Autoscaler (HPA) scales workloads based on CPU and memory. But CPU and memory are proxies. What you actually care about is your workload's real demand signal.
How many messages are in your Kafka queue? How long is your SQS backlog? How many items are waiting in RabbitMQ?
KEDA — Kubernetes Event Driven Autoscaler — answers those questions and scales your pods accordingly.
What Is KEDA?
KEDA is a CNCF graduated project that extends Kubernetes with event-driven autoscaling. It adds two things:
- A custom ScaledObject resource — you define what external metric to watch
- 60+ built-in scalers — pre-built integrations for every major event source
With KEDA, you can scale:
- A consumer deployment based on Kafka consumer group lag
- A worker pool based on RabbitMQ queue depth
- A job runner based on SQS queue length
- An API service based on Redis list length
- A batch processor based on PostgreSQL row count
- Any workload based on a Prometheus query result
And critically — KEDA can scale to zero. When there are no messages to process, it scales your deployment to 0 pods. When messages arrive, it scales back up within the polling interval.
KEDA Architecture
┌───────────────────────────────────────────────────────────────┐
│                       KEDA Architecture                       │
│                                                               │
│  ┌──────────────┐     ┌──────────────┐     ┌───────────────┐  │
│  │ ScaledObject │────▶│ KEDA Operator│────▶│      HPA      │  │
│  │ (your config)│     │              │     │ (K8s built-in)│  │
│  └──────────────┘     └──────┬───────┘     └───────────────┘  │
│                              │                                │
│                    ┌─────────▼─────────┐                      │
│                    │  External Metrics │                      │
│                    │      Server       │                      │
│                    └─────────┬─────────┘                      │
│                              │                                │
│           ┌──────────────────┼──────────────────┐             │
│           ▼                  ▼                  ▼             │
│         Kafka            RabbitMQ            AWS SQS          │
│         Redis           PostgreSQL         Prometheus         │
└───────────────────────────────────────────────────────────────┘
How it works:
- You create a ScaledObject referencing your Deployment and a trigger (e.g., a Kafka topic)
- KEDA's operator watches the ScaledObject and creates a matching HPA
- KEDA queries the external metric (Kafka consumer lag, SQS depth, etc.) and serves it to the HPA through the external metrics API
- The Kubernetes HPA scales the Deployment between 1 and maxReplicaCount; KEDA itself handles scaling between 0 and 1, which the HPA cannot do
KEDA acts as a metric adapter — it translates external event data into the format Kubernetes HPA understands.
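To make that translation concrete, here is a simplified sketch of the calculation the HPA performs once KEDA serves it an external metric with an AverageValue target. This is illustrative only, not KEDA's or Kubernetes' actual source — the real HPA also applies a tolerance band and stabilization windows:

```python
import math

def hpa_desired_replicas(metric_value: float, target_average_value: float) -> int:
    """Simplified HPA math for an external metric with an AverageValue
    target: scale so that metric_value / replicas <= target."""
    return math.ceil(metric_value / target_average_value)

# 350 queued messages with a target of 100 per pod -> 4 pods
print(hpa_desired_replicas(350, 100))
```

Every scaler in this guide ultimately reduces to this: the scaler reports one number, the trigger defines the per-pod target, and the HPA divides.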
Installing KEDA
# Using Helm (recommended)
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda \
--namespace keda \
  --create-namespace

Verify installation:
kubectl get pods -n keda
# Should show:
# keda-operator-xxx Running
# keda-operator-metrics-apiserver-xxx   Running

Check the CRDs:
kubectl get crd | grep keda
# scaledobjects.keda.sh
# scaledjobs.keda.sh
# triggerauthentications.keda.sh
# clustertriggerauthentications.keda.sh

ScaledObject: The Core Resource
A ScaledObject links your Deployment to a trigger. Here's the general structure:
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-scaler
  namespace: default
spec:
  scaleTargetRef:
    name: my-deployment    # the Deployment to scale
  minReplicaCount: 0       # scale to zero when idle
  maxReplicaCount: 20      # maximum pods
  cooldownPeriod: 300      # seconds to wait before scaling down
  pollingInterval: 30      # check external metric every 30s
  triggers:
    - type: <scaler-type>
      metadata:
        <scaler-specific-config>

Real-World Examples
1. Scale Based on Kafka Consumer Lag
The most common KEDA use case — scale worker pods based on how many unprocessed messages are in a Kafka topic.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: kafka-consumer-scaler
  namespace: production
spec:
  scaleTargetRef:
    name: order-processor
  minReplicaCount: 1
  maxReplicaCount: 50
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: kafka-broker:9092
        consumerGroup: order-processors
        topic: orders
        lagThreshold: "100"    # scale up when lag > 100 messages per pod
        offsetResetPolicy: latest

What this does: If 500 messages are queued and the threshold is 100 per pod, KEDA scales to 5 pods. If 2,000 messages queue up, it scales to 20 pods — and it will never exceed 50 (maxReplicaCount). When the queue drains, it scales back down to 1 (minReplicaCount).
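The arithmetic above can be sketched as a small helper. This is a hypothetical illustration of the lag-threshold math, not KEDA's actual code:

```python
import math

def desired_replicas(lag: int, lag_threshold: int,
                     min_replicas: int, max_replicas: int) -> int:
    """Replicas needed so each pod handles at most lag_threshold
    messages, clamped to the ScaledObject's min/max bounds."""
    needed = math.ceil(lag / lag_threshold)
    return max(min_replicas, min(needed, max_replicas))

print(desired_replicas(500, 100, 1, 50))    # 5 pods
print(desired_replicas(2000, 100, 1, 50))   # 20 pods
print(desired_replicas(0, 100, 1, 50))      # queue drained -> back to 1
```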
For authentication with Kafka:
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: kafka-trigger-auth
  namespace: production
spec:
  secretTargetRef:
    - parameter: username
      name: kafka-secrets
      key: username
    - parameter: password
      name: kafka-secrets
      key: password
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: kafka-consumer-scaler
  namespace: production
spec:
  scaleTargetRef:
    name: order-processor
  triggers:
    - type: kafka
      authenticationRef:
        name: kafka-trigger-auth
      metadata:
        bootstrapServers: kafka-broker:9092
        consumerGroup: order-processors
        topic: orders
        lagThreshold: "100"
        sasl: plaintext
        tls: enable

2. Scale Based on AWS SQS Queue Depth
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: aws-trigger-auth
  namespace: production
spec:
  podIdentity:
    provider: aws    # use IRSA (IAM Roles for Service Accounts)
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: sqs-worker-scaler
  namespace: production
spec:
  scaleTargetRef:
    name: sqs-worker
  minReplicaCount: 0    # scale to zero when queue empty
  maxReplicaCount: 30
  cooldownPeriod: 60
  triggers:
    - type: aws-sqs-queue
      authenticationRef:
        name: aws-trigger-auth
      metadata:
        queueURL: https://sqs.us-east-1.amazonaws.com/123456789/my-queue
        queueLength: "10"    # 1 pod per 10 messages
        awsRegion: us-east-1

The IAM role needs:
{
  "Effect": "Allow",
  "Action": [
    "sqs:GetQueueAttributes",
    "sqs:GetQueueUrl"
  ],
  "Resource": "arn:aws:sqs:us-east-1:123456789:my-queue"
}

3. Scale Based on RabbitMQ Queue Depth
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: rabbitmq-auth
  namespace: production
spec:
  secretTargetRef:
    - parameter: host
      name: rabbitmq-secret
      key: connection-string
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: rabbitmq-consumer-scaler
  namespace: production
spec:
  scaleTargetRef:
    name: email-sender
  minReplicaCount: 0
  maxReplicaCount: 10
  triggers:
    - type: rabbitmq
      authenticationRef:
        name: rabbitmq-auth
      metadata:
        protocol: amqp
        queueName: email-notifications
        mode: QueueLength
        value: "50"    # 1 pod per 50 messages

4. Scale Based on Prometheus Metrics
KEDA can scale on any Prometheus query — perfect for custom business metrics.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: prometheus-scaler
  namespace: production
spec:
  scaleTargetRef:
    name: api-server
  minReplicaCount: 2
  maxReplicaCount: 20
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc.cluster.local:9090
        metricName: http_requests_total
        threshold: "100"
        query: sum(rate(http_requests_total{app="api-server"}[2m]))

This scales based on actual HTTP request rate from Prometheus — much more meaningful than CPU%.
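To see why that query yields requests per second, here is roughly what rate() computes over its window — a simplified sketch, since real Prometheus also handles counter resets and extrapolates to the window edges:

```python
def simple_rate(samples):
    """Per-second rate of a monotonically increasing counter, given
    (timestamp_seconds, counter_value) samples within the window."""
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0)

# Counter went from 1,000 to 13,000 requests over 120 s -> 100 req/s,
# so with threshold "100" KEDA targets roughly one pod per 100 req/s.
print(simple_rate([(0, 1000), (60, 7000), (120, 13000)]))
```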
5. Scale Based on Redis List Length
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: redis-auth
  namespace: production
spec:
  secretTargetRef:
    - parameter: password
      name: redis-secret
      key: password
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: redis-worker-scaler
  namespace: production
spec:
  scaleTargetRef:
    name: task-worker
  minReplicaCount: 0
  maxReplicaCount: 15
  triggers:
    - type: redis
      authenticationRef:
        name: redis-auth
      metadata:
        address: redis.production.svc.cluster.local:6379
        listName: task-queue
        listLength: "20"

ScaledJob: For Batch Workloads
For batch processing (not long-running services), use ScaledJob instead of ScaledObject. Each trigger event creates a new Job:
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: image-processor-job
  namespace: production
spec:
  jobTargetRef:
    template:
      spec:
        containers:
          - name: processor
            image: my-image-processor:latest
        restartPolicy: Never
  minReplicaCount: 0
  maxReplicaCount: 50
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 3
  triggers:
    - type: aws-sqs-queue
      metadata:
        queueURL: https://sqs.us-east-1.amazonaws.com/123456789/image-jobs
        queueLength: "1"    # one job per message
        awsRegion: us-east-1

Each SQS message creates one Kubernetes Job. Perfect for image processing, PDF generation, report generation, etc.
Monitoring KEDA
Check scaler status:
# See all ScaledObjects
kubectl get scaledobjects -A
# Check a specific scaler
kubectl describe scaledobject kafka-consumer-scaler -n production
# See HPA managed by KEDA
kubectl get hpa -n production

Watch scaling events:

kubectl get events -n production --sort-by='.lastTimestamp' | grep -i keda

KEDA exposes Prometheus metrics at port 8080:
- keda_scaler_active — whether the scaler is active
- keda_scaler_metrics_value — current metric value from the external source
- keda_scaler_running_replicas — current replica count
KEDA Best Practices
1. Set appropriate cooldown periods
spec:
  cooldownPeriod: 300    # wait 5 minutes before scaling down

Scaling down too quickly causes thrashing; scaling down too slowly wastes resources. 300 seconds is a good default for most workloads.
2. Use minReplicaCount: 1 for latency-sensitive services
Scale-to-zero is great for batch workers, but bad for services with latency requirements. Cold start from 0 pods takes time:
minReplicaCount: 1 # always keep one pod warm
maxReplicaCount: 50

3. Test your TriggerAuthentication separately
Before deploying a full ScaledObject, verify KEDA can authenticate and read the metric:
kubectl describe scaledobject my-scaler -n production
# Look for: "Successfully set ScaleTarget Deployment"
# and: "Scaler kafka is active"

4. Set resource requests on scaled pods
KEDA relies on Kubernetes scheduling — if your pods don't have resource requests, the cluster autoscaler won't know to add nodes.
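For example, the pod template of the Deployment your ScaledObject targets should declare requests — the numbers below are placeholders to size for your own workload:

```yaml
# In the Deployment that the ScaledObject targets
resources:
  requests:
    cpu: 250m
    memory: 256Mi
  limits:
    memory: 512Mi
```

With requests in place, newly scaled pods that don't fit on existing nodes go Pending, which is the signal the cluster autoscaler uses to add capacity.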
Learn More
Want hands-on practice with KEDA in a real Kubernetes environment? KodeKloud's Kubernetes courses cover autoscaling, event-driven architectures, and production Kubernetes patterns with real lab exercises.
Summary
KEDA transforms Kubernetes autoscaling from "scale on CPU" to "scale on what actually matters":
| Scaler | Scale Signal |
|---|---|
| Kafka | Consumer group lag |
| AWS SQS | Queue depth |
| RabbitMQ | Queue message count |
| Redis | List length |
| Prometheus | Any metric |
| Azure Service Bus | Message count |
| PostgreSQL | Row count query |
| HTTP | Pending request count |
60+ scalers, scale-to-zero capability, and CNCF graduated maturity make KEDA the right tool for any Kubernetes workload that processes external events. If you're still using CPU-based HPA for queue workers, KEDA is your upgrade.