
KEDA: The Complete Guide to Kubernetes Event-Driven Autoscaling (2026)

KEDA lets Kubernetes scale workloads based on any external event source — Kafka, RabbitMQ, SQS, Redis, HTTP, and 60+ more. This guide covers architecture, installation, and real-world ScaledObject examples.

DevOpsBoys · Mar 15, 2026 · 6 min read

Kubernetes has built-in autoscaling — the Horizontal Pod Autoscaler (HPA) scales workloads based on CPU and memory. But CPU and memory are proxies. What you actually care about is your workload's real demand signal.

How many messages are in your Kafka queue? How long is your SQS backlog? How many items are waiting in RabbitMQ?

KEDA — Kubernetes Event Driven Autoscaler — answers those questions and scales your pods accordingly.


What Is KEDA?

KEDA is a CNCF graduated project that extends Kubernetes with event-driven autoscaling. It adds two things:

  1. A custom ScaledObject resource — you define what external metric to watch
  2. 60+ built-in scalers — pre-built integrations for every major event source

With KEDA, you can scale:

  • A consumer deployment based on Kafka consumer group lag
  • A worker pool based on RabbitMQ queue depth
  • A job runner based on SQS queue length
  • An API service based on Redis list length
  • A batch processor based on PostgreSQL row count
  • Any workload based on a Prometheus query result

And critically — KEDA can scale to zero. When there are no messages to process, it scales your deployment to 0 pods. When messages arrive, it scales back up within the next polling interval.


KEDA Architecture

┌─────────────────────────────────────────────────────────────┐
│                     KEDA Architecture                        │
│                                                             │
│  ┌──────────────┐     ┌──────────────┐     ┌────────────┐  │
│  │ ScaledObject │────▶│ KEDA Operator│────▶│   HPA      │  │
│  │ (your config)│     │              │     │ (built-in) │  │
│  └──────────────┘     └──────┬───────┘     └────────────┘  │
│                              │                              │
│                    ┌─────────▼─────────┐                   │
│                    │  External Metrics │                    │
│                    │  Server           │                   │
│                    └─────────┬─────────┘                   │
│                              │                              │
│           ┌──────────────────┼──────────────────┐          │
│           ▼                  ▼                  ▼          │
│        Kafka              RabbitMQ            AWS SQS       │
│        Redis              PostgreSQL          Prometheus    │
└─────────────────────────────────────────────────────────────┘

How it works:

  1. You create a ScaledObject referencing your Deployment and a trigger (e.g., Kafka topic)
  2. KEDA's Operator watches the ScaledObject
  3. KEDA queries the external metric (Kafka consumer lag, SQS depth, etc.)
  4. KEDA exposes the metric through its external metrics server and updates the HPA it manages
  5. The Kubernetes HPA scales the Deployment between 1 and maxReplicaCount, while KEDA itself handles the 0 ↔ 1 transition

KEDA acts as a metric adapter — it translates external event data into a format the Kubernetes HPA understands.


Installing KEDA

bash
# Using Helm (recommended)
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
 
helm install keda kedacore/keda \
  --namespace keda \
  --create-namespace

Verify installation:

bash
kubectl get pods -n keda
 
# Should show:
# keda-operator-xxx                       Running
# keda-operator-metrics-apiserver-xxx     Running
# keda-admission-webhooks-xxx             Running   (KEDA 2.8+)

Check the CRDs:

bash
kubectl get crd | grep keda
# scaledobjects.keda.sh
# scaledjobs.keda.sh
# triggerauthentications.keda.sh
# clustertriggerauthentications.keda.sh

ScaledObject: The Core Resource

A ScaledObject links your Deployment to a trigger. Here's the general structure:

yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-scaler
  namespace: default
spec:
  scaleTargetRef:
    name: my-deployment    # the Deployment to scale
 
  minReplicaCount: 0       # scale to zero when idle
  maxReplicaCount: 20      # maximum pods
 
  cooldownPeriod: 300      # seconds to wait before scaling down
  pollingInterval: 30      # check external metric every 30s
 
  triggers:
    - type: <scaler-type>
      metadata:
        <scaler-specific-config>

Real-World Examples

1. Scale Based on Kafka Consumer Lag

The most common KEDA use case — scale worker pods based on how many unprocessed messages are in a Kafka topic.

yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: kafka-consumer-scaler
  namespace: production
spec:
  scaleTargetRef:
    name: order-processor
  minReplicaCount: 1
  maxReplicaCount: 50
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: kafka-broker:9092
        consumerGroup: order-processors
        topic: orders
        lagThreshold: "100"       # scale up when lag > 100 messages per pod
        offsetResetPolicy: latest

What this does: If 500 messages of lag accumulate and the threshold is 100 per pod, KEDA scales to 5 pods. If 2000 messages queue up, it scales to 20 pods — still below the maxReplicaCount of 50. When the lag drains, it scales back down to 1 (minReplicaCount). Note that by default the Kafka scaler won't scale beyond the topic's partition count (override with allowIdleConsumers: true), since extra consumers in a group would sit idle.
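
The arithmetic above is just the HPA's standard formula fed with consumer lag instead of CPU. A quick sketch of the calculation — ceiling division, clamped to the replica bounds:

```shell
# desiredReplicas = ceil(lag / lagThreshold), clamped to [minReplicaCount, maxReplicaCount]
lag=500; threshold=100; min=1; max=50

desired=$(( (lag + threshold - 1) / threshold ))  # ceiling division: 500/100 -> 5
(( desired < min )) && desired=$min
(( desired > max )) && desired=$max

echo "$desired"   # prints 5
```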

For authentication with Kafka:

yaml
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: kafka-trigger-auth
  namespace: production
spec:
  secretTargetRef:
    - parameter: username
      name: kafka-secrets
      key: username
    - parameter: password
      name: kafka-secrets
      key: password
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: kafka-consumer-scaler
  namespace: production
spec:
  scaleTargetRef:
    name: order-processor
  triggers:
    - type: kafka
      authenticationRef:
        name: kafka-trigger-auth
      metadata:
        bootstrapServers: kafka-broker:9092
        consumerGroup: order-processors
        topic: orders
        lagThreshold: "100"
        sasl: plaintext
        tls: enable
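
The kafka-secrets Secret referenced by the TriggerAuthentication has to exist first, and its keys must match the key fields above. A minimal sketch with placeholder credentials:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: kafka-secrets
  namespace: production
type: Opaque
stringData:
  username: order-svc      # placeholder — your SASL username
  password: changeme       # placeholder — your SASL password
```
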

2. Scale Based on AWS SQS Queue Depth

yaml
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: aws-trigger-auth
  namespace: production
spec:
  podIdentity:
    provider: aws    # use IRSA (IAM Roles for Service Accounts)
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: sqs-worker-scaler
  namespace: production
spec:
  scaleTargetRef:
    name: sqs-worker
  minReplicaCount: 0       # scale to zero when queue empty
  maxReplicaCount: 30
  cooldownPeriod: 60
  triggers:
    - type: aws-sqs-queue
      authenticationRef:
        name: aws-trigger-auth
      metadata:
        queueURL: https://sqs.us-east-1.amazonaws.com/123456789/my-queue
        queueLength: "10"     # 1 pod per 10 messages
        awsRegion: us-east-1

The IAM role needs:

json
{
  "Effect": "Allow",
  "Action": [
    "sqs:GetQueueAttributes",
    "sqs:GetQueueUrl"
  ],
  "Resource": "arn:aws:sqs:us-east-1:123456789:my-queue"
}
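
With podIdentity: provider: aws, KEDA assumes the IAM role bound to its service account through IRSA. A hedged sketch of the resulting annotation, using a hypothetical role name (keda-sqs-reader):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: keda-operator
  namespace: keda
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789:role/keda-sqs-reader
```
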

3. Scale Based on RabbitMQ Queue Depth

yaml
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: rabbitmq-auth
  namespace: production
spec:
  secretTargetRef:
    - parameter: host
      name: rabbitmq-secret
      key: connection-string
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: rabbitmq-consumer-scaler
  namespace: production
spec:
  scaleTargetRef:
    name: email-sender
  minReplicaCount: 0
  maxReplicaCount: 10
  triggers:
    - type: rabbitmq
      authenticationRef:
        name: rabbitmq-auth
      metadata:
        protocol: amqp
        queueName: email-notifications
        mode: QueueLength
        value: "50"    # 1 pod per 50 messages
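
The host parameter resolves to a full AMQP connection string stored in the referenced Secret. A sketch with placeholder credentials:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: rabbitmq-secret
  namespace: production
type: Opaque
stringData:
  connection-string: amqp://user:changeme@rabbitmq.production.svc.cluster.local:5672/
```
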

4. Scale Based on Prometheus Metrics

KEDA can scale on any Prometheus query — perfect for custom business metrics.

yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: prometheus-scaler
  namespace: production
spec:
  scaleTargetRef:
    name: api-server
  minReplicaCount: 2
  maxReplicaCount: 20
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc.cluster.local:9090
        threshold: "100"
        query: sum(rate(http_requests_total{app="api-server"}[2m]))

This scales based on actual HTTP request rate from Prometheus — much more meaningful than CPU%.


5. Scale Based on Redis List Length

yaml
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: redis-auth
  namespace: production
spec:
  secretTargetRef:
    - parameter: password
      name: redis-secret
      key: password
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: redis-worker-scaler
  namespace: production
spec:
  scaleTargetRef:
    name: task-worker
  minReplicaCount: 0
  maxReplicaCount: 15
  triggers:
    - type: redis
      authenticationRef:
        name: redis-auth
      metadata:
        address: redis.production.svc.cluster.local:6379
        listName: task-queue
        listLength: "20"

ScaledJob: For Batch Workloads

For batch processing (not long-running services), use ScaledJob instead of ScaledObject. Each trigger event creates a new Job:

yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: image-processor-job
  namespace: production
spec:
  jobTargetRef:
    template:
      spec:
        containers:
          - name: processor
            image: my-image-processor:latest
        restartPolicy: Never
  minReplicaCount: 0
  maxReplicaCount: 50
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 3
  triggers:
    - type: aws-sqs-queue
      metadata:
        queueURL: https://sqs.us-east-1.amazonaws.com/123456789/image-jobs
        queueLength: "1"    # one job per message
        awsRegion: us-east-1

Each SQS message creates one Kubernetes Job. Perfect for image processing, PDF generation, report generation, etc.
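
For long-running jobs, the default strategy can over-provision because new jobs are created while earlier ones are still working through the queue. ScaledJob supports a scalingStrategy field for this; a sketch using the accurate strategy:

```yaml
spec:
  scalingStrategy:
    strategy: accurate   # account for already-running jobs when creating new ones
```
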


Monitoring KEDA

Check scaler status:

bash
# See all ScaledObjects
kubectl get scaledobjects -A
 
# Check a specific scaler
kubectl describe scaledobject kafka-consumer-scaler -n production
 
# See HPA managed by KEDA
kubectl get hpa -n production

Watch scaling events:

bash
kubectl get events -n production --sort-by='.lastTimestamp' | grep -i keda

The KEDA operator exposes Prometheus metrics on port 8080:

keda_scaler_active             - Whether scaler is active
keda_scaler_metrics_value      - Current metric value from external source
keda_scaler_running_replicas   - Current replica count

KEDA Best Practices

1. Set appropriate cooldown periods

yaml
spec:
  cooldownPeriod: 300    # wait 5 minutes before scaling down

Scaling down too fast causes thrashing; scaling down too slow wastes resources. 300 seconds (also KEDA's default) is a good starting point for most workloads.

2. Use minReplicaCount: 1 for latency-sensitive services

Scale-to-zero is great for batch workers, but bad for services with latency requirements. Cold start from 0 pods takes time:

yaml
minReplicaCount: 1    # always keep one pod warm
maxReplicaCount: 50

3. Test your TriggerAuthentication separately

Before deploying a full ScaledObject, verify KEDA can authenticate and read the metric:

bash
kubectl describe scaledobject my-scaler -n production
# Check the Conditions section:
#   Ready=True  — KEDA can reach the scale target and the event source
#   Active=True — the trigger is currently firing

4. Set resource requests on scaled pods

KEDA relies on Kubernetes scheduling — if your pods don't have resource requests, the cluster autoscaler won't know to add nodes.
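
A minimal requests block on the scaled Deployment's container (values are illustrative):

```yaml
resources:
  requests:
    cpu: 250m
    memory: 256Mi
  limits:
    memory: 512Mi
```
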


Learn More

Want hands-on practice with KEDA in a real Kubernetes environment? KodeKloud's Kubernetes courses cover autoscaling, event-driven architectures, and production Kubernetes patterns with real lab exercises.


Summary

KEDA transforms Kubernetes autoscaling from "scale on CPU" to "scale on what actually matters":

Scaler              Scale Signal
Kafka               Consumer group lag
AWS SQS             Queue depth
RabbitMQ            Queue message count
Redis               List length
Prometheus          Any metric
Azure Service Bus   Message count
PostgreSQL          Row count query
HTTP                Pending request count

60+ scalers, scale-to-zero capability, and CNCF graduated maturity make KEDA the right tool for any Kubernetes workload that processes external events. If you're still using CPU-based HPA for queue workers, KEDA is your upgrade.
