
KEDA: The Complete Guide to Kubernetes Event-Driven Autoscaling (2026)

KEDA lets Kubernetes scale workloads based on any external event source — Kafka, RabbitMQ, SQS, Redis, HTTP, and 60+ more. This guide covers architecture, installation, and real-world ScaledObject examples.

DevOpsBoys · Mar 15, 2026 · 6 min read

Kubernetes has built-in autoscaling — the Horizontal Pod Autoscaler (HPA) scales workloads based on CPU and memory. But CPU and memory are proxies. What you actually care about is your workload's real demand signal.

How many messages are in your Kafka queue? How long is your SQS backlog? How many items are waiting in RabbitMQ?

KEDA — Kubernetes Event Driven Autoscaler — answers those questions and scales your pods accordingly.


What Is KEDA?

KEDA is a CNCF graduated project that extends Kubernetes with event-driven autoscaling. It adds two things:

  1. A custom ScaledObject resource — you define what external metric to watch
  2. 60+ built-in scalers — pre-built integrations for every major event source

With KEDA, you can scale:

  • A consumer deployment based on Kafka consumer group lag
  • A worker pool based on RabbitMQ queue depth
  • A job runner based on SQS queue length
  • An API service based on Redis list length
  • A batch processor based on PostgreSQL row count
  • Any workload based on a Prometheus query result

And critically — KEDA can scale to zero. When there are no messages to process, it scales your deployment to 0 pods. When messages arrive, it scales back up within the next polling interval.


KEDA Architecture

┌─────────────────────────────────────────────────────────────┐
│                     KEDA Architecture                        │
│                                                             │
│  ┌──────────────┐     ┌──────────────┐     ┌────────────┐  │
│  │ ScaledObject │────▶│ KEDA Operator│────▶│   HPA      │  │
│  │ (your config)│     │              │     │ (built-in) │  │
│  └──────────────┘     └──────┬───────┘     └────────────┘  │
│                              │                              │
│                    ┌─────────▼─────────┐                   │
│                    │  External Metrics │                    │
│                    │  Server           │                   │
│                    └─────────┬─────────┘                   │
│                              │                              │
│           ┌──────────────────┼──────────────────┐          │
│           ▼                  ▼                  ▼          │
│        Kafka              RabbitMQ            AWS SQS       │
│        Redis              PostgreSQL          Prometheus    │
└─────────────────────────────────────────────────────────────┘

How it works:

  1. You create a ScaledObject referencing your Deployment and a trigger (e.g., Kafka topic)
  2. KEDA's Operator watches the ScaledObject
  3. KEDA queries the external metric (Kafka consumer lag, SQS depth, etc.)
  4. KEDA exposes the metric through its external metrics server and updates the HPA it manages
  5. The Kubernetes HPA scales the Deployment between 1 and maxReplicaCount, while KEDA itself handles the 0 ↔ 1 transition

KEDA acts as a metric adapter — it translates external event data into a format the Kubernetes HPA understands.


Installing KEDA

bash
# Using Helm (recommended)
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
 
helm install keda kedacore/keda \
  --namespace keda \
  --create-namespace

Verify installation:

bash
kubectl get pods -n keda
 
# Should show:
# keda-operator-xxx                       Running
# keda-operator-metrics-apiserver-xxx     Running
# keda-admission-webhooks-xxx             Running   (KEDA 2.8+)

Check the CRDs:

bash
kubectl get crd | grep keda
# scaledobjects.keda.sh
# scaledjobs.keda.sh
# triggerauthentications.keda.sh
# clustertriggerauthentications.keda.sh

ScaledObject: The Core Resource

A ScaledObject links your Deployment to a trigger. Here's the general structure:

yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-scaler
  namespace: default
spec:
  scaleTargetRef:
    name: my-deployment    # the Deployment to scale
 
  minReplicaCount: 0       # scale to zero when idle
  maxReplicaCount: 20      # maximum pods
 
  cooldownPeriod: 300      # seconds to wait before scaling down
  pollingInterval: 30      # check external metric every 30s
 
  triggers:
    - type: <scaler-type>
      metadata:
        <scaler-specific-config>

Real-World Examples

1. Scale Based on Kafka Consumer Lag

The most common KEDA use case — scale worker pods based on how many unprocessed messages are in a Kafka topic.

yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: kafka-consumer-scaler
  namespace: production
spec:
  scaleTargetRef:
    name: order-processor
  minReplicaCount: 1
  maxReplicaCount: 50
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: kafka-broker:9092
        consumerGroup: order-processors
        topic: orders
        lagThreshold: "100"       # scale up when lag > 100 messages per pod
        offsetResetPolicy: latest

What this does: If 500 messages of lag accumulate and the threshold is 100 per pod, KEDA scales to 5 pods. If 2000 messages queue up, it scales to 20 pods — still below the maxReplicaCount of 50. When the lag drains, it scales back down to 1 (minReplicaCount). Note that by default the Kafka scaler won't scale beyond the topic's partition count (override with allowIdleConsumers: true), since extra consumers in a group would sit idle.
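
The arithmetic above is just the HPA's standard formula fed with consumer lag instead of CPU. A quick sketch of the calculation — ceiling division, clamped to the replica bounds:

```shell
# desiredReplicas = ceil(lag / lagThreshold), clamped to [minReplicaCount, maxReplicaCount]
lag=500; threshold=100; min=1; max=50

desired=$(( (lag + threshold - 1) / threshold ))  # ceiling division: 500/100 -> 5
(( desired < min )) && desired=$min
(( desired > max )) && desired=$max

echo "$desired"   # prints 5
```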

For authentication with Kafka:

yaml
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: kafka-trigger-auth
  namespace: production
spec:
  secretTargetRef:
    - parameter: username
      name: kafka-secrets
      key: username
    - parameter: password
      name: kafka-secrets
      key: password
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: kafka-consumer-scaler
  namespace: production
spec:
  scaleTargetRef:
    name: order-processor
  triggers:
    - type: kafka
      authenticationRef:
        name: kafka-trigger-auth
      metadata:
        bootstrapServers: kafka-broker:9092
        consumerGroup: order-processors
        topic: orders
        lagThreshold: "100"
        sasl: plaintext
        tls: enable
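
The kafka-secrets Secret referenced by the TriggerAuthentication has to exist first, and its keys must match the key fields above. A minimal sketch with placeholder credentials:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: kafka-secrets
  namespace: production
type: Opaque
stringData:
  username: order-svc      # placeholder — your SASL username
  password: changeme       # placeholder — your SASL password
```
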

2. Scale Based on AWS SQS Queue Depth

yaml
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: aws-trigger-auth
  namespace: production
spec:
  podIdentity:
    provider: aws    # use IRSA (IAM Roles for Service Accounts)
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: sqs-worker-scaler
  namespace: production
spec:
  scaleTargetRef:
    name: sqs-worker
  minReplicaCount: 0       # scale to zero when queue empty
  maxReplicaCount: 30
  cooldownPeriod: 60
  triggers:
    - type: aws-sqs-queue
      authenticationRef:
        name: aws-trigger-auth
      metadata:
        queueURL: https://sqs.us-east-1.amazonaws.com/123456789/my-queue
        queueLength: "10"     # 1 pod per 10 messages
        awsRegion: us-east-1

The IAM role needs:

json
{
  "Effect": "Allow",
  "Action": [
    "sqs:GetQueueAttributes",
    "sqs:GetQueueUrl"
  ],
  "Resource": "arn:aws:sqs:us-east-1:123456789:my-queue"
}
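
With podIdentity: provider: aws, KEDA assumes the IAM role bound to its service account through IRSA. A hedged sketch of the resulting annotation, using a hypothetical role name (keda-sqs-reader):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: keda-operator
  namespace: keda
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789:role/keda-sqs-reader
```
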

3. Scale Based on RabbitMQ Queue Depth

yaml
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: rabbitmq-auth
  namespace: production
spec:
  secretTargetRef:
    - parameter: host
      name: rabbitmq-secret
      key: connection-string
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: rabbitmq-consumer-scaler
  namespace: production
spec:
  scaleTargetRef:
    name: email-sender
  minReplicaCount: 0
  maxReplicaCount: 10
  triggers:
    - type: rabbitmq
      authenticationRef:
        name: rabbitmq-auth
      metadata:
        protocol: amqp
        queueName: email-notifications
        mode: QueueLength
        value: "50"    # 1 pod per 50 messages
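
The host parameter resolves to a full AMQP connection string stored in the referenced Secret. A sketch with placeholder credentials:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: rabbitmq-secret
  namespace: production
type: Opaque
stringData:
  connection-string: amqp://user:changeme@rabbitmq.production.svc.cluster.local:5672/
```
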

4. Scale Based on Prometheus Metrics

KEDA can scale on any Prometheus query — perfect for custom business metrics.

yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: prometheus-scaler
  namespace: production
spec:
  scaleTargetRef:
    name: api-server
  minReplicaCount: 2
  maxReplicaCount: 20
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc.cluster.local:9090
        threshold: "100"
        query: sum(rate(http_requests_total{app="api-server"}[2m]))

This scales based on actual HTTP request rate from Prometheus — much more meaningful than CPU%.


5. Scale Based on Redis List Length

yaml
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: redis-auth
  namespace: production
spec:
  secretTargetRef:
    - parameter: password
      name: redis-secret
      key: password
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: redis-worker-scaler
  namespace: production
spec:
  scaleTargetRef:
    name: task-worker
  minReplicaCount: 0
  maxReplicaCount: 15
  triggers:
    - type: redis
      authenticationRef:
        name: redis-auth
      metadata:
        address: redis.production.svc.cluster.local:6379
        listName: task-queue
        listLength: "20"

ScaledJob: For Batch Workloads

For batch processing (not long-running services), use ScaledJob instead of ScaledObject. Each trigger event creates a new Job:

yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: image-processor-job
  namespace: production
spec:
  jobTargetRef:
    template:
      spec:
        containers:
          - name: processor
            image: my-image-processor:latest
        restartPolicy: Never
  minReplicaCount: 0
  maxReplicaCount: 50
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 3
  triggers:
    - type: aws-sqs-queue
      metadata:
        queueURL: https://sqs.us-east-1.amazonaws.com/123456789/image-jobs
        queueLength: "1"    # one job per message
        awsRegion: us-east-1

Each SQS message creates one Kubernetes Job. Perfect for image processing, PDF generation, report generation, etc.
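
For long-running jobs, the default strategy can over-provision because new jobs are created while earlier ones are still working through the queue. ScaledJob supports a scalingStrategy field for this; a sketch using the accurate strategy:

```yaml
spec:
  scalingStrategy:
    strategy: accurate   # account for already-running jobs when creating new ones
```
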


Monitoring KEDA

Check scaler status:

bash
# See all ScaledObjects
kubectl get scaledobjects -A
 
# Check a specific scaler
kubectl describe scaledobject kafka-consumer-scaler -n production
 
# See HPA managed by KEDA
kubectl get hpa -n production

Watch scaling events:

bash
kubectl get events -n production --sort-by='.lastTimestamp' | grep -i keda

The KEDA operator exposes Prometheus metrics on port 8080:

keda_scaler_active             - Whether scaler is active
keda_scaler_metrics_value      - Current metric value from external source
keda_scaler_running_replicas   - Current replica count

KEDA Best Practices

1. Set appropriate cooldown periods

yaml
spec:
  cooldownPeriod: 300    # wait 5 minutes before scaling down

Scaling down too fast causes thrashing; scaling down too slow wastes resources. 300 seconds (also KEDA's default) is a good starting point for most workloads.

2. Use minReplicaCount: 1 for latency-sensitive services

Scale-to-zero is great for batch workers, but bad for services with latency requirements. Cold start from 0 pods takes time:

yaml
minReplicaCount: 1    # always keep one pod warm
maxReplicaCount: 50

3. Test your TriggerAuthentication separately

Before deploying a full ScaledObject, verify KEDA can authenticate and read the metric:

bash
kubectl describe scaledobject my-scaler -n production
# Check the Conditions section:
#   Ready=True  — KEDA can reach the scale target and the event source
#   Active=True — the trigger is currently firing

4. Set resource requests on scaled pods

KEDA relies on Kubernetes scheduling — if your pods don't have resource requests, the cluster autoscaler won't know to add nodes.
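
A minimal requests block on the scaled Deployment's container (values are illustrative):

```yaml
resources:
  requests:
    cpu: 250m
    memory: 256Mi
  limits:
    memory: 512Mi
```
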


Learn More

Want hands-on practice with KEDA in a real Kubernetes environment? KodeKloud's Kubernetes courses cover autoscaling, event-driven architectures, and production Kubernetes patterns with real lab exercises.


Summary

KEDA transforms Kubernetes autoscaling from "scale on CPU" to "scale on what actually matters":

Scaler              Scale Signal
Kafka               Consumer group lag
AWS SQS             Queue depth
RabbitMQ            Queue message count
Redis               List length
Prometheus          Any metric
Azure Service Bus   Message count
PostgreSQL          Row count query
HTTP                Pending request count

60+ scalers, scale-to-zero capability, and CNCF graduated maturity make KEDA the right tool for any Kubernetes workload that processes external events. If you're still using CPU-based HPA for queue workers, KEDA is your upgrade.
