How to Implement Canary Deployments with Flagger on Kubernetes (2026)
Flagger automates canary deployments on Kubernetes — progressively shifting traffic to new versions and rolling back automatically if metrics degrade. This step-by-step guide shows you how to set it up with Nginx Ingress.
Deploying a new version of your service to 100% of production traffic all at once is a risk. Even with good testing, you don't know how new code behaves under real production traffic until it's live.
Canary deployments solve this: you shift a small percentage of traffic (say, 5%) to the new version first. If it behaves well — no increase in errors, no latency regression — you gradually increase to 10%, 20%, 50%, then 100%. If anything goes wrong, you roll back to the old version automatically.
Flagger is the Kubernetes tool that automates this entire process. It watches your metrics and handles the traffic shifting, promotion, and rollback — without human intervention.
What Is Flagger?
Flagger is a CNCF project that automates progressive delivery on Kubernetes. It supports:
- Canary releases: gradual traffic shifting
- A/B testing: traffic routing based on HTTP headers or cookies
- Blue/Green: traffic switching with testing
Flagger integrates with:
- Traffic routing: Nginx Ingress, Istio, Linkerd, Contour, Traefik
- Metrics: Prometheus, Datadog, New Relic, CloudWatch
- Notifications: Slack, Teams, Discord, generic webhooks
This guide uses Nginx Ingress + Prometheus — the most common setup.
How Flagger Canary Works
When you create a Canary resource, Flagger:
- Creates a primary deployment (the current stable version)
- Creates a canary deployment (the new version under test)
- Adjusts Ingress rules to split traffic: e.g., 95% → primary, 5% → canary
- Checks your Prometheus metrics every analysis interval
- If metrics are healthy: increases the canary weight (5% → 10% → 20%... → 100%)
- If metrics degrade: automatically rolls back to 100% primary
- If promotion succeeds: primary becomes the new version, the canary is cleaned up
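The promotion timeline is simple arithmetic over the analysis settings. A quick local sketch (the step and cap values are illustrative, mirroring the stepWeight/maxWeight settings used later in this guide):

```shell
# Simulate Flagger's weight progression: add stepWeight percent per
# healthy check until maxWeight is reached, then promote.
STEP=10   # stepWeight: percent added after each healthy check
MAX=50    # maxWeight: cap before full promotion
weight=0
checks=0
while [ "$weight" -lt "$MAX" ]; do
  weight=$((weight + STEP))
  checks=$((checks + 1))
  echo "check ${checks}: canary weight ${weight}%"
done
echo "after ${checks} healthy checks: promote, route 100% to new primary"
```

With a 1-minute analysis interval, these settings mean a healthy release reaches promotion in about five minutes.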
Deploy new version
        │
        ▼
[Canary: 5% traffic] ── check metrics ── OK ──▶ [10%] ──▶ [20%] ──▶ [100%] ──▶ PROMOTED
        │
    DEGRADED
        │
        ▼
ROLLBACK (0% canary)
Prerequisites
- Kubernetes cluster (1.24+)
- Nginx Ingress Controller installed
- Prometheus installed (for metrics analysis)
- Helm installed
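A quick preflight check that the CLI tools are available can save a failed first step (a minimal sketch; it only checks for the binaries on PATH, not cluster access):

```shell
# Preflight: verify the required CLIs are installed.
missing=0
for tool in kubectl helm; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found"
  else
    echo "$tool: missing"
    missing=$((missing + 1))
  fi
done
echo "missing tools: $missing"
```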
Step 1: Install Flagger
helm repo add flagger https://flagger.app
helm repo update
# Install Flagger with Nginx Ingress provider
helm install flagger flagger/flagger \
  --namespace flagger-system \
  --create-namespace \
  --set meshProvider=nginx \
  --set metricsServer=http://prometheus.monitoring.svc.cluster.local:9090
Install the Prometheus addon (if you don't have Prometheus yet):
helm install flagger-prometheus flagger/prometheus \
  --namespace flagger-system
Verify Flagger is running:
kubectl get pods -n flagger-system
# NAME                     READY   STATUS    RESTARTS
# flagger-xxx              1/1     Running   0
# flagger-prometheus-xxx   1/1     Running   0
Step 2: Set Up Your Deployment
Your app needs a standard Kubernetes Deployment and Service. Flagger will manage the canary variants automatically.
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: production
  labels:
    app: my-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
        prometheus.io/path: "/metrics"
    spec:
      containers:
        - name: app
          image: my-app:1.0.0
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 256Mi
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: my-app
  namespace: production
spec:
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
# ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app
  namespace: production
spec:
  ingressClassName: nginx
  rules:
    - host: myapp.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app
                port:
                  number: 80
Apply them:
kubectl apply -f deployment.yaml
kubectl apply -f ingress.yaml
Step 3: Create the Flagger Canary Resource
This is where you define the canary strategy:
# canary.yaml
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: my-app
  namespace: production
spec:
  # The deployment Flagger will manage
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  # The Ingress to modify for traffic splitting
  ingressRef:
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    name: my-app
  # Canary analysis settings
  analysis:
    # Check metrics every 60 seconds
    interval: 1m
    # Maximum number of failed metric checks before rollback
    threshold: 3
    maxWeight: 50   # cap canary at 50% before full promotion
    stepWeight: 10  # increase by 10% each step
    # Prometheus metrics to check
    metrics:
      - name: request-success-rate
        # Must stay above 99% success rate
        thresholdRange:
          min: 99
        interval: 1m
      - name: request-duration
        # P99 latency must stay below 500ms
        thresholdRange:
          max: 500
        interval: 1m
    # Slack alerts (requires an AlertProvider named "slack" —
    # see the Slack Notifications section below)
    alerts:
      - name: slack
        severity: info
        providerRef:
          name: slack
          namespace: flagger-system
Apply the Canary:
kubectl apply -f canary.yaml
Flagger immediately takes over your Deployment. Check what it created:
kubectl get canary -n production
# NAME     STATUS        WEIGHT   LASTTRANSITIONTIME
# my-app   Initialized   0        2026-03-16T10:00:00Z
kubectl get deployments -n production
# NAME             READY   UP-TO-DATE
# my-app           0/0     0   ← your original (scaled down, used as the canary)
# my-app-primary   2/2     2   ← created by Flagger (stable version)
Step 4: Trigger a Canary Release
To trigger the canary process, update your Deployment's image:
kubectl set image deployment/my-app app=my-app:2.0.0 -n production
Or update via your CI/CD pipeline. Flagger detects the image change and starts the analysis.
Watch the progression:
kubectl describe canary my-app -n production
You'll see events like:
Events:
Normal Synced 1m flagger New revision detected! Scaling up my-app.production
Normal Synced 2m flagger Starting canary analysis for my-app.production
Normal Synced 3m flagger Advance my-app.production canary weight 10
Normal Synced 4m flagger Advance my-app.production canary weight 20
Normal Synced 5m flagger Advance my-app.production canary weight 30
Normal Synced 6m flagger Advance my-app.production canary weight 40
Normal Synced 7m flagger Advance my-app.production canary weight 50
Normal Synced 8m flagger Copying my-app.production template spec to my-app-primary.production
Normal Synced 9m flagger Routing all traffic to primary
Normal Synced 10m flagger Promotion completed! Scaling down my-app.production
The canary was promoted. Your app is now running 2.0.0 as the primary.
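If you want a human approval step before the final promotion, Flagger supports gate webhooks: the analysis runs automatically, but promotion waits until the gate is opened. A sketch, assuming Flagger's companion loadtester (which exposes gate endpoints) is installed in flagger-system — adjust the namespace to your setup:

```yaml
# In the Canary spec, under spec.analysis — promotion pauses until the
# gate reports open (hit the loadtester's /gate/open endpoint to approve).
webhooks:
  - name: promotion-gate
    type: confirm-promotion
    url: http://flagger-loadtester.flagger-system/gate/check
```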
Step 5: Observe a Rollback
If your new version is bad, Flagger rolls back automatically. To simulate this, deploy a broken version:
kubectl set image deployment/my-app app=my-app:broken -n production
Flagger starts the canary. When the error rate exceeds your threshold, you'll see:
Events:
Normal Synced 1m flagger New revision detected! Scaling up my-app.production
Normal Synced 2m flagger Starting canary analysis for my-app.production
Warning Synced 3m flagger Halt my-app.production advancement success rate 87.50% < 99%
Warning Synced 4m flagger Halt my-app.production advancement success rate 82.30% < 99%
Warning Synced 5m flagger Halt my-app.production advancement success rate 79.10% < 99%
Warning Synced 6m flagger Rolling back my-app.production failed checks threshold reached 3
Warning Synced 7m flagger Canary failed! Scaling down my-app.production
All traffic returns to the stable primary. No human intervention needed.
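One caveat: the analysis can only judge metrics if the canary actually receives requests. On low-traffic services, Flagger's companion loadtester can generate synthetic traffic during each check via a rollout webhook (a sketch; the loadtester address and hey command follow the defaults from Flagger's docs — adjust the namespace and URL to your setup):

```yaml
# In the Canary spec, under spec.analysis — run a load test against the
# canary service for the duration of each analysis check.
webhooks:
  - name: load-test
    type: rollout
    url: http://flagger-loadtester.flagger-system/
    metadata:
      cmd: "hey -z 1m -q 10 -c 2 http://my-app-canary.production/"
```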
Custom Prometheus Metrics
Define custom metric checks based on your application's Prometheus data:
analysis:
  metrics:
    # Built-in success rate metric
    - name: request-success-rate
      thresholdRange:
        min: 99
      interval: 1m
    # Custom metric: database query latency
    - name: db-query-latency
      templateRef:
        name: db-latency
        namespace: flagger-system
      thresholdRange:
        max: 100  # max 100ms P99 DB query time
      interval: 1m
Create the metric template:
apiVersion: flagger.app/v1beta1
kind: MetricTemplate
metadata:
  name: db-latency
  namespace: flagger-system
spec:
  provider:
    type: prometheus
    address: http://prometheus.monitoring.svc.cluster.local:9090
  query: |
    histogram_quantile(0.99,
      sum(
        rate(
          db_query_duration_seconds_bucket{
            app="{{ target }}",
            namespace="{{ namespace }}"
          }[{{ interval }}]
        )
      ) by (le)
    ) * 1000
Integration with ArgoCD (GitOps)
If you're using ArgoCD, Flagger integrates naturally. Your Git repo contains:
k8s/
  production/
    deployment.yaml
    service.yaml
    ingress.yaml
    canary.yaml   # Flagger Canary resource
ArgoCD syncs these to the cluster. When your CI pipeline builds a new image and updates deployment.yaml with the new tag, ArgoCD syncs the change. Flagger detects the image update and runs the canary analysis automatically.
No manual steps. No human traffic management. Full GitOps with automated progressive delivery.
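The CI side of that loop is typically a one-line manifest edit. A sketch of the tag-bump step (the file content and sed pattern are illustrative; many teams use yq or kustomize edit set image instead):

```shell
# CI step sketch: write a stand-in manifest, then bump its image tag the
# way a pipeline would before committing the change to the GitOps repo.
MANIFEST="deployment.yaml"
NEW_TAG="2.0.0"
printf '    image: my-app:1.0.0\n' > "$MANIFEST"   # stand-in for the real file
sed -i "s|image: my-app:.*|image: my-app:${NEW_TAG}|" "$MANIFEST"
cat "$MANIFEST"
```

After the commit lands, ArgoCD syncs the manifest and Flagger takes over the rollout.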
Slack Notifications
Flagger sends notifications through an AlertProvider resource, referenced from the canary's analysis.alerts field (a global Slack hook can also be set at install time via the Helm chart's slack.url value):
apiVersion: flagger.app/v1beta1
kind: AlertProvider
metadata:
  name: slack
  namespace: flagger-system
spec:
  type: slack
  channel: deployments
  username: flagger
  # Slack incoming webhook URL
  address: https://hooks.slack.com/services/YOUR/WEBHOOK/URL
---
# In canary.yaml, under spec.analysis:
alerts:
  - name: slack
    severity: info
    providerRef:
      name: slack
      namespace: flagger-system
You'll get Slack messages for:
- Canary started (new revision detected)
- Each traffic weight increase
- Successful promotion
- Rollback triggered
Quick Reference
# Watch canary status
kubectl get canary -n production -w
# See events
kubectl describe canary my-app -n production
# Skip analysis and promote the next release immediately
kubectl patch canary my-app -n production --type merge -p '{"spec":{"skipAnalysis":true}}'
# Delete a canary (restores the original deployment)
kubectl delete canary my-app -n production
Learn More
Want to learn progressive delivery, GitOps, and Kubernetes deployment patterns with hands-on labs? KodeKloud's Kubernetes courses cover Flagger, ArgoCD, and the full GitOps workflow with real cluster environments.
Summary
Flagger gives you production-grade canary deployments without writing a single traffic management script:
- Install Flagger with Nginx Ingress + Prometheus
- Deploy your app with standard K8s Deployment + Service + Ingress
- Create a Canary resource defining analysis interval, traffic step, and metric thresholds
- Push a new image — Flagger handles the rest
- Auto-promotion if metrics stay healthy, auto-rollback if they degrade
The result: every production deployment is automatically a canary deployment. Bad code gets caught before it reaches 100% of users. Good code gets promoted without human intervention.
That's how elite teams deploy confidently at high frequency.