What is a Kubernetes Operator? Explained Simply (2026)
Kubernetes Operators sound complex but they solve a simple problem: automating the management of stateful applications. Here's what they are and how they work.
You've heard "Kubernetes Operator" thrown around — Prometheus Operator, Cert-Manager, Strimzi Kafka Operator. But what actually is an operator? Why does Kubernetes need them?
The Problem Operators Solve
Kubernetes is great at managing stateless apps. A Deployment handles rolling updates, restarts crashed pods, scales up and down. Simple.
But stateful applications are different. Consider PostgreSQL:
- You can't just kill a primary and start a new one — you'll lose data
- Adding a replica requires running specific pg_basebackup commands
- Failover requires promoting a replica and updating connection strings
- Schema migrations have to run in the right order
Kubernetes doesn't know any of this. A Deployment just restarts pods.
An Operator encodes the human operational knowledge for running a complex application. It automates the things a database administrator would do manually.
What an Operator Actually Is
An Operator is two things:
- Custom Resource Definition (CRD) — extends Kubernetes with new resource types
- Controller — a program that watches those custom resources and takes action
Instead of running a PostgreSQL container yourself, you create a custom resource:
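    apiVersion: postgresql.cnpg.io/v1
    kind: Cluster
    metadata:
      name: my-postgres
    spec:
      instances: 3
      # CloudNativePG takes the PostgreSQL version from the image tag;
      # there is no spec.postgresql.version field.
      imageName: ghcr.io/cloudnative-pg/postgresql:16
      storage:
        size: 100Gi
      backup:
        target: prefer-standby
        retentionPolicy: "30d"

The operator (CloudNativePG in this example) sees this resource and takes over work such as: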
- Creates 3 pods (1 primary, 2 replicas)
- Sets up streaming replication
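- Configures connection pooling (in CloudNativePG, declared through a separate Pooler resource)
- Runs automated backups once a backup destination and a ScheduledBackup resource are defined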
- Handles failover if the primary dies
You describe what you want. The operator figures out how.
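To make "the operator figures out how" concrete, here is a minimal sketch of one piece of operational knowledge written as code: deciding which replica to promote when the primary dies. The `Instance` type, its fields, and the "least replication lag wins" rule are assumptions for illustration, not the logic of any particular Postgres operator.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instance:
    name: str
    role: str              # "primary" or "replica"
    healthy: bool
    replay_lag_bytes: int  # hypothetical metric: how far behind the primary


def choose_new_primary(instances: list[Instance]) -> Optional[str]:
    """Return the replica to promote, or None if no failover is needed or possible."""
    primary = next((i for i in instances if i.role == "primary"), None)
    if primary is not None and primary.healthy:
        return None  # primary is fine, nothing to do

    candidates = [i for i in instances if i.role == "replica" and i.healthy]
    if not candidates:
        return None  # no safe candidate; a real operator would alert here

    # Prefer the replica that has lost the least data; break ties by name.
    best = min(candidates, key=lambda i: (i.replay_lag_bytes, i.name))
    return best.name


if __name__ == "__main__":
    cluster = [
        Instance("my-postgres-1", "primary", healthy=False, replay_lag_bytes=0),
        Instance("my-postgres-2", "replica", healthy=True, replay_lag_bytes=500),
        Instance("my-postgres-3", "replica", healthy=True, replay_lag_bytes=10),
    ]
    print(choose_new_primary(cluster))  # my-postgres-3
```

A production operator does considerably more (fencing the old primary, updating Services, rejoining the failed node), but the idea is the same: a runbook step becomes a function the controller can call at any hour.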
How Controllers Work
The controller loop is simple:
1. Watch for changes to Custom Resources
2. Compare desired state (what you wrote in YAML) vs actual state (what's running)
3. Take action to reconcile the difference
4. Repeat forever
This is called the reconciliation loop — the same pattern Kubernetes uses for built-in resources.
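The four steps above can be sketched in a few lines. This is a toy, in-memory version under stated assumptions: the "cluster" is just a Python list of pod names, and `reconcile`, `apply_actions`, and `run_until_converged` are hypothetical helpers. A real controller watches the Kubernetes API and is driven by events rather than a plain loop.

```python
def reconcile(desired_instances: int, running: list[str], cluster_name: str) -> list[tuple[str, str]]:
    """Compare desired state with actual state and return the actions needed."""
    expected = [f"{cluster_name}-{i}" for i in range(1, desired_instances + 1)]
    actions = []
    for name in expected:
        if name not in running:
            actions.append(("create", name))
    for name in running:
        if name not in expected:
            actions.append(("delete", name))
    return actions


def apply_actions(actions: list[tuple[str, str]], running: list[str]) -> None:
    """Stand-in for real API calls: mutate the in-memory list of pods."""
    for verb, name in actions:
        if verb == "create":
            running.append(name)
        elif verb == "delete":
            running.remove(name)


def run_until_converged(desired_instances: int, running: list[str],
                        cluster_name: str, max_rounds: int = 10) -> int:
    """Repeat compare-and-act until nothing is left to do; return rounds that acted."""
    for rounds in range(max_rounds):
        actions = reconcile(desired_instances, running, cluster_name)
        if not actions:
            return rounds
        apply_actions(actions, running)
    raise RuntimeError("state did not converge")


if __name__ == "__main__":
    pods = ["my-postgres-1"]
    run_until_converged(3, pods, "my-postgres")
    print(pods)  # ['my-postgres-1', 'my-postgres-2', 'my-postgres-3']
```

Note that `reconcile` only looks at the current desired and actual state, never at what changed. That makes it safe to run again after a crash or a missed event, which is why the pattern is robust.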
Real-World Operators You Already Use
cert-manager: You create a Certificate resource. cert-manager handles the ACME challenges, obtains the TLS certificate from an issuer such as Let's Encrypt, stores it in a Kubernetes Secret, and renews it before expiry.
Prometheus Operator: You create ServiceMonitor and PrometheusRule resources. The operator configures Prometheus scraping and alerting automatically. No manual prometheus.yml editing.
Argo CD: You create Application resources. Argo CD syncs your cluster to match the state declared in Git.
Strimzi: You create Kafka and KafkaTopic resources. Strimzi manages the full Kafka cluster lifecycle.
Crossplane: You create resources such as RDSInstance or Bucket (the exact kinds depend on the provider and its version). Crossplane provisions the corresponding AWS, GCP, or Azure resources.
Custom Resource Definition Example
    # Define the new resource type
    apiVersion: apiextensions.k8s.io/v1
    kind: CustomResourceDefinition
    metadata:
      name: backups.database.example.com  # must be <plural>.<group>
    spec:
      group: database.example.com
      names:
        kind: Backup
        plural: backups
      scope: Namespaced
      versions:
        - name: v1
          served: true    # required: expose this version through the API
          storage: true   # required: exactly one version is the storage version
          schema:
            openAPIV3Schema:
              type: object
              properties:
                spec:
                  type: object
                  properties:
                    database:
                      type: string
                    schedule:
                      type: string

Now users can create Backup resources and your operator handles them.
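What "handles them" might look like: a minimal sketch of the translation step a controller could perform, turning a Backup resource into the CronJob it wants to exist. The function name, the container image, and the `--database` argument are hypothetical. Only the `database` and `schedule` fields come from the CRD above.

```python
def cronjob_for_backup(backup: dict) -> dict:
    """Build the desired batch/v1 CronJob manifest for a Backup custom resource."""
    meta = backup["metadata"]
    spec = backup["spec"]
    return {
        "apiVersion": "batch/v1",
        "kind": "CronJob",
        "metadata": {
            "name": f"{meta['name']}-backup",
            "namespace": meta.get("namespace", "default"),
        },
        "spec": {
            "schedule": spec["schedule"],  # cron expression taken from the Backup
            "jobTemplate": {
                "spec": {
                    "template": {
                        "spec": {
                            "restartPolicy": "OnFailure",
                            "containers": [
                                {
                                    "name": "backup",
                                    # Hypothetical image and flag, for illustration only.
                                    "image": "example.com/backup-tool:latest",
                                    "args": ["--database", spec["database"]],
                                }
                            ],
                        }
                    }
                }
            },
        },
    }


if __name__ == "__main__":
    backup = {
        "apiVersion": "database.example.com/v1",
        "kind": "Backup",
        "metadata": {"name": "nightly", "namespace": "prod"},
        "spec": {"database": "orders", "schedule": "0 2 * * *"},
    }
    print(cronjob_for_backup(backup)["metadata"]["name"])  # nightly-backup
```

In a real operator the controller would submit this manifest to the API server and set an owner reference pointing back at the Backup, so that deleting the Backup also garbage-collects the CronJob.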
When to Use an Operator
Use an existing operator when:
- Installing databases (PostgreSQL, MySQL, MongoDB, Redis)
- Setting up Kafka, Elasticsearch
- Managing TLS certificates (cert-manager)
- Setting up monitoring (Prometheus Operator)
Build a custom operator when:
- You have internal stateful systems with complex operational procedures
- You're building a Platform Engineering product
- You want to give developers a simple API that hides infrastructure complexity
Operator Frameworks
If you want to build your own:
- Operator SDK (Go, Ansible, or Helm): part of the Operator Framework, a CNCF project; its Go operators build on Kubebuilder
- Kubebuilder: Go framework maintained under Kubernetes SIGs, generates boilerplate
- Kopf (Python): lighter weight, great for simpler operators
For most teams: use existing community operators. Build custom ones only when you're building an internal platform product.
Resources
- Kubernetes Cheatsheet
- Platform Engineering Guide
- OperatorHub.io — catalog of community operators
- Course: CKA with Practice Tests