
What is a Kubernetes Operator? Explained Simply (2026)

Kubernetes Operators sound complex, but they solve a simple problem: automating the management of stateful applications. Here's what they are and how they work.

DevOpsBoys | Apr 6, 2026 | 3 min read

You've heard "Kubernetes Operator" thrown around: Prometheus Operator, cert-manager, Strimzi Kafka Operator. But what actually is an operator, and why does Kubernetes need them?


The Problem Operators Solve

Kubernetes is great at managing stateless apps. A Deployment handles rolling updates, restarts crashed pods, and scales up and down. Simple.

But stateful applications are different. Consider PostgreSQL:

  • You can't just kill a primary and start a new one: any writes that haven't reached a replica are lost
  • Adding a replica requires running specific pg_basebackup commands
  • Failover requires promoting a replica and updating connection strings
  • Schema migrations have to run in the right order

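Kubernetes doesn't know any of this. A Deployment just restarts pods, and even a StatefulSet only adds stable pod names and storage. Neither knows how to promote a replica.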

An Operator encodes the human operational knowledge for running a complex application. It automates the things a database administrator would do manually.
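To make "encoding operational knowledge" concrete, here is a minimal Python sketch of the failover step from the list above. The Instance type, the replay_lsn field, and the wording of the steps are illustrative assumptions, not the API of any real operator. A production operator would also fence the old primary and read replication state from PostgreSQL itself.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    role: str        # "primary" or "replica"
    healthy: bool
    replay_lsn: int  # hypothetical: how much write-ahead log this instance has applied


def choose_failover_target(instances):
    """Return the healthy replica that has applied the most WAL, or None."""
    candidates = [i for i in instances if i.role == "replica" and i.healthy]
    if not candidates:
        return None
    return max(candidates, key=lambda i: i.replay_lsn)


def failover_plan(instances):
    """List the steps a DBA would otherwise perform by hand."""
    primary = next((i for i in instances if i.role == "primary"), None)
    if primary is not None and primary.healthy:
        return []  # the primary is fine, nothing to do
    target = choose_failover_target(instances)
    if target is None:
        return ["alert: no healthy replica available"]
    return [
        f"promote {target.name} to primary",
        f"repoint the read-write endpoint at {target.name}",
        f"re-attach the remaining replicas to {target.name}",
    ]


cluster = [
    Instance("pg-1", "primary", healthy=False, replay_lsn=100),
    Instance("pg-2", "replica", healthy=True, replay_lsn=90),
    Instance("pg-3", "replica", healthy=True, replay_lsn=95),
]
for step in failover_plan(cluster):
    print(step)
```

The sketch picks pg-3 because it is the most caught-up healthy replica, which is the kind of rule a human operator would otherwise have to remember during an incident.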


What an Operator Actually Is

An Operator is two things:

  1. Custom Resource Definition (CRD) — extends Kubernetes with new resource types
  2. Controller — a program that watches those custom resources and takes action

Instead of running a PostgreSQL container yourself, you create a custom resource:

yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: my-postgres
spec:
  instances: 3
  storage:
    size: 100Gi
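  # The Postgres version is selected through the container image (tag assumed here)
  imageName: ghcr.io/cloudnative-pg/postgresql:16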
  backup:
    target: prefer-standby
    retentionPolicy: "30d"

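The CloudNativePG operator sees this, then:

  • Creates 3 pods (1 primary, 2 replicas)
  • Sets up streaming replication
  • Creates Services for read-write and read-only connections
  • Applies the backup target and retention settings (the backups themselves also need a storage destination)
  • Handles failover if the primary dies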

You describe what you want. The operator figures out how.


How Controllers Work

The controller loop is simple:

1. Watch for changes to Custom Resources
2. Compare desired state (what you wrote in YAML) vs actual state (what's running)
3. Take action to reconcile the difference
4. Repeat forever

This is called the reconciliation loop — the same pattern Kubernetes uses for built-in resources.
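The four steps can be sketched in a few lines of Python. This is a toy model under stated assumptions: state is just a set of pod names held in memory, the function names are hypothetical, and the loop polls a fixed number of times so that it terminates. Real controllers react to watch events from the Kubernetes API instead of polling.

```python
import time


def reconcile(desired, actual):
    """Compare desired and actual state; return the actions that close the gap."""
    actions = []
    for name in sorted(desired - actual):
        actions.append(("create", name))
    for name in sorted(actual - desired):
        actions.append(("delete", name))
    return actions


def apply(actions, actual):
    """Pretend to carry out the actions by mutating the in-memory 'cluster'."""
    for verb, name in actions:
        if verb == "create":
            actual.add(name)
        else:
            actual.discard(name)


def run(get_desired, actual, rounds=3, interval=0.0):
    """Bounded stand-in for 'repeat forever' so the sketch finishes."""
    history = []
    for _ in range(rounds):
        actions = reconcile(get_desired(), actual)
        apply(actions, actual)
        history.append(actions)
        time.sleep(interval)
    return history


desired = {"my-postgres-1", "my-postgres-2", "my-postgres-3"}
actual = {"my-postgres-1", "stale-pod"}
history = run(lambda: desired, actual)
print(history[0])  # two creates and one delete
print(history[1])  # [] because the states already match
```

The second round produces no actions, which shows the property that matters: reconciling is idempotent, so running it again on a converged system changes nothing.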


Real-World Operators You Already Use

cert-manager: You create a Certificate resource. cert-manager handles the ACME challenges, obtains the TLS cert from Let's Encrypt, stores it as a Kubernetes Secret, and renews it before expiry.

Prometheus Operator: You create ServiceMonitor and PrometheusRule resources. The operator configures Prometheus scraping and alerting automatically, with no manual prometheus.yml editing.

Argo CD: You create Application resources. Argo CD syncs your cluster to match the state in Git.

Strimzi: You create Kafka and KafkaTopic resources. Strimzi manages the full Kafka cluster lifecycle.

Crossplane: You create resources that represent cloud infrastructure, such as an S3 bucket or an RDS instance (the exact kind names depend on the provider you install). Crossplane provisions the real AWS/GCP/Azure resources.


Custom Resource Definition Example

yaml
# Define the new resource type
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: backups.database.example.com
spec:
  group: database.example.com
  names:
    kind: Backup
    plural: backups
  scope: Namespaced
  versions:
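    - name: v1
      served: true    # required: expose this version through the API
      storage: true   # required: persist objects in this version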
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                database:
                  type: string
                schedule:
                  type: string

Now users can create Backup resources and your operator handles them.
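As a rough illustration of what the schema buys you, the Python sketch below checks a Backup object against the same structure the CRD declares. The validate helper and the sample values ("nightly", "orders", the cron string) are hypothetical. The real API server performs far more thorough validation, so treat this only as a model of the idea.

```python
# Mirror of the openAPIV3Schema from the CRD above, written as a plain dict.
BACKUP_SCHEMA = {
    "type": "object",
    "properties": {
        "spec": {
            "type": "object",
            "properties": {
                "database": {"type": "string"},
                "schedule": {"type": "string"},
            },
        }
    },
}

# Only the two schema types used by this CRD are supported in the sketch.
PYTHON_TYPES = {"object": dict, "string": str}


def validate(value, schema, path="$"):
    """Return a list of type errors; an empty list means the value fits."""
    expected = PYTHON_TYPES[schema["type"]]
    if not isinstance(value, expected):
        return [f"{path}: expected {schema['type']}"]
    errors = []
    for key, sub_schema in schema.get("properties", {}).items():
        if key in value:
            errors += validate(value[key], sub_schema, f"{path}.{key}")
    return errors


# A Backup object as a user might submit it (all values are made up).
backup = {
    "apiVersion": "database.example.com/v1",
    "kind": "Backup",
    "metadata": {"name": "nightly"},
    "spec": {"database": "orders", "schedule": "0 2 * * *"},
}

print(validate(backup, BACKUP_SCHEMA))                      # []
print(validate({"spec": {"database": 5}}, BACKUP_SCHEMA))   # one error for $.spec.database
```

Because bad objects are rejected before they are stored, the controller can assume that spec.database and spec.schedule are strings whenever they are present.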


When to Use an Operator

Use an existing operator when:

  • Installing databases (PostgreSQL, MySQL, MongoDB, Redis)
  • Setting up Kafka, Elasticsearch
  • Managing TLS certificates (cert-manager)
  • Setting up monitoring (Prometheus Operator)

Build a custom operator when:

  • You have internal stateful systems with complex operational procedures
  • You're building a Platform Engineering product
  • You want to give developers a simple API that hides infrastructure complexity

Operator Frameworks

If you want to build your own:

  • Operator SDK (Go, Ansible, or Helm): part of the Operator Framework, a CNCF project
  • Kubebuilder: Go framework maintained under Kubernetes SIGs, generates boilerplate
  • Kopf (Python): lighter weight, great for simpler operators

For most teams: use existing community operators. Build custom ones only when you're building an internal platform product.

