What Is Zero Trust Security? Explained for DevOps Engineers
Zero Trust means never trust, always verify — even inside your network. Learn the core principles, how to implement it in Kubernetes and AWS, and the tools DevOps teams actually use.
Zero Trust is the security model where no device, user, or service is trusted by default — even if it's inside your own network. Every request must be verified, every connection authenticated, every access decision made based on identity and context.
The Old Model vs Zero Trust
Old perimeter model:
Internet → [Firewall] → Internal Network
                              ↓
                  Everything inside = trusted
                  Anyone on VPN = trusted
Problem: One breach → attacker moves freely inside the network.
Zero Trust:
Any request → Verify identity → Check device posture
            → Evaluate context → Apply least-privilege → Grant access
                         ↑
      Do this every time, even for internal services
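The flow above can be read as one decision function that runs on every request. The following is a minimal Python sketch under stated assumptions: the `Request` fields, the `ALLOWED_ACTIONS` table, and the `MAX_RISK` threshold are hypothetical stand-ins for a real identity provider, device-posture signal, and policy engine.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical least-privilege table: identity -> allowed (action, resource) pairs.
ALLOWED_ACTIONS = {
    "frontend": {("POST", "/api/v1/charge")},
    "api-server": {("GET", "/config")},
}
MAX_RISK = 50  # hypothetical context threshold on a 0-100 scale


@dataclass(frozen=True)
class Request:
    identity: Optional[str]      # verified caller identity, None if unauthenticated
    device_compliant: bool       # device posture signal (managed, patched, ...)
    risk_score: int              # context signal (location, time, anomaly score, ...)
    from_internal_network: bool  # carried on the request but deliberately never consulted
    action: str
    resource: str


def decide(request: Request) -> Tuple[bool, str]:
    """Evaluate a single request. Nothing is remembered between calls."""
    if request.identity is None:
        return False, "identity not verified"
    if not request.device_compliant:
        return False, "device posture check failed"
    if request.risk_score > MAX_RISK:
        return False, "context rejected"
    allowed = ALLOWED_ACTIONS.get(request.identity, set())
    if (request.action, request.resource) not in allowed:
        return False, "not permitted by least-privilege policy"
    return True, "granted"


# An internal, healthy caller is still denied when the policy does not list it.
internal = Request("database", True, 10, True, "POST", "/api/v1/charge")
print(decide(internal))  # (False, 'not permitted by least-privilege policy')
```

Note that `from_internal_network` never appears in `decide`: network location alone grants nothing in this sketch, which is the point of the model.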
Core Principles
- Never trust, always verify — authenticate every request regardless of origin
- Least privilege — grant minimum access required, nothing more
- Assume breach — design systems assuming an attacker is already inside
- Micro-segmentation — limit blast radius of any compromise
- Continuous verification — re-verify, don't assume a valid session stays valid
Zero Trust in Kubernetes
Mutual TLS (mTLS) Between Services
Without Zero Trust, Pod A can freely talk to Pod B inside the cluster. With mTLS, every service connection requires a certificate — the identity of the caller is verified cryptographically.
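# Enforce mTLS for every workload in the production namespace.
# Applying the same resource in Istio's root namespace (istio-system by default) makes it mesh-wide.
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: production
spec:
  mtls:
    mode: STRICT  # reject non-mTLS connections
With this, workloads in the namespace accept only mTLS connections, so a compromised pod cannot call them without a valid mesh-issued certificate.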
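To make "reject non-mTLS connections" concrete, here is a rough Python sketch of what a server-side proxy does during the TLS handshake. It is an illustration only: Istio's sidecar is Envoy, not Python, and the function name and file-path parameters are hypothetical.

```python
import ssl
from typing import Optional


def build_mtls_server_context(
    ca_file: Optional[str] = None,
    cert_file: Optional[str] = None,
    key_file: Optional[str] = None,
) -> ssl.SSLContext:
    """Server-side TLS context that refuses clients without a valid certificate."""
    # Purpose.CLIENT_AUTH yields a context for the server side of a connection.
    context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    # CERT_REQUIRED is the "strict" part: the handshake fails if the client
    # presents no certificate, or one that does not chain to a trusted CA.
    context.verify_mode = ssl.CERT_REQUIRED
    if ca_file:
        # Trust only the CA that signs workload certificates.
        context.load_verify_locations(cafile=ca_file)
    if cert_file and key_file:
        # The server's own identity, presented to the caller.
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context


context = build_mtls_server_context()
print(context.verify_mode == ssl.CERT_REQUIRED)  # True
```

A service mesh takes this work off the application: certificates are issued, rotated, and checked by the sidecars, so application code does not need to manage TLS contexts itself.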
Authorization Policies
mTLS proves WHO is calling. Authorization policies control WHAT they can do:
# Only allow the frontend service to call the payments API
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: payments-access
  namespace: production
spec:
  selector:
    matchLabels:
      app: payments-api
  action: ALLOW
  rules:
    # from and to sit in the same rule, so both must match.
    # As separate list items they would be two rules, and matching either one would allow the request.
    - from:
        - source:
            principals: ["cluster.local/ns/production/sa/frontend"]
      to:
        - operation:
            methods: ["POST"]
            paths: ["/api/v1/charge"]
Even if a database pod is compromised, it cannot call the payments API.
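How a policy's rules are grouped matters. In an Istio ALLOW policy, a request is allowed if any rule matches, and a rule matches only if all of its `from`, `to`, and `when` clauses match. The Python sketch below models that evaluation in simplified form (exact string matching only, no wildcards); the `Rule` class and the `database` service account name are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Rule:
    # An empty list means "field not set", which matches anything.
    principals: List[str] = field(default_factory=list)
    methods: List[str] = field(default_factory=list)
    paths: List[str] = field(default_factory=list)

    def matches(self, principal: str, method: str, path: str) -> bool:
        # Every populated field must match (AND).
        return (
            (not self.principals or principal in self.principals)
            and (not self.methods or method in self.methods)
            and (not self.paths or path in self.paths)
        )


def allowed(rules: List[Rule], principal: str, method: str, path: str) -> bool:
    # One matching rule is enough (OR); no matching rule means deny.
    return any(rule.matches(principal, method, path) for rule in rules)


FRONTEND = "cluster.local/ns/production/sa/frontend"
DATABASE = "cluster.local/ns/production/sa/database"  # hypothetical caller

# from + to in ONE rule: caller and operation must both match.
one_rule = [Rule(principals=[FRONTEND], methods=["POST"], paths=["/api/v1/charge"])]
# from and to as TWO rules: either one is sufficient.
two_rules = [Rule(principals=[FRONTEND]), Rule(methods=["POST"], paths=["/api/v1/charge"])]

print(allowed(one_rule, DATABASE, "POST", "/api/v1/charge"))   # False
print(allowed(two_rules, DATABASE, "POST", "/api/v1/charge"))  # True
```

The second result is the failure mode to watch for: one extra list dash in the YAML turns "frontend may POST to /api/v1/charge" into "anyone may POST to /api/v1/charge".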
Network Policies
# Default deny all ingress/egress
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
---
# Then explicitly allow what's needed
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-api-to-db
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: database
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: api-server
      ports:
        - port: 5432
RBAC: Least Privilege for Service Accounts
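# Service accounts should have minimal permissions
apiVersion: v1
kind: ServiceAccount
metadata:
  name: api-service
  namespace: production
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: api-role
  namespace: production
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get"]                 # only read configmaps
    resourceNames: ["api-config"]  # only this specific one
---
# A Role grants nothing until it is bound to a subject
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: api-role-binding  # illustrative name
  namespace: production
subjects:
  - kind: ServiceAccount
    name: api-service
    namespace: production
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: api-role
---
# No pod should have cluster-admin unless absolutely necessary
Zero Trust in AWS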
IAM: No Standing Permissions
# WRONG: broad permissions that are always on
resource "aws_iam_policy" "app_policy" {
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = "s3:*" # too broad
      Resource = "*"    # too broad
    }]
  })
}
# CORRECT: specific resources, specific actions
resource "aws_iam_policy" "app_policy" {
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["s3:GetObject", "s3:PutObject"]
      Resource = "arn:aws:s3:::my-app-bucket/*"
      Condition = {
        StringEquals = {
          "aws:RequestedRegion" = "us-east-1"
        }
      }
    }]
  })
}
IRSA (IAM Roles for Service Accounts)
EKS pods should never use long-lived credentials. IRSA lets each pod assume an IAM role through its Kubernetes service account and receive temporary credentials:
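apiVersion: v1
kind: ServiceAccount
metadata:
  name: s3-reader
  namespace: production  # must match the namespace in the trust policy
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/s3-reader-role  # placeholder 12-digit account ID
# Trust policy for IRSA
data "aws_iam_policy_document" "assume_role" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRoleWithWebIdentity"]
    principals {
      type        = "Federated"
      identifiers = [aws_iam_openid_connect_provider.eks.arn]
    }
    condition {
      test = "StringEquals"
      # The condition key is the OIDC issuer without the https:// scheme
      variable = "${replace(aws_iam_openid_connect_provider.eks.url, "https://", "")}:sub"
      values   = ["system:serviceaccount:production:s3-reader"]
    }
  }
}
VPC Security: Micro-segmentation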
# Each tier in its own subnet with strict security groups
resource "aws_security_group" "database" {
  ingress {
    from_port       = 5432
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [aws_security_group.application.id] # only from app tier
  }
  # No outbound from the database tier: with no egress block, Terraform removes
  # the allow-all egress rule that AWS adds to new security groups in a VPC.
}
Tools for Zero Trust
| Tool | Use Case |
|---|---|
| Istio / Linkerd | mTLS + authorization policies between services |
| Cilium | eBPF-based network policy with identity-aware enforcement |
| Tailscale | Zero trust remote access for engineers |
| HashiCorp Vault | Dynamic secrets, no long-lived credentials |
| AWS IAM + IRSA | Identity-based access for cloud resources |
| Kyverno / OPA | Policy enforcement in Kubernetes admission |
| Falco | Runtime detection — alerts when behavior deviates |
Zero Trust vs VPN
| | VPN | Zero Trust |
|---|---|---|
| Default trust | Everything on VPN trusted | Nothing trusted by default |
| Blast radius | VPN access = internal access | Each resource protected separately |
| User experience | Often slower; connections can drop | Transparent with tools like Tailscale |
| Fit for cloud | Poor (designed for perimeter) | Natural fit |
VPNs were built for perimeter security. When your infrastructure is multi-cloud and your apps are microservices, the perimeter doesn't exist anymore. Zero Trust is the only model that fits.
Start with these two changes that have the highest impact: enable Network Policies in Kubernetes (default deny) and enforce IRSA instead of shared IAM roles. Everything else builds from there.
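To check the first of those two changes across a cluster, a small audit script can help. The sketch below is one possible approach, not a standard tool: it assumes NetworkPolicy manifests are already loaded as plain Python dictionaries (for example from the `items` list in the output of `kubectl get networkpolicy -A -o json`), and both function names are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def is_default_deny(policy: Dict) -> bool:
    """True if the policy selects every pod and allows no ingress or egress."""
    spec = policy.get("spec", {})
    selects_all_pods = spec.get("podSelector") == {}
    covers_both = {"Ingress", "Egress"} <= set(spec.get("policyTypes", []))
    has_no_allow_rules = not spec.get("ingress") and not spec.get("egress")
    return selects_all_pods and covers_both and has_no_allow_rules


def namespaces_missing_default_deny(
    namespaces: Iterable[str], policies: Iterable[Dict]
) -> List[str]:
    """Return the namespaces that have no default-deny NetworkPolicy, sorted."""
    protected: Set[str] = {
        p["metadata"]["namespace"] for p in policies if is_default_deny(p)
    }
    return sorted(ns for ns in namespaces if ns not in protected)


default_deny = {
    "metadata": {"name": "default-deny-all", "namespace": "production"},
    "spec": {"podSelector": {}, "policyTypes": ["Ingress", "Egress"]},
}
print(namespaces_missing_default_deny(["production", "staging"], [default_deny]))
# ['staging']
```

Keep in mind that a NetworkPolicy only has an effect when the cluster's network plugin enforces it, so a clean audit result still depends on the CNI in use.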