What Are Taints and Tolerations in Kubernetes? (2026)
Taints and tolerations control which pods can run on which nodes. Here's how they work, why you need them, and real examples for GPU nodes, spot instances, and dedicated workloads.
Taints and tolerations let you control which pods are allowed to run on which nodes. They're the way Kubernetes says "this node is reserved for special workloads only."
The Simple Explanation
Think of a taint as a "no entry" sign on a node. By default, no pods can run on a tainted node.
A toleration is like a special pass. A pod with the right toleration can ignore the taint and run on that node.
Why You Need Them
Common use cases:
- GPU nodes should only run ML training jobs, not random web servers
- Spot/preemptible instances should only run fault-tolerant workloads
- Nodes with specialized hardware (fast SSD, large memory) should be reserved for specific services
- System components like CNI plugins, logging agents, and GPU operators need to run on every node including "restricted" ones
Without taints, any pod could be scheduled on any node. A big ML training job could end up on your API server's node and starve it of resources.
How to Taint a Node
```bash
# Syntax: kubectl taint nodes <node-name> <key>=<value>:<effect>
kubectl taint nodes gpu-node-1 gpu=true:NoSchedule

# Taint all nodes in a node group (with a label selector)
kubectl taint nodes -l node-type=gpu gpu=true:NoSchedule
```

The three effects:

- NoSchedule — new pods without the toleration won't be scheduled here
- PreferNoSchedule — Kubernetes tries to avoid scheduling here, but will if there are no other options
- NoExecute — existing pods without the toleration are evicted immediately; new ones aren't scheduled
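For NoExecute, a toleration can also bound how long a pod is allowed to stay after the taint appears, via the tolerationSeconds field. A sketch (the maintenance key here is just an illustrative example, not a built-in taint):

```yaml
tolerations:
- key: "maintenance"
  operator: "Equal"
  value: "true"
  effect: "NoExecute"
  tolerationSeconds: 300   # evicted 5 minutes after the taint is applied
```

Without tolerationSeconds, a matching NoExecute toleration keeps the pod bound indefinitely.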
How to Tolerate a Taint (Pod Side)
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ml-training-job
spec:
  tolerations:
  - key: "gpu"
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"
  containers:
  - name: trainer
    image: pytorch/pytorch:latest
    resources:
      limits:
        nvidia.com/gpu: "1"
```

This pod has a toleration that matches the taint gpu=true:NoSchedule, so it can now be scheduled on the GPU node.
Operator types:
- Equal — key AND value must match
- Exists — only the key must match (the value is ignored)
```yaml
# Match any taint with key "gpu", regardless of value
tolerations:
- key: "gpu"
  operator: "Exists"
  effect: "NoSchedule"
```

Real Example 1: GPU Nodes
```bash
# Taint GPU nodes so only GPU workloads run there
kubectl taint nodes \
  gpu-node-1 gpu-node-2 gpu-node-3 \
  nvidia.com/gpu=true:NoSchedule
```

```yaml
# GPU workload — tolerates the taint
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ml-inference
spec:
  selector:
    matchLabels:
      app: ml-inference
  template:
    metadata:
      labels:
        app: ml-inference
    spec:
      tolerations:
      - key: nvidia.com/gpu
        operator: Exists
        effect: NoSchedule
      containers:
      - name: inference
        image: myml:latest
        resources:
          limits:
            nvidia.com/gpu: "1"
```

Without the taint, a regular web app pod could accidentally be scheduled on a GPU node, wasting expensive GPU hardware it would never use.
Real Example 2: Spot Instances
```bash
# Mark spot instances with a taint
kubectl taint nodes spot-node-1 spot=true:NoSchedule
```

```yaml
# Batch workload pod spec that tolerates spot instances
spec:
  tolerations:
  - key: spot
    operator: Equal
    value: "true"
    effect: NoSchedule
  # Also handle sudden eviction gracefully
  terminationGracePeriodSeconds: 30
```

Critical services (databases, payment APIs) don't get this toleration, so they never land on spot instances, which can be reclaimed with as little as two minutes' notice.
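Putting the pieces together, a complete batch Job that opts into spot capacity might look like the following sketch (the Job name and image are placeholders, not from the original setup):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: nightly-report               # hypothetical name
spec:
  backoffLimit: 4                    # retry if a spot reclaim kills the pod
  template:
    spec:
      restartPolicy: OnFailure
      terminationGracePeriodSeconds: 30
      tolerations:
      - key: spot
        operator: Equal
        value: "true"
        effect: NoSchedule
      containers:
      - name: report
        image: myorg/report:latest   # hypothetical image
```

Because the Job controller retries failed pods, losing a spot node mid-run costs time rather than the whole job.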
Real Example 3: System DaemonSets
System-level DaemonSets (Fluentd, node exporters, CNI plugins) need to run on every node — including tainted ones. They use broad tolerations:
```yaml
# Tolerate everything — run on all nodes
tolerations:
- operator: Exists
  effect: NoSchedule
- operator: Exists
  effect: NoExecute
- operator: Exists
  effect: PreferNoSchedule
```

This is why system pods like kube-proxy or aws-node ship with very permissive tolerations.
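The broadest form omits both key and effect entirely: a bare Exists toleration matches every taint, a pattern often seen in system DaemonSets.

```yaml
# An empty key with Exists matches all keys; an empty effect matches all effects
tolerations:
- operator: Exists
```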
Taints vs Node Affinity — What's the Difference?
People often confuse taints with node affinity. They work together but do opposite things:
Taint — repels pods FROM a node. The node says "I don't want random pods."
Node Affinity — attracts pods TO a node. The pod says "I want to run on nodes with this label."
For GPU workloads, you need both:
- Taint GPU nodes so non-GPU pods don't land there
- Use node affinity/selector to make sure GPU pods land on GPU nodes
```yaml
spec:
  tolerations:
  - key: nvidia.com/gpu
    operator: Exists
    effect: NoSchedule
  nodeSelector:
    nvidia.com/gpu: "true"   # also positively select GPU nodes
```

A toleration alone lets the pod run on GPU nodes but doesn't guarantee it lands there; the node selector ensures it does.
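When you need richer matching than nodeSelector allows, the same positive selection can be expressed with node affinity. A sketch, assuming the nodes carry the same nvidia.com/gpu label:

```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: nvidia.com/gpu
          operator: Exists   # match the label key regardless of its value
```

Exists here matches any value of the label, so nodes labeled nvidia.com/gpu=true and nvidia.com/gpu=a100 both qualify.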
Removing a Taint
```bash
# Add a trailing minus to remove the taint
kubectl taint nodes gpu-node-1 gpu=true:NoSchedule-

# Verify
kubectl describe node gpu-node-1 | grep Taints
```

Quick Reference
```bash
# View taints on all nodes
kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints

# View taints on a specific node
kubectl describe node my-node | grep Taints

# Add a taint
kubectl taint nodes my-node key=value:NoSchedule

# Remove a taint
kubectl taint nodes my-node key=value:NoSchedule-

# Add a taint to all nodes with a label
kubectl taint nodes -l node-role=worker dedicated=backend:NoSchedule
```