What Are Resource Limits and Requests in Kubernetes? (2026)
Kubernetes requests and limits control how much CPU and memory a pod gets. Get them wrong and your pods get throttled, OOMKilled, or evicted. Here's how they actually work.
Resource requests and limits are among the most important — and most misunderstood — settings in Kubernetes. Get them wrong and your pods get throttled, get OOMKilled, or cause other pods to be evicted.
The Simple Version
Request = what your pod is guaranteed to get. Kubernetes uses this to decide which node to schedule the pod on.
Limit = the maximum your pod is allowed to use. If it goes over, bad things happen.
```yaml
resources:
  requests:
    memory: "128Mi"
    cpu: "250m"
  limits:
    memory: "256Mi"
    cpu: "500m"
```
What Happens Without Them
If you don't set requests and limits:
- Kubernetes can't schedule pods intelligently — it has no idea what resources they need
- One runaway pod can eat all CPU on a node and starve other pods
- During a memory pressure event, your pods are the first to be evicted
- Your cluster will feel unpredictably slow
Always set them. No exceptions.
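As a concrete starting point, here is a minimal Pod manifest with both requests and limits set. The pod name and image are placeholders — only the `resources` block matters:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web          # placeholder name
spec:
  containers:
  - name: app
    image: nginx:1.27  # example image
    resources:
      requests:
        memory: "128Mi"
        cpu: "250m"
      limits:
        memory: "256Mi"
        cpu: "500m"
```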
CPU: Requests vs Limits in Detail
CPU is measured in millicores. 1000m = 1 CPU core.
CPU Request:
- Kubernetes places your pod on a node with at least this much CPU available
- If the node has spare CPU, your pod can use more than its request
- This is the scheduling guarantee
CPU Limit:
- Your pod cannot use more CPU than this
- If it tries to use more, the kernel throttles it — it just runs slower
- CPU throttling is silent and hard to detect
```yaml
resources:
  requests:
    cpu: "250m"   # needs at least 0.25 cores to schedule
  limits:
    cpu: "500m"   # throttled if it tries to use more than 0.5 cores
```
Common mistake: Setting CPU limits too low causes mysterious slowness in your app because it's constantly being throttled.
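The throttling mechanism under the hood is the Linux CFS bandwidth controller: a CPU limit becomes a quota of CPU time per scheduling period (100ms by default). A sketch of the arithmetic, with `cfs_quota_us` as an illustrative helper name:

```python
def cfs_quota_us(cpu_limit_millicores: int, period_us: int = 100_000) -> int:
    """Convert a Kubernetes CPU limit (in millicores) to a CFS quota
    in microseconds per period. 1000m = one full core = one full period."""
    return cpu_limit_millicores * period_us // 1000

# A 500m limit allows 50ms of CPU time per 100ms period;
# once the quota is spent, the container waits until the next period.
print(cfs_quota_us(500))   # 50000
print(cfs_quota_us(1000))  # 100000
```

This is why throttling shows up as latency rather than errors: the process is paused until the next period, not killed.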
To check CPU throttling:
```shell
# Look at container_cpu_cfs_throttled_seconds_total in Prometheus
kubectl top pods -n your-namespace
```
Memory: Requests vs Limits in Detail
Memory is measured in bytes. Use Mi (mebibytes) or Gi (gibibytes).
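Note that Mi (mebibytes, powers of 1024) is not the same as M (megabytes, powers of 1000) — mixing them up skews your numbers by about 5%. A quick sanity check:

```python
MI = 1024 ** 2  # one mebibyte (Mi) in bytes
M = 1000 ** 2   # one megabyte (M) in bytes

print(128 * MI)  # 134217728 bytes
print(128 * M)   # 128000000 bytes
```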
Memory Request:
- Scheduling guarantee — pod goes to a node with at least this much free memory
Memory Limit:
- Hard ceiling — if your pod tries to use more, the kernel kills it with OOMKilled
- There is no "soft" memory limit like CPU throttling — it just gets killed
```yaml
resources:
  requests:
    memory: "128Mi"   # needs 128MiB free to schedule
  limits:
    memory: "256Mi"   # OOMKilled if it uses more than 256MiB
```
OOMKilled = Out Of Memory Killed. Your pod restarts. If it happens repeatedly, you get CrashLoopBackOff.
```shell
# Check if a pod was OOMKilled
kubectl describe pod <pod-name> | grep -A5 "OOMKilled"
kubectl get events --field-selector reason=OOMKilling
```
How Kubernetes Uses Requests for Scheduling
When you create a pod, the scheduler finds a node where:
- Node allocatable CPU - sum of all pod CPU requests on that node >= new pod's CPU request
- Node allocatable memory - sum of all pod memory requests on that node >= new pod's memory request
Important: Kubernetes schedules based on requests, not actual usage. A node can look "full" to the scheduler even if pods are using much less.
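The fit check above can be sketched in a few lines. This mirrors the scheduler's logic, not its actual code — `fits` is an illustrative helper, with CPU in millicores and memory in bytes:

```python
def fits(node_allocatable: dict, scheduled_requests: list, pod_request: dict) -> bool:
    """True if the pod's requests fit in the node's remaining allocatable
    capacity, counting requests of already-scheduled pods, not actual usage."""
    for resource, capacity in node_allocatable.items():
        already_requested = sum(p.get(resource, 0) for p in scheduled_requests)
        if capacity - already_requested < pod_request.get(resource, 0):
            return False
    return True

node = {"cpu": 2000, "memory": 4 * 1024**3}       # 2 cores, 4Gi allocatable
running = [{"cpu": 1500, "memory": 2 * 1024**3}]  # requests of pods already on the node
print(fits(node, running, {"cpu": 250, "memory": 256 * 1024**2}))  # True
print(fits(node, running, {"cpu": 600, "memory": 256 * 1024**2}))  # False: only 500m left
```

Note that the second pod is rejected even if the running pod is idle — the scheduler only sees requests.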
```shell
# See how much capacity is allocated on each node
kubectl describe nodes | grep -A5 "Allocated resources"
```
QoS Classes — How Kubernetes Prioritizes Eviction
Kubernetes assigns every pod a Quality of Service class based on its resources:
Guaranteed — requests = limits for all containers. Safest. Evicted last.
```yaml
resources:
  requests:
    memory: "256Mi"
    cpu: "500m"
  limits:
    memory: "256Mi"
    cpu: "500m"
```
Burstable — limits > requests, or only requests set. Middle priority. (Note: if you set only limits, Kubernetes defaults requests to equal limits, which makes the pod Guaranteed.)
```yaml
resources:
  requests:
    memory: "128Mi"
  limits:
    memory: "256Mi"
```
BestEffort — no requests or limits set at all. Evicted first under memory pressure.
During a memory crunch, Kubernetes evicts BestEffort pods first, then Burstable, then Guaranteed. This is why production pods should always be Guaranteed or Burstable.
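The classification rule can be sketched as follows. This is a simplification with `qos_class` as an illustrative helper — the real kubelet logic compares requests and limits per resource (CPU and memory separately) and handles defaulting:

```python
def qos_class(containers: list) -> str:
    """Simplified QoS classification for a pod.
    Each container is a dict like {"requests": {...}, "limits": {...}}."""
    # No container sets any requests or limits -> BestEffort
    if all(not c.get("requests") and not c.get("limits") for c in containers):
        return "BestEffort"
    # Every container has requests equal to limits -> Guaranteed
    if all(c.get("requests") and c.get("requests") == c.get("limits")
           for c in containers):
        return "Guaranteed"
    # Anything in between -> Burstable
    return "Burstable"

print(qos_class([{"requests": {"cpu": "500m"}, "limits": {"cpu": "500m"}}]))  # Guaranteed
print(qos_class([{"requests": {"cpu": "250m"}, "limits": {"cpu": "500m"}}]))  # Burstable
print(qos_class([{}]))  # BestEffort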
Recommended Starting Values
If you don't know where to start, deploy without limits first, monitor actual usage, then set:
- Request = P50 (median) actual usage
- Limit = P99 actual usage with 20% headroom
```shell
# Check actual resource usage with kubectl top
kubectl top pods -n production --sort-by=memory

# Or use Vertical Pod Autoscaler in recommendation mode
kubectl describe vpa <vpa-name>
```
LimitRange — Set Defaults for a Namespace
Instead of setting limits on every pod, set a namespace default:
```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: my-app
spec:
  limits:
  - default:
      cpu: "500m"
      memory: "256Mi"
    defaultRequest:
      cpu: "100m"
      memory: "128Mi"
    type: Container
```
Now any pod deployed without explicit resource settings gets these defaults.
Key takeaways:
- Requests = scheduling guarantee, limits = hard ceiling
- CPU over limit = throttled (slow), memory over limit = killed (OOMKilled)
- No resources set = BestEffort = first to be evicted
- Set requests based on actual P50 usage, limits on P99 + headroom