What Are Resource Limits and Requests in Kubernetes? (2026)

Kubernetes requests and limits control how much CPU and memory a pod gets. Get them wrong and your pods get throttled, OOMKilled, or evicted. Here's how they actually work.

DevOpsBoys · Apr 29, 2026 · 4 min read

Resource requests and limits are one of the most important — and most misunderstood — settings in Kubernetes. Get them wrong and your pods get throttled, OOMKilled, or cause other pods to be evicted.


The Simple Version

Request = what your pod is guaranteed to get. Kubernetes uses this to decide which node to schedule the pod on.

Limit = the maximum your pod is allowed to use. If it goes over, bad things happen.

```yaml
resources:
  requests:
    memory: "128Mi"
    cpu: "250m"
  limits:
    memory: "256Mi"
    cpu: "500m"
```

What Happens Without Them

If you don't set requests and limits:

  • Kubernetes can't schedule pods intelligently — it has no idea what resources they need
  • One runaway pod can eat all CPU on a node and starve other pods
  • During a memory pressure event, your pods are the first to be evicted (no resources puts them in the lowest QoS class, covered below)
  • Your cluster will feel unpredictably slow

Always set them. No exceptions.
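
A quick way to audit an existing cluster: pods with no requests and no limits end up in the BestEffort QoS class (explained later in this article), which is visible right on the pod status:

```bash
# List every pod's QoS class; BestEffort means no requests or limits are set
kubectl get pods -A -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,QOS:.status.qosClass'
```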


CPU: Requests vs Limits in Detail

CPU is measured in millicores: 1000m = 1 CPU core, so 250m is a quarter of a core. Fractional notation also works (cpu: "0.5" is equivalent to "500m").

CPU Request:

  • Kubernetes places your pod on a node with at least this much CPU available
  • If the node has spare CPU, your pod can use more than its request
  • This is the scheduling guarantee

CPU Limit:

  • Your pod cannot use more CPU than this
  • If it tries to use more, the kernel's CFS scheduler throttles it — it keeps running, just slower
  • CPU throttling is silent and hard to detect
```yaml
resources:
  requests:
    cpu: "250m"    # needs at least 0.25 cores to schedule
  limits:
    cpu: "500m"    # will be throttled if it tries to use more than 0.5 cores
```

Common mistake: Setting CPU limits too low causes mysterious slowness in your app because it's constantly being throttled.

To check CPU throttling:

```bash
# kubectl top shows current usage, but it won't reveal throttling;
# for that, watch container_cpu_cfs_throttled_seconds_total in Prometheus
kubectl top pods -n your-namespace
```

Memory: Requests vs Limits in Detail

Memory is measured in bytes. Use Mi (mebibytes) or Gi (gibibytes).

Memory Request:

  • Scheduling guarantee — pod goes to a node with at least this much free memory

Memory Limit:

  • Hard ceiling — if your pod tries to use more, the kernel kills the process and the container shows OOMKilled
  • There is no "soft" memory limit like CPU throttling — it just gets killed
```yaml
resources:
  requests:
    memory: "128Mi"   # needs 128MiB free to schedule
  limits:
    memory: "256Mi"   # will be OOMKilled if it uses more than 256MiB
```

OOMKilled = Out Of Memory Killed. Your pod restarts. If it happens repeatedly, you get CrashLoopBackOff.

```bash
# Check if a pod was OOMKilled
kubectl describe pod <pod-name> | grep -A5 "OOMKilled"
kubectl get events --field-selector reason=OOMKilling
```
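
You can also read the last termination reason straight from the pod status. A jsonpath sketch; adjust the container index if the pod runs more than one container:

```bash
# Prints "OOMKilled" if the first container's last termination was an OOM kill
kubectl get pod <pod-name> -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
```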

How Kubernetes Uses Requests for Scheduling

When you create a pod, the scheduler finds a node where:

  • Allocatable CPU minus the sum of all pod CPU requests already on the node >= the new pod's CPU request
  • Allocatable memory minus the sum of all pod memory requests already on the node >= the new pod's memory request

Important: Kubernetes schedules based on requests, not actual usage. A node can look "full" to the scheduler even if its pods are nearly idle: a node with 4000m allocatable CPU whose pods request 3600m in total cannot accept a new pod requesting 500m, even at near-zero actual usage.

```bash
# See how much capacity is allocated on each node
kubectl describe nodes | grep -A5 "Allocated resources"
```

QoS Classes — How Kubernetes Prioritizes Eviction

Kubernetes assigns every pod a Quality of Service class based on its resources:

Guaranteed — requests = limits for every container (setting only limits also qualifies, since requests then default to the limits). Safest. Evicted last.

```yaml
resources:
  requests:
    memory: "256Mi"
    cpu: "500m"
  limits:
    memory: "256Mi"
    cpu: "500m"
```

Burstable — at least one container has a request or limit set, but the pod doesn't meet the Guaranteed criteria (typically requests lower than limits, or only requests set). Middle priority.

```yaml
resources:
  requests:
    memory: "128Mi"
  limits:
    memory: "256Mi"
```

BestEffort — no requests or limits set at all. Evicted first under memory pressure.

During a memory crunch, Kubernetes evicts BestEffort pods first, then Burstable, then Guaranteed. This is why production pods should always be Guaranteed or Burstable.
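
Kubernetes records the class it assigned in the pod's status, so you can verify it directly:

```bash
# Prints Guaranteed, Burstable, or BestEffort
kubectl get pod <pod-name> -o jsonpath='{.status.qosClass}'
```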


How to Pick the Right Values

If you don't know where to start, deploy without limits first (ideally somewhere non-production), monitor actual usage, then set:

  • Request = P50 (median) actual usage
  • Limit = P99 actual usage with 20% headroom
```bash
# Check actual resource usage with kubectl top
kubectl top pods -n production --sort-by=memory

# Or use Vertical Pod Autoscaler in recommendation mode
kubectl describe vpa <vpa-name>
```
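
If you go the VPA route, a minimal recommendation-only setup looks something like this. A sketch, assuming the VPA components and CRDs are installed in your cluster; my-app is a placeholder Deployment name:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app          # placeholder: the workload to analyze
  updatePolicy:
    updateMode: "Off"     # recommend only; never evict or resize pods
```

With updateMode: "Off", the VPA only publishes recommendations (visible via kubectl describe vpa above) and never touches running pods.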

LimitRange — Set Defaults for a Namespace

Instead of setting limits on every pod, set a namespace default:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: my-app
spec:
  limits:
  - default:
      cpu: "500m"
      memory: "256Mi"
    defaultRequest:
      cpu: "100m"
      memory: "128Mi"
    type: Container
```

Now any pod deployed without explicit resource settings gets these defaults.
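
To double-check what a namespace will apply:

```bash
# Shows the default requests/limits the namespace injects into new containers
kubectl describe limitrange default-limits -n my-app
```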


Key takeaways:

  • Requests = scheduling guarantee, limits = hard ceiling
  • CPU over limit = throttled (slow), memory over limit = killed (OOMKilled)
  • No resources set = BestEffort = first to be evicted
  • Set requests based on actual P50 usage, limits on P99 + headroom