What is a Linux Process and Thread? Explained Simply for DevOps Engineers
Processes, threads, PIDs, and signals: these come up constantly in DevOps work. Here's a clear explanation with real examples you'll actually use.
Understanding processes and threads is fundamental to debugging container issues, understanding Kubernetes resource limits, and knowing why your application behaves the way it does.
What is a Process?
A process is a running instance of a program. When you run nginx, a Python script, or a Java app, the OS creates a process for it.
Each process has:
- A unique PID (Process ID)
- Its own memory space (no other process can read its memory without explicit sharing)
- File descriptors (open files, sockets)
- A parent process (the process that started it)
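As a minimal sketch, a process can inspect these same attributes about itself. The example assumes a Linux system with /proc mounted; the helper name describe_current_process is made up for illustration.

```python
import os

def describe_current_process():
    """Return the PID, parent PID, and open file descriptors of this process."""
    fd_dir = "/proc/self/fd"  # Linux-specific; assumed to be mounted
    if os.path.isdir(fd_dir):
        # The listing also includes the descriptor used to read the directory itself
        open_fds = sorted(int(name) for name in os.listdir(fd_dir))
    else:
        open_fds = []
    return {"pid": os.getpid(), "ppid": os.getppid(), "open_fds": open_fds}

if __name__ == "__main__":
    info = describe_current_process()
    print(f"PID {info['pid']} was started by PID {info['ppid']}")
    print(f"Open file descriptors: {info['open_fds']}")
```

Descriptors 0, 1, and 2 are normally stdin, stdout, and stderr; anything above that is a file or socket the process opened later.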
# List all processes
ps aux
# Your process list looks like:
# USER PID %CPU %MEM VSZ RSS COMMAND
# root 1 0.0 0.0 4236 764 /sbin/init
# root 412 0.0 0.2 12548 4792 nginx: master process
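# www-data 413 0.0 0.1 12980 2048 nginx: worker process
In Docker/Kubernetes, the process with PID 1 is special: it is the process that receives the container's stop signal, signals it has not installed a handler for are ignored (apart from SIGKILL and SIGSTOP sent from the host), and it is responsible for reaping zombie processes.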
What is a Thread?
A thread is a unit of execution within a process. A single process can have multiple threads.
The key difference:
- Processes have separate memory: changing memory in process A doesn't affect process B
- Threads share the same memory within a process: thread A can read/write memory that thread B uses
Process (multi-threaded server):
├── Thread 1 (event loop)
├── Thread 2 (event loop)
└── Thread 3 (event loop)
All share the same memory space
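The difference can be observed directly. This is a small sketch, assuming a Unix-like system where os.fork is available; the counter dictionary and the helper names are illustrative only.

```python
import os
import threading

counter = {"value": 0}

def bump():
    counter["value"] += 1

def run_in_thread():
    """A thread changes the very same object the main thread sees."""
    worker = threading.Thread(target=bump)
    worker.start()
    worker.join()
    return counter["value"]

def run_in_child_process():
    """A forked child changes only its own private copy of memory."""
    pid = os.fork()
    if pid == 0:
        bump()            # happens in the child's address space
        os._exit(0)       # leave the child without running parent cleanup
    os.waitpid(pid, 0)    # wait for the child and read its exit status
    return counter["value"]

if __name__ == "__main__":
    print("after thread:", run_in_thread())
    print("after child process:", run_in_child_process())
```

The value grows after the thread runs but stays the same after the child process runs, because the child incremented its own copy.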
Why use threads?
- They're faster to create than processes (a new thread reuses the existing address space instead of getting its own)
- Good for concurrent I/O operations (web server handling multiple requests)
- Shared memory makes communication between threads simple
Why use multiple processes instead?
- Processes are isolated: a crash in one doesn't take down others
- Better security (memory isolation)
- Reliable use of multiple cores in runtimes whose threads cannot run in parallel, such as CPython with its GIL (the OS itself schedules both threads and processes across cores)
nginx uses a multi-process model: one master process + multiple worker processes. If one worker crashes, the master restarts it without affecting other workers.
Node.js runs your JavaScript on a single thread by default: one event loop handles all requests. That's why CPU-heavy work blocks everything in Node.
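The same effect can be reproduced with Python's asyncio, which also runs handlers on a single-threaded event loop. This is an analogy rather than Node.js code, and the handler names are invented for the sketch.

```python
import asyncio

def crunch(n):
    """Stand-in for CPU-heavy work."""
    return sum(i * i for i in range(n))

async def blocking_handler(log):
    log.append("heavy:start")
    crunch(200_000)              # no await here: nothing else can run meanwhile
    log.append("heavy:end")

async def cooperative_handler(log):
    log.append("heavy:start")
    crunch(100_000)
    await asyncio.sleep(0)       # hand control back to the event loop
    crunch(100_000)
    log.append("heavy:end")

async def quick_handler(log):
    log.append("quick")

async def serve(heavy_handler):
    """Run one heavy and one quick handler on the same event loop."""
    log = []
    await asyncio.gather(heavy_handler(log), quick_handler(log))
    return log

if __name__ == "__main__":
    print(asyncio.run(serve(blocking_handler)))
    print(asyncio.run(serve(cooperative_handler)))
```

With the blocking handler the quick request only runs after the heavy one has finished; with the cooperative handler it gets served in the middle.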
Commands DevOps Engineers Use Daily
# Find a process by name
ps aux | grep nginx
pgrep nginx # just returns PID(s)
# Get detailed info about a process
ps -p 1234 -o pid,ppid,%cpu,%mem,cmd
# See process tree (parent-child relationships)
pstree -p
pstree -p 1234 # tree starting from PID 1234
# Real-time process monitor
top
htop # better than top, install with: apt install htop
# How many threads does a process have?
ps -p 1234 -o nlwp # nlwp = number of lightweight processes (threads)
# List all threads of a process
ps -p 1234 -T
# Or with top:
top -H -p 1234 # -H shows individual threads
Signals: How Processes Communicate
Signals are messages sent to processes. As a DevOps engineer, you need to know:
| Signal | Number | What it does |
|---|---|---|
| SIGTERM | 15 | Graceful shutdown request: the process can catch it and clean up |
| SIGKILL | 9 | Force kill: cannot be caught or ignored |
| SIGHUP | 1 | Hang up: often used to reload config |
| SIGINT | 2 | Interrupt: what Ctrl+C sends |
| SIGSTOP | 19 | Pause process execution; cannot be caught or ignored |
| SIGCONT | 18 | Resume paused process |
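The numbers in the table can be checked from Python's standard signal module. This is a rough sketch to be run from the main thread; the helper names are made up. The values for SIGSTOP and SIGCONT shown above are what Linux uses on x86 and ARM, and other platforms may differ.

```python
import signal

NAMES = ["SIGHUP", "SIGINT", "SIGKILL", "SIGTERM", "SIGCONT", "SIGSTOP"]

def signal_numbers():
    """Map each signal name to its number on the current platform."""
    return {name: int(getattr(signal, name)) for name in NAMES}

def can_install_handler(sig):
    """Return True if this process is allowed to change the action for sig."""
    try:
        previous = signal.signal(sig, signal.SIG_DFL)
    except OSError:
        return False                      # the kernel refuses for SIGKILL and SIGSTOP
    if previous is not None:
        signal.signal(sig, previous)      # put the original action back
    return True

if __name__ == "__main__":
    for name, number in signal_numbers().items():
        catchable = can_install_handler(getattr(signal, name))
        print(f"{name:8} {number:3} catchable={catchable}")
```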
# Send SIGTERM (graceful shutdown)
kill 1234
kill -15 1234
kill -TERM 1234
# Force kill (when SIGTERM doesn't work)
kill -9 1234
kill -KILL 1234
# Kill all processes named nginx
pkill nginx
pkill -9 nginx
# Reload nginx config (SIGHUP to the master process)
kill -HUP "$(pgrep -o nginx)" # -o picks the oldest match, normally the master
nginx -s reload # easier
In Kubernetes: When a pod is terminated, the kubelet asks the container runtime to send SIGTERM to PID 1 of each container. If the process doesn't exit within terminationGracePeriodSeconds (default 30s), it is sent SIGKILL.
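The SIGTERM-then-SIGKILL sequence can be imitated locally. This is only a sketch of the pattern, not how the kubelet is implemented; stop_gracefully is a hypothetical helper, and the sleeping child stands in for a real application.

```python
import subprocess
import sys

def stop_gracefully(proc, grace_seconds):
    """Send SIGTERM, wait up to grace_seconds, then fall back to SIGKILL."""
    proc.terminate()                      # SIGTERM on POSIX
    try:
        proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()                       # SIGKILL on POSIX
        proc.wait()
    return proc.returncode                # negative signal number if a signal ended it

if __name__ == "__main__":
    sleeper = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print("exit code:", stop_gracefully(sleeper, grace_seconds=5))
```

A child that exits on SIGTERM reports -15. One that ignores SIGTERM is killed once the grace period runs out and reports -9, which is the case you want to avoid in production.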
This is why your application must handle SIGTERM properly:
# Python: handle SIGTERM for graceful shutdown
import signal
import sys

def graceful_shutdown(signum, frame):
    print("Received SIGTERM, shutting down gracefully...")
    # Close DB connections, flush buffers, etc.
    sys.exit(0)

signal.signal(signal.SIGTERM, graceful_shutdown)
Zombie Processes
A zombie is a process that has finished but whose exit status hasn't been read by its parent. A zombie still occupies a PID and a process-table entry but uses no CPU and almost no memory.
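To see the lifecycle, the sketch below creates a zombie on purpose and then reaps it. It assumes Linux with /proc mounted; process_state and make_and_reap_zombie are illustrative names.

```python
import os

def process_state(pid):
    """Return the one-letter state from /proc/<pid>/stat, or None if the PID is gone."""
    try:
        with open(f"/proc/{pid}/stat") as stat_file:
            stat = stat_file.read()
    except (FileNotFoundError, ProcessLookupError):
        return None
    # Format is "pid (comm) state ..."; comm may contain spaces, so split after the last ')'
    return stat.rsplit(")", 1)[1].split()[0]

def make_and_reap_zombie():
    pid = os.fork()
    if pid == 0:
        os._exit(0)                                    # child exits immediately
    # Block until the child has exited, but leave its exit status unread
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    before = process_state(pid)                        # child is a zombie at this point
    os.waitpid(pid, 0)                                 # parent reads the exit status
    after = process_state(pid)                         # the entry has been removed
    return before, after

if __name__ == "__main__":
    print(make_and_reap_zombie())
```

The state is Z until the parent calls waitpid; after that the PID no longer exists. A PID 1 that never makes that call is what lets zombies pile up in a container.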
# See zombie processes
ps -eo pid,ppid,stat,cmd | awk '$3 ~ /^Z/'
# STAT column starts with 'Z' for zombies
In Docker containers, this happens when PID 1 doesn't properly reap child processes. Fix: use tini as your init process:
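FROM ubuntu:22.04
RUN apt-get update && apt-get install -y tini
ENTRYPOINT ["/usr/bin/tini", "--"]
CMD ["/app/myapp"]
Kubernetes offers a related option: with shareProcessNamespace: true, all containers in the pod share one PID namespace and the pod's pause container runs as PID 1, where it reaps orphaned zombies:
spec:
  shareProcessNamespace: true
  containers:
    - name: app
      securityContext:
        runAsUser: 1000
Background Processes and Jobs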
# Run a process in background
./long-running-script.sh &
# See background jobs
jobs
# Bring background job to foreground
fg %1
# Send running process to background
Ctrl+Z # suspend
bg # run in background
# Run process that survives terminal close
nohup ./script.sh &
nohup ./script.sh > /var/log/myapp.log 2>&1 &
# Better option: use screen or tmux
screen -S mysession
tmux new -s mysession
Process Resource Usage
# CPU and memory of a specific process
ps -p 1234 -o pid,pcpu,pmem,vsz,rss,cmd
# VSZ = virtual memory (total address space)
# RSS = resident set size (actual RAM used right now)
# File descriptors (useful for debugging "too many open files")
ls /proc/1234/fd | wc -l
lsof -p 1234 | wc -l
# Limit file descriptors
ulimit -n 65535 # set for current shell
# Permanent: /etc/security/limits.conf
In Kubernetes Context
When you set resources.limits.memory: 512Mi, Kubernetes uses cgroups to enforce this. If the container's processes exceed 512Mi, the kernel's OOM killer sends SIGKILL and Kubernetes reports the container as OOMKilled.
When you set resources.limits.cpu: "0.5", Kubernetes throttles the CPU time available to the container's cgroup. It doesn't kill the process; the process is paused whenever it has used up its share of a scheduling period, so it runs slower.
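The arithmetic behind these limits is simple enough to sketch. The example assumes the default 100 ms CFS period and the cgroup v2 cpu.max format ("<quota> <period>", or "max" for unlimited); the function names are hypothetical.

```python
CFS_PERIOD_US = 100_000  # default CFS period (100 ms), assumed unchanged on the node

def cpu_limit_to_quota_us(cpu_limit, period_us=CFS_PERIOD_US):
    """CPU time in microseconds the cgroup may use per scheduling period."""
    return int(cpu_limit * period_us)

def parse_cpu_max(text):
    """Parse cgroup v2 cpu.max content into a CPU count, or None if unlimited."""
    quota, period = text.split()
    return None if quota == "max" else int(quota) / int(period)

def mebibytes_to_bytes(mib):
    """Convert a Mi quantity to bytes, the unit cgroups store."""
    return mib * 1024 * 1024

if __name__ == "__main__":
    print("cpu 0.5 ->", cpu_limit_to_quota_us(0.5), "us per period")
    print("cpu.max '50000 100000' ->", parse_cpu_max("50000 100000"), "CPUs")
    print("512Mi ->", mebibytes_to_bytes(512), "bytes")
```

So a limit of 0.5 CPU lets the container run for 50 ms out of every 100 ms. Inside a container on a cgroup v2 host, the live values are typically found in /sys/fs/cgroup/cpu.max and /sys/fs/cgroup/memory.max, though that path is an assumption about your setup.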
Understanding this helps you interpret:
- OOMKilled: process used more RAM than the memory limit
- High latency: process is CPU throttled (check CPU throttling in Prometheus/Grafana)
These fundamentals show up in real incidents constantly. The better you understand them, the faster you debug.