What is a Linux Process and Thread? Explained Simply for DevOps Engineers

Processes, threads, PIDs, and signals — these come up constantly in DevOps work. Here's a clear explanation with real examples you'll actually use.

Understanding processes and threads is fundamental to debugging container issues, understanding Kubernetes resource limits, and knowing why your application behaves the way it does.

What is a Process?

A process is a running instance of a program. When you run nginx, a Python script, or a Java app, the OS creates a process for it.

Each process has:

A unique PID (Process ID)
Its own memory space (no other process can read its memory without explicit sharing)
File descriptors (open files, sockets)
A parent process (the process that started it)
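
The attributes above can be inspected from inside any running process. Here is a minimal Python sketch, assuming a Linux system with /proc mounted; the helper name describe_self is made up for illustration.

```python
import os

def describe_self():
    """Return the basic attributes of the current process."""
    return {
        "pid": os.getpid(),    # unique process ID
        "ppid": os.getppid(),  # PID of the parent that started us
        # Each entry in /proc/self/fd is one open file descriptor.
        "open_fds": sorted(int(fd) for fd in os.listdir("/proc/self/fd")),
    }

if __name__ == "__main__":
    info = describe_self()
    print(f"PID {info['pid']}, parent {info['ppid']}, open fds: {info['open_fds']}")
```

File descriptors 0, 1, and 2 are normally stdin, stdout, and stderr; anything beyond those is a file, socket, or pipe the process has opened.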

bash

# List all processes
ps aux
 
# Your process list looks like:
# USER       PID %CPU %MEM    VSZ   RSS COMMAND
# root         1  0.0  0.0   4236   764 /sbin/init
# root       412  0.0  0.2  12548  4792 nginx: master process
# www-data   413  0.0  0.1  12980  2048 nginx: worker process

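In Docker/Kubernetes, the container's PID 1 is special: it receives the signals the runtime sends to the container (such as SIGTERM on docker stop), it gets no default signal handlers from the kernel (so a SIGTERM it does not explicitly handle is ignored), and it is responsible for reaping zombie processes.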

What is a Thread?

A thread is a unit of execution within a process. A single process can have multiple threads.

The key difference:

Processes have separate memory — changing memory in process A doesn't affect process B
Threads share the same memory within a process — thread A can read/write memory that thread B uses

Process (a multi-threaded server):
├── Thread 1 (event loop)
├── Thread 2 (event loop)
└── Thread 3 (event loop)
     All share the same memory space
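
The difference is easy to see in code. The sketch below is illustrative only and assumes a Unix system where os.fork is available: a thread and a forked child process each increment the same counter, and only the thread's write shows up in the parent.

```python
import os
import threading

counter = {"value": 0}

def increment():
    counter["value"] += 1

def run_in_thread():
    """A thread shares our memory, so its write is visible here."""
    t = threading.Thread(target=increment)
    t.start()
    t.join()
    return counter["value"]

def run_in_child_process():
    """A forked child gets its own copy, so its write is not visible here."""
    pid = os.fork()
    if pid == 0:        # child process
        increment()     # modifies the child's private copy only
        os._exit(0)     # exit immediately, skipping the parent's cleanup code
    os.waitpid(pid, 0)  # parent: wait for the child (this also reaps it)
    return counter["value"]

if __name__ == "__main__":
    print("after thread: ", run_in_thread())
    print("after process:", run_in_child_process())
```

The counter goes up after the thread runs and stays unchanged after the child process runs, because the child incremented its own copy of the memory.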

Why use threads?

They're cheaper to create and switch between than processes (no new address space to set up)
Good for concurrent I/O operations (web server handling multiple requests)
Shared memory makes communication between threads simple

Why use multiple processes instead?

Processes are isolated — a crash in one doesn't take down others
Better security (memory isolation)
Better use of multiple cores in runtimes that run only one thread at a time (CPython's GIL, Node.js); the OS itself schedules threads across cores just as it does processes

nginx uses a multi-process model: one master process + multiple worker processes. If one worker crashes, the master restarts it without affecting other workers.
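
The master/worker idea can be sketched in a few lines of Python. This is not nginx's actual implementation, just a toy supervisor under simplifying assumptions: the workers do no real work and exit immediately, the first generation "crashes" with exit code 1, and the names start_worker and supervise are hypothetical.

```python
import os

def start_worker(exit_code):
    """Fork a worker that exits immediately with the given code."""
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)  # child: stand-in for real work, then exit
    return pid

def supervise(num_workers=3, max_restarts=2):
    """Start workers, restart the ones that crash, return the restart count."""
    # First generation: every worker "crashes" with exit code 1.
    workers = {start_worker(1) for _ in range(num_workers)}
    restarts = 0
    while workers:
        pid, status = os.wait()  # block until any child exits
        if pid not in workers:
            continue             # some unrelated child; ignore it
        workers.discard(pid)
        crashed = os.waitstatus_to_exitcode(status) != 0
        if crashed and restarts < max_restarts:
            restarts += 1
            workers.add(start_worker(0))  # replacement exits cleanly
    return restarts

if __name__ == "__main__":
    print("workers restarted:", supervise())
```

Because each worker is a separate process, the master learns about a crash through os.wait() and can start a replacement without touching the other workers.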

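Node.js runs your JavaScript on a single thread by default: one event loop handles all requests, while I/O is offloaded to the OS and a small internal thread pool. That's why CPU-heavy work in a request handler blocks every other request in Node.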
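
The same effect can be reproduced with Python's asyncio, which also runs tasks on a single-threaded event loop. The sketch below is an analogy, not Node.js code: a ticker task tries to wake up every 10 ms, and a blocking time.sleep call stands in for CPU-heavy work.

```python
import asyncio
import time

async def ticker(gaps, stop):
    """Record the time between consecutive wake-ups of this task."""
    last = time.monotonic()
    while not stop.is_set():
        await asyncio.sleep(0.01)
        now = time.monotonic()
        gaps.append(now - last)
        last = now

async def measure_max_gap(block_seconds):
    """Return the longest pause the ticker experienced."""
    gaps, stop = [], asyncio.Event()
    task = asyncio.create_task(ticker(gaps, stop))
    await asyncio.sleep(0.05)  # let the ticker run normally for a while
    time.sleep(block_seconds)  # blocking call: stand-in for CPU-heavy work
    await asyncio.sleep(0.05)
    stop.set()
    await task
    return max(gaps)

if __name__ == "__main__":
    gap = asyncio.run(measure_max_gap(0.3))
    print(f"longest ticker pause: {gap:.3f}s")
```

While the blocking call runs, the ticker cannot wake up at all, so its longest pause is at least as long as the blocking work.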

Commands DevOps Engineers Use Daily

bash

# Find a process by name
ps aux | grep nginx
pgrep nginx       # just returns PID(s)
 
# Get detailed info about a process
ps -p 1234 -o pid,ppid,cmd,%cpu,%mem
 
# See process tree (parent-child relationships)
pstree -p
pstree -p 1234  # tree starting from PID 1234
 
# Real-time process monitor
top
htop            # better than top, install with: apt install htop
 
# How many threads does a process have?
ps -p 1234 -o nlwp  # nlwp = number of lightweight processes (threads)
 
# List all threads of a process
ps -p 1234 -T
 
# Or with top:
top -H -p 1234  # -H shows individual threads
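
The thread count that ps reports comes from the kernel, and on Linux the same information is exposed under /proc. A small Python sketch, assuming /proc is mounted (thread_count and demo are illustrative names):

```python
import os
import threading

def thread_count(pid="self"):
    """Count threads by listing /proc/<pid>/task (one entry per thread)."""
    return len(os.listdir(f"/proc/{pid}/task"))

def demo(extra_threads=3):
    """Return the thread count before and while extra threads are running."""
    before = thread_count()
    stop = threading.Event()
    threads = [threading.Thread(target=stop.wait) for _ in range(extra_threads)]
    for t in threads:
        t.start()
    during = thread_count()
    stop.set()  # let the threads finish
    for t in threads:
        t.join()
    return before, during

if __name__ == "__main__":
    before, during = demo()
    print(f"threads before: {before}, with 3 extra threads: {during}")
```

Passing a numeric PID instead of "self" inspects another process, provided you have permission to read its /proc entry.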

Signals: How Processes Communicate

Signals are messages sent to processes. As a DevOps engineer, you need to know:

Signal	Number	What it does
SIGTERM	15	Graceful shutdown request — process can handle and clean up
SIGKILL	9	Force kill — cannot be caught or ignored
SIGHUP	1	Hang up — often used to reload config
SIGINT	2	Interrupt — what Ctrl+C sends
SIGSTOP	19	Pause process execution
SIGCONT	18	Resume paused process
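
The numbers in the table can be checked on your own machine with Python's signal module. Some of them vary by CPU architecture (SIGSTOP and SIGCONT are 19 and 18 on x86 and ARM, but differ on a few others), so a short sketch like this one is a safer reference than memorized values:

```python
import signal

def signal_table(names=("SIGHUP", "SIGINT", "SIGKILL", "SIGTERM", "SIGCONT", "SIGSTOP")):
    """Map signal names to their numbers on the current platform."""
    return {name: int(getattr(signal, name)) for name in names}

if __name__ == "__main__":
    for name, number in signal_table().items():
        print(f"{name:8} {number}")
```

In scripts, prefer names (kill -TERM) over numbers (kill -15) for the same reason.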

bash

# Send SIGTERM (graceful shutdown)
kill 1234
kill -15 1234
kill -TERM 1234
 
# Force kill (when SIGTERM doesn't work)
kill -9 1234
kill -KILL 1234
 
# Kill all processes named nginx
pkill nginx
pkill -9 nginx
 
# Reload nginx config (SIGHUP goes to the master process only)
kill -HUP "$(pgrep -o nginx)"  # -o = oldest matching process, normally the master
nginx -s reload  # easier

In Kubernetes: when a pod is terminated, the kubelet has the container runtime send SIGTERM to PID 1 of each container. If a container hasn't exited within terminationGracePeriodSeconds (default 30s), its remaining processes are killed with SIGKILL.
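
The same "ask politely, then force" sequence can be sketched with Python's subprocess module. This only imitates the pattern and is not how the kubelet is implemented; stop_with_grace is a hypothetical helper name.

```python
import subprocess

def stop_with_grace(proc, grace_seconds):
    """Send SIGTERM, wait up to grace_seconds, then fall back to SIGKILL."""
    proc.terminate()  # sends SIGTERM on POSIX systems
    try:
        proc.wait(timeout=grace_seconds)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()   # sends SIGKILL, which cannot be caught or ignored
        proc.wait()   # collect the exit status so no zombie is left behind
        return "killed"
```

A process that ignores SIGTERM ends up in the "killed" branch with no chance to clean up, which is what happens to a container that runs out its grace period.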

This is why your application must handle SIGTERM properly:

python

# Python: handle SIGTERM for graceful shutdown
import signal
import sys
 
def graceful_shutdown(signum, frame):
    print("Received SIGTERM, shutting down gracefully...")
    # Close DB connections, flush buffers, etc.
    sys.exit(0)
 
signal.signal(signal.SIGTERM, graceful_shutdown)
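
A common variation, sketched here under the assumption that your application has a main loop, is to have the handler only set a flag and let the loop finish its current unit of work before exiting. The names install_handlers and serve are made up for this example, and the time.sleep call stands in for real work.

```python
import signal
import time

shutdown_requested = False

def request_shutdown(signum, frame):
    """Signal handler: just record that a shutdown was requested."""
    global shutdown_requested
    shutdown_requested = True

def install_handlers():
    # Must be called from the main thread.
    signal.signal(signal.SIGTERM, request_shutdown)

def serve(max_iterations=1000):
    """Hypothetical main loop; returns how many work units were handled."""
    handled = 0
    while not shutdown_requested and handled < max_iterations:
        time.sleep(0.01)  # stand-in for handling one request or job
        handled += 1
    # Close DB connections, flush buffers, etc. here before returning.
    return handled

if __name__ == "__main__":
    install_handlers()
    print("handled", serve(max_iterations=5), "work units")
```

The design choice is that cleanup runs in normal program flow instead of inside the signal handler, so in-flight work is never interrupted halfway.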

Zombie Processes

A zombie is a process that has finished but its exit status hasn't been read by its parent. Zombies take up a PID slot but no CPU/memory.

bash

# See zombie processes
ps -eo pid,ppid,stat,cmd | awk '$3 ~ /^Z/'
# The STAT column starts with 'Z' for zombies; PPID is the parent that has not reaped them
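
To see the zombie life cycle end to end, the Python sketch below forks a child that exits immediately, observes its state in /proc, and then reaps it. It assumes Linux with /proc mounted; process_state and make_zombie_then_reap are illustrative names.

```python
import os
import time

def process_state(pid):
    """Return the one-letter state from /proc/<pid>/stat (e.g. 'S', 'R', 'Z')."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # The command name sits in parentheses and may contain spaces,
    # so split after the last ')' to find the state field.
    return data.rsplit(")", 1)[1].split()[0]

def make_zombie_then_reap():
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child exits immediately
    # Parent: the child stays a zombie until we call waitpid().
    deadline = time.monotonic() + 2.0
    state = process_state(pid)
    while state != "Z" and time.monotonic() < deadline:
        time.sleep(0.01)
        state = process_state(pid)
    os.waitpid(pid, 0)  # reading the exit status reaps the zombie
    still_exists = os.path.exists(f"/proc/{pid}")
    return state, still_exists

if __name__ == "__main__":
    state, still_exists = make_zombie_then_reap()
    print(f"state before reaping: {state}, exists after reaping: {still_exists}")
```

The waitpid() call is the "reaping" step: a parent that never makes it leaves the zombie in the process table for as long as the parent lives.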

In Docker containers, this happens when PID 1 doesn't properly reap child processes. Fix: use tini as your init process:

dockerfile

FROM ubuntu:22.04
RUN apt-get update && apt-get install -y --no-install-recommends tini && rm -rf /var/lib/apt/lists/*
ENTRYPOINT ["/usr/bin/tini", "--"]
CMD ["/app/myapp"]

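Kubernetes offers an alternative: with shareProcessNamespace: true, all containers in the pod share one PID namespace, and the pod's pause container runs as PID 1 and reaps orphaned zombies. The trade-off is that your application is no longer PID 1 and the containers can see each other's processes.

yaml

spec:
  shareProcessNamespace: true
  containers:
    - name: app
      securityContext:
        runAsUser: 1000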

Background Processes and Jobs

bash

# Run a process in background
./long-running-script.sh &
 
# See background jobs
jobs
 
# Bring background job to foreground
fg %1
 
# Send a running foreground process to the background
# (press Ctrl+Z to suspend it first, then:)
bg      # resume the suspended job in the background
 
# Run process that survives terminal close
nohup ./script.sh &
nohup ./script.sh > /var/log/myapp.log 2>&1 &
 
# Better option: use screen or tmux
screen -S mysession
tmux new -s mysession

Process Resource Usage

bash

# CPU and memory of a specific process
ps -p 1234 -o pid,pcpu,pmem,vsz,rss,cmd
 
# VSZ = virtual memory (total address space)
# RSS = resident set size (actual RAM used right now)
 
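# File descriptors (useful for debugging "too many open files")
ls /proc/1234/fd | wc -l   # exact count of open file descriptors
lsof -p 1234 | wc -l       # higher: includes a header line plus cwd, binaries, and mapped libraries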
 
# Limit file descriptors
ulimit -n 65535  # set for current shell
# Permanent: /etc/security/limits.conf
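
The same numbers can be read programmatically, which is handy in health checks or debug endpoints. A minimal Python sketch, assuming Linux with /proc mounted (the function names are illustrative):

```python
import os
import resource

def read_status_fields(pid="self", wanted=("VmSize", "VmRSS", "Threads")):
    """Pick selected fields out of /proc/<pid>/status."""
    fields = {}
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            key, _, value = line.partition(":")
            if key in wanted:
                fields[key] = value.strip()
    return fields

def fd_usage():
    """Open descriptors and descriptor limits for the current process."""
    open_fds = len(os.listdir("/proc/self/fd"))
    soft_limit, hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    return open_fds, soft_limit, hard_limit

if __name__ == "__main__":
    print(read_status_fields())  # VmSize matches VSZ, VmRSS matches RSS
    print("open fds / soft limit / hard limit:", fd_usage())
```

When the open count approaches the soft limit, "too many open files" errors are close; for another process, its limits are listed in /proc/<pid>/limits.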

In Kubernetes Context

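When you set resources.limits.memory: 512Mi, Kubernetes configures the container's cgroup to enforce it. If the processes in the container together use more than 512Mi, the kernel's OOM killer sends SIGKILL to a process in that cgroup, and the container is reported as OOMKilled.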

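When you set resources.limits.cpu: "0.5", Kubernetes sets a CFS quota on the container's cgroup: 50ms of CPU time per 100ms scheduling period. Once the quota is used up, the container's threads are paused until the next period. Nothing is killed, but requests stall while the container is throttled.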
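
The arithmetic behind these limits is simple enough to sketch in Python. The helpers below are illustrative, handle only the common quantity suffixes, and assume the default 100ms CFS period:

```python
def parse_cpu(quantity):
    """Convert a Kubernetes CPU quantity ('0.5', '500m', '2') to cores."""
    if quantity.endswith("m"):  # millicores: 500m = 0.5 core
        return int(quantity[:-1]) / 1000
    return float(quantity)

def cfs_quota_us(quantity, period_us=100_000):
    """CPU time in microseconds the cgroup may use per scheduling period."""
    return round(parse_cpu(quantity) * period_us)

MEMORY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30,
                   "k": 10**3, "M": 10**6, "G": 10**9}

def parse_memory(quantity):
    """Convert a Kubernetes memory quantity ('512Mi', '1G') to bytes."""
    # Try two-letter binary suffixes before one-letter decimal ones.
    for suffix in sorted(MEMORY_SUFFIXES, key=len, reverse=True):
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * MEMORY_SUFFIXES[suffix]
    return int(quantity)  # plain bytes

if __name__ == "__main__":
    print(cfs_quota_us("0.5"), "us of CPU per 100000 us period")
    print(parse_memory("512Mi"), "bytes")
```

On cgroup v1 the quota appears in cpu.cfs_quota_us; on cgroup v2 it is the first number in cpu.max. Note that 512Mi (binary) and 512M (decimal) are different amounts of memory.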

Understanding this helps you interpret:

OOMKilled → process used more RAM than the memory limit
High latency → process may be CPU throttled (check the container_cpu_cfs_throttled_periods_total metric in Prometheus/Grafana)
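
Throttling can also be checked straight from the cgroup, without Prometheus. The Python sketch below parses the contents of a cgroup v2 cpu.stat file; the path /sys/fs/cgroup/cpu.stat inside the container is an assumption about your setup, and the sample values are invented for illustration.

```python
def parse_cpu_stat(text):
    """Parse 'key value' lines from a cgroup cpu.stat file into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats

def throttled_ratio(stats):
    """Fraction of scheduling periods in which the cgroup was throttled."""
    periods = stats.get("nr_periods", 0)
    return stats.get("nr_throttled", 0) / periods if periods else 0.0

# Invented sample in the cgroup v2 format; on a real system, read the file:
#   text = open("/sys/fs/cgroup/cpu.stat").read()   # path is an assumption
SAMPLE = """usage_usec 1200000
user_usec 900000
system_usec 300000
nr_periods 200
nr_throttled 50
throttled_usec 730000
"""

if __name__ == "__main__":
    stats = parse_cpu_stat(SAMPLE)
    print(f"throttled in {throttled_ratio(stats):.0%} of periods")
```

A ratio that keeps climbing while latency is high is a strong hint that the CPU limit, not the application code, is the bottleneck.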

These fundamentals show up in real incidents constantly. The better you understand them, the faster you debug.
