What is Continuous Profiling? (Explained with Pyroscope — No PhD Required)

Continuous profiling tells you exactly which function is burning your CPU or leaking memory — in production, all the time. Here's what it is, how it works, and how to set it up with Pyroscope.

DevOpsBoys | Jun 12, 2026 | 5 min read

You have metrics. You have logs. You have traces. Your observability stack is technically complete.

And yet when your service slows down at 3 PM every Tuesday, you can't tell which function is causing it. Your CPU graph goes up. Your latency graph goes up. Your logs say nothing useful. You attach a profiler, but that requires redeploying and the issue disappears before you can reproduce it.

Continuous profiling is the missing piece. Here's what it is and why it matters.

The Observability Gap

Think of the three pillars of observability:

  • Metrics tell you that something is wrong (CPU is high, latency is up)
  • Logs tell you what happened (errors, events)
  • Traces tell you where the request went in your system (which services, which APIs)

None of these tell you why your CPU is high at the code level. Which function? Which loop? Which SQL query? Which line?

That's what profiling answers. And "continuous" profiling means you're doing it all the time — in production — without waiting for an incident to manually attach a profiler.

What a Profiler Actually Does

A profiler samples your running process at regular intervals — say, every 10 milliseconds. At each sample, it captures a stack trace: what function is currently executing, and which functions called it.

After collecting thousands of samples, you know: "Function X appeared in 40% of all samples." That means Function X, including everything it called, accounted for roughly 40% of your CPU time. This is called sampling-based profiling, and it's how most modern profilers work: low overhead, statistically accurate.
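
To make the sampling idea concrete, here is a minimal sketch in Python using only the standard library. It is not how Pyroscope is implemented: a background thread reads the calling thread's stack through CPython's `sys._current_frames()` every 10 ms, whereas production profilers rely on OS timers or signals. `busy_loop` and `profile` are hypothetical names invented for the example.

```python
import collections
import sys
import threading
import time


def sample_stacks(target_thread_id, interval_s, stop_event, samples):
    """Record the target thread's call stack every interval_s seconds."""
    while not stop_event.is_set():
        frame = sys._current_frames().get(target_thread_id)
        stack = []
        while frame is not None:
            stack.append(frame.f_code.co_name)
            frame = frame.f_back
        if stack:
            samples.append(tuple(reversed(stack)))  # root first, leaf last
        time.sleep(interval_s)


def busy_loop(duration_s):
    """Hypothetical hot function: burns CPU for duration_s seconds."""
    deadline = time.monotonic() + duration_s
    total = 0
    while time.monotonic() < deadline:
        total += 1
    return total


def profile(func, *args, interval_s=0.01):
    """Run func(*args) while a background thread samples this thread's stack."""
    samples = []
    stop_event = threading.Event()
    sampler = threading.Thread(
        target=sample_stacks,
        args=(threading.get_ident(), interval_s, stop_event, samples),
        daemon=True,
    )
    sampler.start()
    try:
        func(*args)
    finally:
        stop_event.set()
        sampler.join()
    return samples


samples = profile(busy_loop, 0.3)
# Count the samples in which each function appears anywhere on the stack.
appearances = collections.Counter(
    name for stack in samples for name in set(stack)
)
for name, count in appearances.most_common(5):
    print(f"{name}: {100 * count / len(samples):.0f}% of samples")
```

The exact percentages vary from run to run, which is the nature of sampling: more samples give a steadier estimate.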

The result is usually visualized as a flame graph — a horizontal bar chart where each bar is a function call, and the width represents how much time was spent in that function.

                    ┌─────────────────────────────────────────────────────────┐
                    │ HTTP Handler (100%)                                      │
                    ├───────────────────────────────┬─────────────────────────┤
                    │ processOrder() (67%)          │ validateAuth() (33%)    │
                    ├─────────────┬─────────────────┤                         │
                    │ queryDB()   │ calculatePrice()│                         │
                    │ (55%)       │ (12%)           │                         │
                    └─────────────┴─────────────────┴─────────────────────────┘

You look at this and immediately know: queryDB() is where most time is spent. That's your optimization target.
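
As a rough illustration of where those widths come from, the sketch below merges sampled stacks into a tree and prints each function's share. The sample counts are assumptions chosen to reproduce the percentages in the diagram, and `build_tree` and `render` are hypothetical helpers, not part of any profiler's API.

```python
def build_tree(stack_counts):
    """Merge root-first stacks into a nested tree of cumulative sample counts."""
    root = {"name": "root", "samples": 0, "children": {}}
    for stack, count in stack_counts.items():
        root["samples"] += count
        node = root
        for name in stack:
            child = node["children"].setdefault(
                name, {"name": name, "samples": 0, "children": {}}
            )
            child["samples"] += count
            node = child
    return root


def render(node, total, depth=0, lines=None):
    """Return one indented line per function, widest sibling first."""
    lines = [] if lines is None else lines
    for child in sorted(node["children"].values(), key=lambda c: -c["samples"]):
        share = 100 * child["samples"] / total
        lines.append(f"{'  ' * depth}{child['name']} {share:.0f}%")
        render(child, total, depth + 1, lines)
    return lines


# Assumed sample counts per unique stack (100 samples in total).
stack_counts = {
    ("HTTP Handler", "processOrder", "queryDB"): 55,
    ("HTTP Handler", "processOrder", "calculatePrice"): 12,
    ("HTTP Handler", "validateAuth"): 33,
}
tree = build_tree(stack_counts)
lines = render(tree, tree["samples"])
print("\n".join(lines))
```

Each bar's width is simply the number of samples whose stack passes through that function, divided by the total.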

Why "Continuous" Matters

Traditional profiling is reactive. Production slows down → you SSH in → attach a profiler → collect data for 30 seconds → analyze → production returns to normal → you have data from a 30-second window that may or may not represent the real issue.

Continuous profiling runs all the time. When you notice the slowdown 3 hours later, you can go back in time and see exactly what your code was doing at 3 PM. It's like having a DVR for your application's behavior.

This is especially powerful for:

  • Intermittent performance issues that don't reproduce on demand
  • Memory leaks that grow slowly over hours or days
  • Regression detection — compare profiles before and after a deployment
  • Cost optimization — find functions consuming CPU unnecessarily
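
Regression detection, for example, comes down to comparing each function's share of samples across two time windows. Here is a minimal sketch with made-up sample counts; `shares` and `diff_profiles` are hypothetical helpers, and Pyroscope's own comparison view is more capable than this.

```python
def shares(profile):
    """Convert raw sample counts per function into fractions of the total."""
    total = sum(profile.values())
    return {name: count / total for name, count in profile.items()}


def diff_profiles(before, after):
    """Return (function, change in percentage points), biggest increase first."""
    b, a = shares(before), shares(after)
    changes = {
        name: round(100 * (a.get(name, 0.0) - b.get(name, 0.0)), 1)
        for name in set(b) | set(a)
    }
    return sorted(changes.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample counts per function, before and after a deployment.
before = {"queryDB": 550, "calculatePrice": 120, "validateAuth": 330}
after = {"queryDB": 700, "calculatePrice": 100, "validateAuth": 200}

regressions = diff_profiles(before, after)
for name, delta in regressions:
    print(f"{name}: {delta:+.1f} points")
```

Comparing shares rather than raw counts keeps the result meaningful when the two windows contain different numbers of samples.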

Pyroscope: Open Source Continuous Profiling

Pyroscope (now Grafana Pyroscope) is a widely used open source continuous profiling tool. It supports Go, Python, Java, Node.js, Ruby, Rust, and .NET.

It works in two modes:

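  1. Pull mode: a collector scrapes profiling data from your application, the way Prometheus scrapes metrics (in current Grafana Pyroscope setups this collector is typically Grafana Alloy)
  2. Push mode: your application sends profiling data to Pyroscope through a language SDK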

Setting Up Pyroscope with Go

Add the Pyroscope agent to your Go application:

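bash
go get github.com/grafana/pyroscope-go
go
package main

import (
    "log"

    "github.com/grafana/pyroscope-go"
)

func main() {
    // Start returns a profiler handle and an error. Check the error,
    // otherwise a bad configuration silently leaves profiling off.
    profiler, err := pyroscope.Start(pyroscope.Config{
        ApplicationName: "my-service",
        ServerAddress:   "http://pyroscope:4040",
        ProfileTypes: []pyroscope.ProfileType{
            pyroscope.ProfileCPU,          // CPU time
            pyroscope.ProfileAllocObjects, // number of objects allocated
            pyroscope.ProfileAllocSpace,   // bytes allocated
            pyroscope.ProfileInuseObjects, // objects currently live on the heap
            pyroscope.ProfileInuseSpace,   // bytes currently live on the heap
        },
    })
    if err != nil {
        log.Fatalf("starting pyroscope: %v", err)
    }
    defer profiler.Stop() // upload any remaining profile data on shutdown

    // Your application code
}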

That's it. The SDK now continuously samples your application and uploads CPU, memory allocation, and in-use heap profiles to Pyroscope at a short fixed interval (on the order of 10 seconds).

Setting Up Pyroscope with Python

bash
pip install pyroscope-io
python
import pyroscope
 
pyroscope.configure(
    application_name="my-python-service",
    server_address="http://pyroscope:4040",
    tags={
        "version": "1.0.0",
        "env": "production"
    }
)
 
# Your application code runs here
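
Hardcoding the version and environment tags works for a demo, but in practice they usually come from the deployment. The sketch below builds the same keyword arguments from environment variables. The variable names (`PYROSCOPE_URL`, `APP_VERSION`, `APP_ENV`) and the helper `pyroscope_settings` are assumptions made for this example, not names the Pyroscope client reads by itself.

```python
import os


def pyroscope_settings(environ=os.environ):
    """Build keyword arguments for pyroscope.configure() from the environment.

    PYROSCOPE_URL, APP_VERSION and APP_ENV are hypothetical variable names
    chosen for this sketch.
    """
    return {
        "application_name": "my-python-service",
        "server_address": environ.get("PYROSCOPE_URL", "http://pyroscope:4040"),
        "tags": {
            "version": environ.get("APP_VERSION", "unknown"),
            "env": environ.get("APP_ENV", "development"),
        },
    }


# A plain dict stands in for the real environment here.
settings = pyroscope_settings({"APP_VERSION": "1.0.0", "APP_ENV": "production"})
print(settings)
```

With the client installed, the result can be passed straight through as `pyroscope.configure(**settings)`.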

Deploying Pyroscope on Kubernetes

yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pyroscope
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pyroscope
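  template:
    metadata:
      labels:
        app: pyroscope  # must match spec.selector.matchLabels, or the Deployment is rejected
    spec: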
      containers:
      - name: pyroscope
        image: grafana/pyroscope:latest
        ports:
        - containerPort: 4040
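        # Storage settings as published; check these variable names against the
        # Pyroscope configuration reference for the image version you deploy.
        env: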
        - name: PYROSCOPE_STORAGE_TYPE
          value: "s3"
        - name: PYROSCOPE_S3_BUCKET
          value: "my-pyroscope-data"
---
apiVersion: v1
kind: Service
metadata:
  name: pyroscope
spec:
  selector:
    app: pyroscope
  ports:
  - port: 4040
    targetPort: 4040

Reading a Flame Graph

Once Pyroscope is collecting data, you'll see flame graphs in the UI. Here's how to read them:

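  • Width = time: the wider the bar, the more samples (and so the more CPU time) included that function
  • Vertical position = call stack: each bar sits directly below the function that called it, so the entry point is at the top, as in the diagram above (classic flame graphs are drawn the other way up, with the root at the bottom)
  • Look for wide bars at the leaf end of a stack, with little or nothing beneath them: those functions are doing the work themselves rather than waiting on callees
  • Look for surprisingly wide bars that you didn't expect: those are your performance wins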

The color is usually meaningless (just for visual distinction) unless your tool uses color coding to indicate hot paths.
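
When reading widths, it helps to separate a function's total time (it was somewhere on the stack) from its self time (it was the function actually running). Here is a small sketch with hypothetical sample counts; `self_and_total` is an illustrative helper, not a Pyroscope API.

```python
from collections import Counter


def self_and_total(stack_counts):
    """Return (self, total) sample counters from root-first stacks."""
    self_time, total_time = Counter(), Counter()
    for stack, count in stack_counts.items():
        self_time[stack[-1]] += count  # only the leaf frame was executing
        for name in set(stack):        # every frame on the stack was in progress
            total_time[name] += count
    return self_time, total_time


# Hypothetical samples: processOrder also does a little work of its own.
stack_counts = {
    ("handler", "processOrder", "queryDB"): 55,
    ("handler", "processOrder", "calculatePrice"): 10,
    ("handler", "processOrder"): 2,
    ("handler", "validateAuth"): 33,
}
self_time, total_time = self_and_total(stack_counts)
for name, total in total_time.most_common():
    print(f"{name}: total={total} self={self_time[name]}")
```

Here `handler` has the largest total but zero self time, so optimizing it directly would achieve nothing; `queryDB` has the largest self time and is the real target.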

The Four Profile Types You Should Collect

  1. CPU profile — which functions consume CPU cycles. Most common, most useful.
  2. Heap/Memory profile — which functions allocate the most memory (helps find leaks)
  3. Goroutine/Thread profile — how many goroutines exist and what they're doing (helps find goroutine leaks)
  4. Blocking profile — where goroutines are blocked waiting (helps find lock contention)

Don't try to optimize everything at once. Start with CPU, find the biggest function, optimize it, measure again.
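
The list above uses Go terminology, but the heap profile idea carries over to other runtimes. As a hedged illustration, Python's standard library `tracemalloc` module can attribute live memory to the line that allocated it. `build_cache` is a hypothetical function that stands in for a leak by keeping every payload alive.

```python
import tracemalloc


def build_cache(entries, payload_bytes):
    """Hypothetical leaky function: keeps every payload alive in a list."""
    return [bytes(payload_bytes) for _ in range(entries)]


tracemalloc.start()
cache = build_cache(10_000, 100)
snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

# Group live allocations by source line and take the largest.
top = snapshot.statistics("lineno")[0]
location = top.traceback[0]
print(
    f"{location.filename}:{location.lineno} holds "
    f"{top.size / 1024:.0f} KiB in {top.count} blocks"
)
```

The top entry points at the list comprehension inside `build_cache`, which is the same kind of answer an in-use heap profile gives you in Pyroscope.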

Overhead: Is It Safe for Production?

Yes — sampling profilers have very low overhead. Pyroscope's CPU overhead is typically 0.5-2% of total CPU. Memory overhead is minimal. The sampling rate (default: 100Hz for Go) can be adjusted lower if needed.

The trade-off is accuracy vs overhead: lower sampling rate = less overhead = less accurate data. For production, 10-100Hz is the standard range.
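
A rough way to quantify that trade-off is the standard error of a proportion, sqrt(p(1 - p) / n), where p is a function's observed share and n is the number of samples. The sketch below assumes independent samples and a single busy thread, which real workloads only approximate; `share_standard_error` is a hypothetical helper.

```python
import math


def share_standard_error(share, sample_rate_hz, window_s):
    """Standard error of an observed share, treating samples as independent."""
    samples = sample_rate_hz * window_s
    return math.sqrt(share * (1 - share) / samples)


# A function observed at a 40% share over a 10 second window.
for rate in (100, 10):
    error = share_standard_error(0.40, rate, window_s=10)
    print(f"{rate} Hz over 10 s: 40% +/- {100 * error:.1f} points")
```

Under these assumptions the uncertainty is about 1.5 percentage points at 100 Hz and about 4.9 at 10 Hz. Because continuous profiling aggregates over long windows, even low rates accumulate enough samples to be useful.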


Continuous profiling is the last piece of observability most teams add, but it often delivers the highest return. When you can see exactly which function is costing you CPU, you stop guessing and start fixing.

Already using Prometheus + Grafana? Grafana Pyroscope integrates natively and lets you correlate metrics, traces, and profiles in a single view. Start there.
