Build an AI-Powered Incident Report Generator with Claude API (2026)

Writing a postmortem can take 2-3 hours. Here's how to build an AI tool that drafts a structured incident report from Slack logs, metrics summaries, and alert data in minutes.

Postmortems are valuable, but writing them is painful. This tool takes raw incident data (a pasted Slack thread, an alert timeline, a metrics summary) and generates a structured postmortem draft in under 2 minutes.

What We're Building

An incident report generator that:

Accepts: incident description, timeline of events, systems affected, impact
Generates: structured postmortem with root cause analysis, timeline, action items
Outputs: Markdown document ready for your wiki (Confluence, Notion, GitHub)
API: FastAPI endpoint + simple web UI

Setup

bash

mkdir incident-reporter && cd incident-reporter
pip install anthropic fastapi uvicorn python-multipart jinja2

bash

export ANTHROPIC_API_KEY=sk-ant-your-key-here
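The `anthropic.Anthropic()` client used below reads the key from the `ANTHROPIC_API_KEY` environment variable, so a missing export may only surface once the first request is made. A minimal sketch of a fail-fast check follows; `require_api_key` is a hypothetical helper, not part of the Anthropic SDK or of the original project.

```python
import os
from typing import Mapping


def require_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the configured API key, or raise a clear error if it is missing."""
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ANTHROPIC_API_KEY is not set; export it before starting the app"
        )
    return key


# Example with an explicit mapping, so the check can be tried without a real key
print(require_api_key({"ANTHROPIC_API_KEY": "sk-ant-your-key-here"}))
```

Calling `require_api_key()` with no arguments at the top of `main.py` would stop the server from starting without a key, rather than failing on the first report.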

Core Generator

python

# generator.py
import anthropic
from dataclasses import dataclass
from typing import Optional
 
client = anthropic.Anthropic()
 
POSTMORTEM_SYSTEM_PROMPT = """You are a senior SRE writing a blameless postmortem report.
 
Your reports follow this structure:
1. **Incident Summary** — 2-3 sentence overview
2. **Impact** — who was affected, for how long, severity
3. **Timeline** — chronological events with timestamps
4. **Root Cause** — technical root cause, not blame
5. **Contributing Factors** — what made this worse or harder to detect
6. **Resolution** — what fixed it
7. **Action Items** — specific, assigned, time-bound improvements
 
Rules:
- Blameless: focus on systems and processes, not individuals
- Specific: include exact error messages, metrics where provided
- Actionable: every problem identified must have a concrete action item
- Honest: if we don't know the root cause, say so clearly
 
Format as clean Markdown."""
 
@dataclass
class IncidentInput:
    title: str
    severity: str  # P0/P1/P2/P3
    start_time: str
    end_time: str
    affected_services: str
    impact_description: str
    timeline_notes: str
    slack_thread: Optional[str] = None
    alerts_fired: Optional[str] = None
    metrics_summary: Optional[str] = None
    fix_applied: Optional[str] = None
 
def generate_incident_report(incident: IncidentInput) -> str:
    user_content = f"""Generate a postmortem for this incident:
 
**Title:** {incident.title}
**Severity:** {incident.severity}
**Duration:** {incident.start_time} → {incident.end_time}
**Affected Services:** {incident.affected_services}
 
**Impact:**
{incident.impact_description}
 
**Timeline Notes:**
{incident.timeline_notes}
 
{"**Slack Thread:**" + chr(10) + incident.slack_thread if incident.slack_thread else ""}
{"**Alerts That Fired:**" + chr(10) + incident.alerts_fired if incident.alerts_fired else ""}
{"**Metrics Summary:**" + chr(10) + incident.metrics_summary if incident.metrics_summary else ""}
{"**Fix Applied:**" + chr(10) + incident.fix_applied if incident.fix_applied else ""}
 
Generate the complete postmortem following the required structure."""
 
    response = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=2000,
        system=POSTMORTEM_SYSTEM_PROMPT,
        messages=[{"role": "user", "content": user_content}]
    )
    
    return response.content[0].text
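The system prompt asks for seven sections, but nothing in the generator confirms the model actually returned them. As a rough sketch, the draft could be checked before it is handed back to the caller. The helper `missing_sections` and the heading-matching rule are assumptions for illustration; they rely on the model emitting each section as a Markdown heading.

```python
import re
from typing import List

# Section names taken from the structure listed in POSTMORTEM_SYSTEM_PROMPT
REQUIRED_SECTIONS = [
    "Incident Summary",
    "Impact",
    "Timeline",
    "Root Cause",
    "Contributing Factors",
    "Resolution",
    "Action Items",
]


def missing_sections(report: str) -> List[str]:
    """Return the required section names that have no matching Markdown heading."""
    # Collect the text of every heading line (#, ##, ...), lowercased
    headings = [
        match.group(1).strip().lower()
        for match in re.finditer(r"^#{1,6}\s+(.+)$", report, flags=re.MULTILINE)
    ]
    return [
        section
        for section in REQUIRED_SECTIONS
        if not any(section.lower() in heading for heading in headings)
    ]


draft = "# Postmortem\n\n## Incident Summary\n...\n## Impact\n...\n## Timeline\n..."
print(missing_sections(draft))
```

A non-empty result could be logged, or used to decide whether to retry the request.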

FastAPI Backend

python

# main.py
from fastapi import FastAPI, HTTPException
from fastapi.responses import HTMLResponse
from pydantic import BaseModel
from typing import Optional
from generator import generate_incident_report, IncidentInput
 
app = FastAPI(title="Incident Report Generator")
 
class IncidentRequest(BaseModel):
    title: str
    severity: str = "P2"
    start_time: str
    end_time: str
    affected_services: str
    impact_description: str
    timeline_notes: str
    slack_thread: Optional[str] = None
    alerts_fired: Optional[str] = None
    metrics_summary: Optional[str] = None
    fix_applied: Optional[str] = None
 
@app.post("/generate")
def generate_report(request: IncidentRequest):
    # Plain `def` (not `async def`): FastAPI runs it in a worker thread, so the
    # synchronous Anthropic call below does not block the event loop.
    try:
        incident = IncidentInput(**request.model_dump())
        report = generate_incident_report(incident)
        return {"report": report, "word_count": len(report.split())}
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))
 
@app.get("/", response_class=HTMLResponse)
async def web_ui():
    return """
<!DOCTYPE html>
<html>
<head>
    <title>Incident Report Generator</title>
    <style>
        body { font-family: monospace; max-width: 900px; margin: 40px auto; padding: 20px; background: #0f0f0f; color: #e0e0e0; }
        input, textarea, select { width: 100%; padding: 8px; margin: 4px 0 12px; background: #1a1a1a; color: #e0e0e0; border: 1px solid #333; border-radius: 4px; box-sizing: border-box; }
        button { background: #7c3aed; color: white; padding: 12px 24px; border: none; border-radius: 4px; cursor: pointer; font-size: 16px; }
        button:hover { background: #6d28d9; }
        #result { margin-top: 20px; padding: 20px; background: #1a1a1a; border: 1px solid #333; border-radius: 4px; white-space: pre-wrap; display: none; }
        label { color: #888; font-size: 12px; }
        h1 { color: #7c3aed; }
    </style>
</head>
<body>
    <h1>⚡ Incident Report Generator</h1>
    <label>Incident Title</label>
    <input id="title" placeholder="e.g. API Latency Spike — Payment Service Degradation" />
    
    <label>Severity</label>
    <select id="severity">
        <option value="P0">P0 — Complete outage</option>
        <option value="P1">P1 — Major degradation</option>
        <option value="P2" selected>P2 — Partial impact</option>
        <option value="P3">P3 — Minor impact</option>
    </select>
    
    <label>Start Time</label>
    <input id="start_time" placeholder="e.g. 2026-05-07 14:32 IST" />
    
    <label>End Time</label>
    <input id="end_time" placeholder="e.g. 2026-05-07 15:48 IST" />
    
    <label>Affected Services</label>
    <input id="affected_services" placeholder="e.g. payment-api, checkout-service, order-service" />
    
    <label>Impact Description</label>
    <textarea id="impact_description" rows="3" placeholder="e.g. 23% of payment attempts failed. ~4,200 users affected. Estimated ₹8.5L in blocked transactions."></textarea>
    
    <label>Timeline Notes (paste your notes)</label>
    <textarea id="timeline_notes" rows="5" placeholder="14:32 - First alert fired&#10;14:35 - Engineer paged&#10;14:45 - Root cause identified as DB connection pool exhaustion&#10;15:30 - Fix deployed&#10;15:48 - Metrics normalized"></textarea>
    
    <label>Slack Thread (optional — paste key messages)</label>
    <textarea id="slack_thread" rows="4" placeholder="Paste relevant Slack messages from the incident channel"></textarea>
    
    <label>Alerts That Fired (optional)</label>
    <textarea id="alerts_fired" rows="3" placeholder="e.g. PaymentAPILatencyHigh, DBConnectionPoolNearLimit, ErrorRateSpikePayment"></textarea>
    
    <label>Fix Applied</label>
    <textarea id="fix_applied" rows="3" placeholder="e.g. Increased DB connection pool from 50 to 200, added connection timeout of 5s, deployed at 15:30"></textarea>
    
    <br/>
    <button onclick="generate()">Generate Postmortem</button>
    
    <div id="loading" style="display:none; margin-top:20px; color:#7c3aed;">Generating... (~30 seconds)</div>
    <div id="result"></div>
    
    <script>
    async function generate() {
        const data = {
            title: document.getElementById('title').value,
            severity: document.getElementById('severity').value,
            start_time: document.getElementById('start_time').value,
            end_time: document.getElementById('end_time').value,
            affected_services: document.getElementById('affected_services').value,
            impact_description: document.getElementById('impact_description').value,
            timeline_notes: document.getElementById('timeline_notes').value,
            slack_thread: document.getElementById('slack_thread').value,
            alerts_fired: document.getElementById('alerts_fired').value,
            fix_applied: document.getElementById('fix_applied').value,
        };
        
        document.getElementById('loading').style.display = 'block';
        document.getElementById('result').style.display = 'none';
        
        try {
            const response = await fetch('/generate', {
                method: 'POST',
                headers: {'Content-Type': 'application/json'},
                body: JSON.stringify(data)
            });
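            const result = await response.json();
            if (!response.ok) {
                // Surface the server's error detail instead of an empty report
                throw new Error(result.detail || 'Request failed');
            }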
            document.getElementById('result').style.display = 'block';
            document.getElementById('result').textContent = result.report;
        } catch (e) {
            document.getElementById('result').textContent = 'Error: ' + e.message;
            document.getElementById('result').style.display = 'block';
        } finally {
            document.getElementById('loading').style.display = 'none';
        }
    }
    </script>
</body>
</html>
"""
 
if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8080)

Run It

bash

python main.py
# Open http://localhost:8080
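The endpoint can also be called without the web UI. The sketch below builds the POST request with the standard library only. The field names match `IncidentRequest`, the sample values come from the form placeholders, and `build_generate_request` is a hypothetical helper. The send step is left commented out because it needs the server to be running.

```python
import json
import urllib.request


def build_generate_request(base_url: str, incident: dict) -> urllib.request.Request:
    """Build a JSON POST request for the /generate endpoint."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/generate",
        data=json.dumps(incident).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


incident = {
    "title": "API Latency Spike",
    "severity": "P2",
    "start_time": "2026-05-07 14:32 IST",
    "end_time": "2026-05-07 15:48 IST",
    "affected_services": "payment-api, checkout-service, order-service",
    "impact_description": "23% of payment attempts failed. ~4,200 users affected.",
    "timeline_notes": "14:32 - First alert fired\n15:48 - Metrics normalized",
}

request = build_generate_request("http://localhost:8080", incident)

# Uncomment once `python main.py` is running:
# with urllib.request.urlopen(request, timeout=120) as response:
#     print(json.load(response)["report"])
```

The generous timeout is a guess that allows for generation time; adjust it to what you observe.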

Slack Bot Integration

To trigger it directly from Slack, install Slack Bolt (pip install slack-bolt) and register a /postmortem slash command:

python

# slack_bot.py
import os

from slack_bolt import App
from generator import generate_incident_report, IncidentInput
 
slack_app = App(token=os.environ["SLACK_BOT_TOKEN"])
 
@slack_app.command("/postmortem")
def handle_postmortem(ack, body, client):
    ack()
    # Open a modal to collect incident details
    client.views_open(
        trigger_id=body["trigger_id"],
        view={
            "type": "modal",
            "callback_id": "postmortem_submit",
            "title": {"type": "plain_text", "text": "Generate Postmortem"},
            "submit": {"type": "plain_text", "text": "Generate"},
            "blocks": [
                {
                    "type": "input",
                    "block_id": "title",
                    "element": {"type": "plain_text_input", "action_id": "value"},
                    "label": {"type": "plain_text", "text": "Incident Title"}
                },
                # ... more fields
            ]
        }
    )
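The snippet above opens the modal but does not show what happens on submit. Slack delivers submitted input under `view["state"]["values"]`, keyed first by `block_id` and then by `action_id`. The sketch below flattens that structure into a plain dict. `extract_modal_values` is a hypothetical helper, and it assumes every input block uses the `action_id` `"value"`, as the title block does.

```python
from typing import Dict, Optional


def extract_modal_values(view: dict, action_id: str = "value") -> Dict[str, Optional[str]]:
    """Map each block_id in a submitted modal to the text the user entered."""
    values = view["state"]["values"]
    return {
        block_id: actions.get(action_id, {}).get("value")
        for block_id, actions in values.items()
    }


# Trimmed example of the payload shape Slack sends for the title block above
example_view = {
    "state": {
        "values": {
            "title": {"value": {"type": "plain_text_input", "value": "API Latency Spike"}}
        }
    }
}
print(extract_modal_values(example_view))
```

In Bolt, this would typically be called from a handler registered with `@slack_app.view("postmortem_submit")`, and the resulting dict used to build an `IncidentInput` once the remaining fields are added to the modal.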

Deploy to Kubernetes

yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: incident-reporter
  namespace: internal-tools
spec:
  replicas: 1
  selector:
    matchLabels:
      app: incident-reporter
  template:
    metadata:
      labels:
        app: incident-reporter
    spec:
      containers:
      - name: app
        image: your-registry/incident-reporter:latest
        ports:
        - containerPort: 8080
        env:
        - name: ANTHROPIC_API_KEY
          valueFrom:
            secretKeyRef:
              name: anthropic-secret
              key: api-key
        resources:
          limits:
            cpu: 500m
            memory: 512Mi

Sample Output

Given the example values from the form placeholders, the generator produces a draft along these lines (exact wording varies between runs):

markdown

# Postmortem: API Latency Spike — Payment Service Degradation
 
**Severity:** P2 | **Duration:** 76 minutes | **Date:** 2026-05-07
 
## Incident Summary
The payment service experienced elevated latency and a 23% error rate between
14:32 and 15:48 IST on May 7, 2026, due to database connection pool exhaustion.
Approximately 4,200 users were affected, with an estimated ₹8.5L in blocked
payment transactions.
 
## Impact
- **Users affected:** ~4,200
- **Error rate:** 23% of payment attempts failed
- **Duration:** 76 minutes
- **Business impact:** ~₹8.5L in blocked transactions
 
## Timeline
| Time (IST) | Event |
|---|---|
| 14:32 | PaymentAPILatencyHigh alert fired |
| 14:35 | On-call engineer paged |
...
 
## Root Cause
Database connection pool exhausted at 50 connections under increased load...
 
## Action Items
| Action | Owner | Due |
|---|---|---|
| Increase DB connection pool limit to 200 | Platform team | 2026-05-14 |
| Add connection pool monitoring alert | Observability team | 2026-05-14 |
...
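The 76-minute duration in the sample is simple arithmetic on the start and end times (14:32 to 15:48). It may be safer to compute it in code than to trust the model's figure. A minimal sketch, assuming both timestamps use the `YYYY-MM-DD HH:MM` layout from the form placeholders and share the same time zone; `duration_minutes` is a hypothetical helper.

```python
from datetime import datetime


def duration_minutes(start: str, end: str) -> int:
    """Whole minutes between two 'YYYY-MM-DD HH:MM [ZONE]' timestamps."""
    fmt = "%Y-%m-%d %H:%M"
    # Keep only the date and time tokens; a trailing zone label such as "IST"
    # is dropped, which is only valid when both values share the same zone.
    start_dt = datetime.strptime(" ".join(start.split()[:2]), fmt)
    end_dt = datetime.strptime(" ".join(end.split()[:2]), fmt)
    if end_dt < start_dt:
        raise ValueError("end time is before start time")
    return int((end_dt - start_dt).total_seconds() // 60)


print(duration_minutes("2026-05-07 14:32 IST", "2026-05-07 15:48 IST"))  # 76
```

The computed value could be passed into the prompt, or compared against the generated report as a sanity check.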

Writing postmortems that are actually useful is hard. AI handles the structure and boilerplate — you focus on the insights and action items. Full source code available on GitHub.
