
Build an AI-Powered Incident Report Generator with Claude API (2026)

Writing a postmortem typically takes 2-3 hours. Here's how to build an AI tool that drafts a structured incident report from Slack logs, a metrics summary, and alert data in minutes.

DevOpsBoys | May 7, 2026 | 6 min read

Postmortems are valuable but writing them is painful. This tool takes raw incident data — Slack thread, alert timeline, metrics — and generates a structured postmortem draft in under 2 minutes.


What We're Building

An incident report generator that:

  • Accepts: incident description, timeline of events, systems affected, impact
  • Generates: structured postmortem with root cause analysis, timeline, action items
  • Outputs: Markdown document ready for your wiki (Confluence, Notion, GitHub)
  • API: FastAPI endpoint + simple web UI

Setup

bash
mkdir incident-reporter && cd incident-reporter
pip install anthropic fastapi uvicorn python-multipart jinja2
bash
export ANTHROPIC_API_KEY=sk-ant-your-key-here
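
The Anthropic client reads ANTHROPIC_API_KEY from the environment, so a missing key only surfaces on the first API call. A minimal sketch of a fail-fast check you could run at startup; the helper name require_api_key is hypothetical and not part of the original project:

```python
import os
from typing import Mapping


def require_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the Anthropic API key from env, or raise a clear error if it is missing."""
    # require_api_key is an illustrative helper, not part of the anthropic SDK
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError("ANTHROPIC_API_KEY is not set; export it before starting the app")
    return key
```

Calling it once near the top of main.py would turn a confusing mid-request failure into an immediate startup error.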

Core Generator

python
# generator.py
import anthropic
from dataclasses import dataclass
from typing import Optional
 
client = anthropic.Anthropic()
 
POSTMORTEM_SYSTEM_PROMPT = """You are a senior SRE writing a blameless postmortem report.
 
Your reports follow this structure:
1. **Incident Summary** — 2-3 sentence overview
2. **Impact** — who was affected, for how long, severity
3. **Timeline** — chronological events with timestamps
4. **Root Cause** — technical root cause, not blame
5. **Contributing Factors** — what made this worse or harder to detect
6. **Resolution** — what fixed it
7. **Action Items** — specific, assigned, time-bound improvements
 
Rules:
- Blameless: focus on systems and processes, not individuals
- Specific: include exact error messages, metrics where provided
- Actionable: every problem identified must have a concrete action item
- Honest: if we don't know the root cause, say so clearly
 
Format as clean Markdown."""
 
@dataclass
class IncidentInput:
    title: str
    severity: str  # P0/P1/P2/P3
    start_time: str
    end_time: str
    affected_services: str
    impact_description: str
    timeline_notes: str
    slack_thread: Optional[str] = None
    alerts_fired: Optional[str] = None
    metrics_summary: Optional[str] = None
    fix_applied: Optional[str] = None
 
def generate_incident_report(incident: IncidentInput) -> str:
    user_content = f"""Generate a postmortem for this incident:
 
**Title:** {incident.title}
**Severity:** {incident.severity}
**Duration:** {incident.start_time} → {incident.end_time}
**Affected Services:** {incident.affected_services}
 
**Impact:**
{incident.impact_description}
 
**Timeline Notes:**
{incident.timeline_notes}
 
{"**Slack Thread:**" + chr(10) + incident.slack_thread if incident.slack_thread else ""}
{"**Alerts That Fired:**" + chr(10) + incident.alerts_fired if incident.alerts_fired else ""}
{"**Metrics Summary:**" + chr(10) + incident.metrics_summary if incident.metrics_summary else ""}
{"**Fix Applied:**" + chr(10) + incident.fix_applied if incident.fix_applied else ""}
 
Generate the complete postmortem following the required structure."""
 
    response = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=2000,
        system=POSTMORTEM_SYSTEM_PROMPT,
        messages=[{"role": "user", "content": user_content}]
    )
    
    return response.content[0].text
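
Reading response.content[0].text assumes the first content block is text and that the reply was not cut off by max_tokens. A hedged sketch of a slightly more defensive version; extract_text and was_truncated are hypothetical helper names, and the SimpleNamespace objects below are stand-ins shaped like the SDK's response, used only so the example runs without an API call:

```python
from types import SimpleNamespace
from typing import Iterable


def extract_text(blocks: Iterable) -> str:
    """Join the text of every text-type content block, skipping other block types."""
    parts = [b.text for b in blocks if getattr(b, "type", None) == "text"]
    if not parts:
        raise ValueError("response contained no text blocks")
    return "\n".join(parts).strip()


def was_truncated(response) -> bool:
    """True when the model stopped because it hit the max_tokens limit."""
    return getattr(response, "stop_reason", None) == "max_tokens"


# Stand-in for a real API response, for illustration only
fake_response = SimpleNamespace(
    stop_reason="max_tokens",
    content=[
        SimpleNamespace(type="text", text="# Postmortem: Example"),
        SimpleNamespace(type="text", text="## Incident Summary"),
    ],
)

report = extract_text(fake_response.content)
if was_truncated(fake_response):
    report += "\n\n> Note: draft was cut off at the token limit; consider raising max_tokens."
```

With a 2000-token limit, a long incident with a pasted Slack thread could plausibly produce a truncated draft, so flagging that case in the output seems safer than returning a report that silently ends mid-section.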

FastAPI Backend

python
# main.py
from fastapi import FastAPI, HTTPException
from fastapi.responses import HTMLResponse
from pydantic import BaseModel
from typing import Optional
from generator import generate_incident_report, IncidentInput
 
app = FastAPI(title="Incident Report Generator")
 
class IncidentRequest(BaseModel):
    title: str
    severity: str = "P2"
    start_time: str
    end_time: str
    affected_services: str
    impact_description: str
    timeline_notes: str
    slack_thread: Optional[str] = None
    alerts_fired: Optional[str] = None
    metrics_summary: Optional[str] = None
    fix_applied: Optional[str] = None
 
@app.post("/generate")
def generate_report(request: IncidentRequest):
    # Plain def: the Anthropic call blocks, so FastAPI runs this handler in its threadpool
    try:
        incident = IncidentInput(**request.model_dump())
        report = generate_incident_report(incident)
        return {"report": report, "word_count": len(report.split())}
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))
 
@app.get("/", response_class=HTMLResponse)
async def web_ui():
    return """
<!DOCTYPE html>
<html>
<head>
    <title>Incident Report Generator</title>
    <style>
        body { font-family: monospace; max-width: 900px; margin: 40px auto; padding: 20px; background: #0f0f0f; color: #e0e0e0; }
        input, textarea, select { width: 100%; padding: 8px; margin: 4px 0 12px; background: #1a1a1a; color: #e0e0e0; border: 1px solid #333; border-radius: 4px; box-sizing: border-box; }
        button { background: #7c3aed; color: white; padding: 12px 24px; border: none; border-radius: 4px; cursor: pointer; font-size: 16px; }
        button:hover { background: #6d28d9; }
        #result { margin-top: 20px; padding: 20px; background: #1a1a1a; border: 1px solid #333; border-radius: 4px; white-space: pre-wrap; display: none; }
        label { color: #888; font-size: 12px; }
        h1 { color: #7c3aed; }
    </style>
</head>
<body>
    <h1>⚔ Incident Report Generator</h1>
    <label>Incident Title</label>
    <input id="title" placeholder="e.g. API Latency Spike — Payment Service Degradation" />
    
    <label>Severity</label>
    <select id="severity">
        <option value="P0">P0 — Complete outage</option>
        <option value="P1">P1 — Major degradation</option>
        <option value="P2" selected>P2 — Partial impact</option>
        <option value="P3">P3 — Minor impact</option>
    </select>
    
    <label>Start Time</label>
    <input id="start_time" placeholder="e.g. 2026-05-07 14:32 IST" />
    
    <label>End Time</label>
    <input id="end_time" placeholder="e.g. 2026-05-07 15:48 IST" />
    
    <label>Affected Services</label>
    <input id="affected_services" placeholder="e.g. payment-api, checkout-service, order-service" />
    
    <label>Impact Description</label>
    <textarea id="impact_description" rows="3" placeholder="e.g. 23% of payment attempts failed. ~4,200 users affected. Estimated ₹8.5L in blocked transactions."></textarea>
    
    <label>Timeline Notes (paste your notes)</label>
    <textarea id="timeline_notes" rows="5" placeholder="14:32 - First alert fired&#10;14:35 - Engineer paged&#10;14:45 - Root cause identified as DB connection pool exhaustion&#10;15:30 - Fix deployed&#10;15:48 - Metrics normalized"></textarea>
    
    <label>Slack Thread (optional — paste key messages)</label>
    <textarea id="slack_thread" rows="4" placeholder="Paste relevant Slack messages from the incident channel"></textarea>
    
    <label>Alerts That Fired (optional)</label>
    <textarea id="alerts_fired" rows="3" placeholder="e.g. PaymentAPILatencyHigh, DBConnectionPoolNearLimit, ErrorRateSpikePayment"></textarea>
    
    <label>Fix Applied</label>
    <textarea id="fix_applied" rows="3" placeholder="e.g. Increased DB connection pool from 50 to 200, added connection timeout of 5s, deployed at 15:30"></textarea>
    
    <br/>
    <button onclick="generate()">Generate Postmortem</button>
    
    <div id="loading" style="display:none; margin-top:20px; color:#7c3aed;">Generating... (~30 seconds)</div>
    <div id="result"></div>
    
    <script>
    async function generate() {
        const data = {
            title: document.getElementById('title').value,
            severity: document.getElementById('severity').value,
            start_time: document.getElementById('start_time').value,
            end_time: document.getElementById('end_time').value,
            affected_services: document.getElementById('affected_services').value,
            impact_description: document.getElementById('impact_description').value,
            timeline_notes: document.getElementById('timeline_notes').value,
            slack_thread: document.getElementById('slack_thread').value,
            alerts_fired: document.getElementById('alerts_fired').value,
            fix_applied: document.getElementById('fix_applied').value,
        };
        
        document.getElementById('loading').style.display = 'block';
        document.getElementById('result').style.display = 'none';
        
        try {
            const response = await fetch('/generate', {
                method: 'POST',
                headers: {'Content-Type': 'application/json'},
                body: JSON.stringify(data)
            });
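            const result = await response.json();
            if (!response.ok) {
                // Surface the server-side error instead of showing an empty result
                const detail = typeof result.detail === 'string' ? result.detail : 'Request failed (' + response.status + ')';
                throw new Error(detail);
            }
            document.getElementById('result').style.display = 'block';
            document.getElementById('result').textContent = result.report;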
        } catch (e) {
            document.getElementById('result').textContent = 'Error: ' + e.message;
            document.getElementById('result').style.display = 'block';
        } finally {
            document.getElementById('loading').style.display = 'none';
        }
    }
    </script>
</body>
</html>
"""
 
if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8080)

Run It

bash
python main.py
# Open http://localhost:8080
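
The /generate endpoint can also be called from a script instead of the web UI. A minimal sketch using only the standard library, assuming the server is running locally on port 8080 as above; build_generate_request and request_report are hypothetical helper names, and the example values are taken from the form placeholders:

```python
import json
import urllib.request


def build_generate_request(incident: dict, base_url: str = "http://localhost:8080") -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the /generate endpoint."""
    body = json.dumps(incident).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def request_report(incident: dict, timeout: float = 120.0) -> str:
    """Send the request and return the generated Markdown report."""
    with urllib.request.urlopen(build_generate_request(incident), timeout=timeout) as resp:
        return json.load(resp)["report"]


# Required fields of IncidentRequest; optional fields are omitted here
example_incident = {
    "title": "API Latency Spike - Payment Service Degradation",
    "severity": "P2",
    "start_time": "2026-05-07 14:32 IST",
    "end_time": "2026-05-07 15:48 IST",
    "affected_services": "payment-api, checkout-service, order-service",
    "impact_description": "23% of payment attempts failed. ~4,200 users affected.",
    "timeline_notes": "14:32 - First alert fired\n14:35 - Engineer paged",
}

# With the server running: print(request_report(example_incident))
```

The generous timeout is a guess based on the roughly 30 seconds the UI tells users to expect; adjust it to what you observe.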

Slack Bot Integration

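To trigger it directly from Slack, register a /postmortem slash command with Bolt for Python (pip install slack_bolt). The handler below opens a modal to collect the incident details: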

python
# slack_bot.py
import os

from slack_bolt import App
from generator import generate_incident_report, IncidentInput
 
slack_app = App(token=os.environ["SLACK_BOT_TOKEN"])
 
@slack_app.command("/postmortem")
def handle_postmortem(ack, body, client):
    ack()
    # Open a modal to collect incident details
    client.views_open(
        trigger_id=body["trigger_id"],
        view={
            "type": "modal",
            "callback_id": "postmortem_submit",
            "title": {"type": "plain_text", "text": "Generate Postmortem"},
            "submit": {"type": "plain_text", "text": "Generate"},
            "blocks": [
                {
                    "type": "input",
                    "block_id": "title",
                    "element": {"type": "plain_text_input", "action_id": "value"},
                    "label": {"type": "plain_text", "text": "Incident Title"}
                },
                # ... more fields
            ]
        }
    )
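
When the modal is submitted, Slack sends the entered values nested under view["state"]["values"], keyed by block_id and then action_id. A hedged sketch of flattening that payload into a plain dict; extract_modal_values is a hypothetical helper, and the fake_view payload is hand-written to mirror the single "title" block defined above:

```python
def extract_modal_values(view: dict) -> dict:
    """Flatten view["state"]["values"] into {block_id: text} for plain_text_input elements."""
    values = {}
    for block_id, actions in view["state"]["values"].items():
        for action in actions.values():
            # Optional inputs left empty arrive as None; normalize them to ""
            values[block_id] = action.get("value") or ""
    return values


# Hand-written stand-in for the view payload of a submitted modal
fake_view = {
    "callback_id": "postmortem_submit",
    "state": {
        "values": {
            "title": {"value": {"type": "plain_text_input", "value": "API Latency Spike"}},
        }
    },
}

fields = extract_modal_values(fake_view)
```

In Bolt, a function registered with @slack_app.view("postmortem_submit") receives this view object, so one plausible wiring is to call ack() there, build an IncidentInput from the extracted fields, and post the generated report back to a channel. Generation takes longer than Slack's acknowledgement window, so the ack should come before the API call.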

Deploy to Kubernetes

yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: incident-reporter
  namespace: internal-tools
spec:
  replicas: 1
  selector:
    matchLabels:
      app: incident-reporter
  template:
    metadata:
      labels:
        app: incident-reporter
    spec:
      containers:
      - name: app
        image: your-registry/incident-reporter:latest
        ports:
        - containerPort: 8080
        env:
        - name: ANTHROPIC_API_KEY
          valueFrom:
            secretKeyRef:
              name: anthropic-secret
              key: api-key
        resources:
          limits:
            cpu: 500m
            memory: 512Mi

Sample Output

Given minimal input, the generator produces:

markdown
# Postmortem: API Latency Spike — Payment Service Degradation
 
**Severity:** P2 | **Duration:** 76 minutes | **Date:** 2026-05-07
 
## Incident Summary
The payment service experienced elevated latency and a 23% error rate between
14:32 and 15:48 IST on May 7, 2026, due to database connection pool exhaustion.
Approximately 4,200 users were affected, with an estimated ₹8.5L in blocked
payment transactions.
 
## Impact
- **Users affected:** ~4,200
- **Error rate:** 23% of payment attempts failed
- **Duration:** 76 minutes
- **Business impact:** ~₹8.5L in blocked transactions
 
## Timeline
| Time (IST) | Event |
|---|---|
| 14:32 | PaymentAPILatencyHigh alert fired |
| 14:35 | On-call engineer paged |
...
 
## Root Cause
Database connection pool exhausted at 50 connections under increased load...
 
## Action Items
| Action | Owner | Due |
|---|---|---|
| Increase DB connection pool limit to 200 | Platform team | 2026-05-14 |
| Add connection pool monitoring alert | Observability team | 2026-05-14 |
...
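
The 76-minute duration in the sample is derived from the start and end times. Rather than relying on the model for that arithmetic, you could compute it in code and include it in the prompt. A minimal sketch, assuming the timestamps are in "YYYY-MM-DD HH:MM" form with the timezone label already stripped; duration_minutes is a hypothetical helper:

```python
from datetime import datetime


def duration_minutes(start: str, end: str, fmt: str = "%Y-%m-%d %H:%M") -> int:
    """Return the whole minutes between two timestamps in the same timezone."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    if delta.total_seconds() < 0:
        raise ValueError("end time is before start time")
    return int(delta.total_seconds() // 60)


minutes = duration_minutes("2026-05-07 14:32", "2026-05-07 15:48")
print(minutes)  # 76
```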

Writing postmortems that are actually useful is hard. AI handles the structure and boilerplate — you focus on the insights and action items. Full source code available on GitHub.
