Build a DevOps Automation Bot with LLM Function Calling (2026)
Use Claude or GPT-4o function calling to build a DevOps bot that can check pod status, scale deployments, query logs, and trigger pipelines — all from plain English commands in Slack or terminal.
Instead of teaching your team 50 kubectl commands, what if they could just ask: "Scale the payments service to 5 replicas" or "Why is the checkout pod crashing?" and an AI does it?
This guide builds a DevOps automation bot using LLM function calling — where the model decides which real infrastructure action to take based on your natural language request.
How Function Calling Works
Function calling (also called "tool use") lets you give an LLM a set of functions it can invoke. The model reads your message, decides which function to call, and returns structured arguments — you execute the function and optionally feed the result back.
```
User: "How many replicas does the payments deployment have?"
        │
        ▼
LLM decides: call get_deployment_info(namespace="default", name="payments")
        │
        ▼
Your code runs: kubectl get deployment payments -o json
        │
        ▼
Result fed back to LLM: {"replicas": 3, "available": 3, "image": "payments:v2.1"}
        │
        ▼
LLM responds: "The payments deployment has 3 replicas, all available. Running image payments:v2.1."
```
The LLM never directly touches your infrastructure — it tells your code what to run, and your code runs it safely.
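In code, that handoff is just a dictionary lookup: the model returns a tool name plus JSON arguments, and your code maps the name to a real function. A minimal sketch (the registry and the stubbed `get_deployment_info` here are illustrative, not part of any SDK):

```python
# Minimal tool-dispatch sketch: the model asks for a tool by name,
# your code looks it up in a registry and executes it.
import json

def get_deployment_info(namespace: str, name: str) -> dict:
    # Stand-in for a real Kubernetes API call.
    return {"name": name, "namespace": namespace, "replicas": 3}

TOOL_REGISTRY = {"get_deployment_info": get_deployment_info}

def dispatch(tool_call: dict) -> str:
    """Run the function the model asked for and serialize the result."""
    fn = TOOL_REGISTRY.get(tool_call["name"])
    if fn is None:
        return json.dumps({"error": f"unknown tool {tool_call['name']}"})
    return json.dumps(fn(**tool_call["input"]))

# What the model might return for the question above:
call = {"name": "get_deployment_info",
        "input": {"namespace": "default", "name": "payments"}}
print(dispatch(call))
# → {"name": "payments", "namespace": "default", "replicas": 3}
```

The registry pattern is exactly what the `TOOL_FUNCTIONS` dict below does for the full bot.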
Project Setup
```shell
mkdir devops-bot && cd devops-bot
pip install anthropic kubernetes python-dotenv rich
```

```
devops-bot/
├── bot.py            # Main bot loop
├── tools.py          # DevOps tool implementations
├── k8s_client.py     # Kubernetes client wrapper
└── .env              # ANTHROPIC_API_KEY
```
Step 1: Define the DevOps Tools
```python
# tools.py
from kubernetes import client, config

# Load kubeconfig — in-cluster config when running inside k8s,
# otherwise the local ~/.kube/config.
try:
    config.load_incluster_config()
except config.ConfigException:
    config.load_kube_config()

apps_v1 = client.AppsV1Api()
core_v1 = client.CoreV1Api()


def get_deployment_info(namespace: str, name: str) -> dict:
    """Get details about a Kubernetes deployment."""
    try:
        dep = apps_v1.read_namespaced_deployment(name=name, namespace=namespace)
        return {
            "name": dep.metadata.name,
            "namespace": dep.metadata.namespace,
            "replicas": dep.spec.replicas,
            "available_replicas": dep.status.available_replicas or 0,
            "ready_replicas": dep.status.ready_replicas or 0,
            "image": dep.spec.template.spec.containers[0].image,
            "labels": dep.metadata.labels,
        }
    except client.exceptions.ApiException as e:
        return {"error": f"Deployment not found: {e.reason}"}


def list_pods(namespace: str, label_selector: str = "") -> dict:
    """List pods in a namespace, optionally filtered by labels."""
    try:
        pods = core_v1.list_namespaced_pod(
            namespace=namespace,
            label_selector=label_selector
        )
        pod_list = []
        for pod in pods.items:
            pod_list.append({
                "name": pod.metadata.name,
                "status": pod.status.phase,
                "ready": all(
                    c.ready for c in (pod.status.container_statuses or [])
                ),
                "restarts": sum(
                    c.restart_count for c in (pod.status.container_statuses or [])
                ),
                "node": pod.spec.node_name,
            })
        return {"pods": pod_list, "count": len(pod_list)}
    except client.exceptions.ApiException as e:
        return {"error": str(e)}


def scale_deployment(namespace: str, name: str, replicas: int) -> dict:
    """Scale a deployment to the specified number of replicas."""
    if replicas < 0 or replicas > 50:
        return {"error": f"Replica count {replicas} is out of safe range (0-50)"}
    try:
        apps_v1.patch_namespaced_deployment_scale(
            name=name,
            namespace=namespace,
            body={"spec": {"replicas": replicas}}
        )
        return {
            "success": True,
            "message": f"Scaled {namespace}/{name} to {replicas} replicas"
        }
    except client.exceptions.ApiException as e:
        return {"error": str(e)}


def get_pod_logs(namespace: str, pod_name: str, tail_lines: int = 50) -> dict:
    """Get recent logs from a pod."""
    try:
        logs = core_v1.read_namespaced_pod_log(
            name=pod_name,
            namespace=namespace,
            tail_lines=tail_lines,
            timestamps=True
        )
        return {"logs": logs, "pod": pod_name}
    except client.exceptions.ApiException as e:
        return {"error": str(e)}


def get_pod_events(namespace: str, pod_name: str) -> dict:
    """Get Kubernetes events for a specific pod — useful for debugging crashes."""
    try:
        events = core_v1.list_namespaced_event(
            namespace=namespace,
            field_selector=f"involvedObject.name={pod_name}"
        )
        event_list = [
            {
                "type": e.type,
                "reason": e.reason,
                "message": e.message,
                "count": e.count,
                "last_time": str(e.last_timestamp),
            }
            for e in events.items
        ]
        return {"events": event_list}
    except client.exceptions.ApiException as e:
        return {"error": str(e)}


def list_deployments(namespace: str) -> dict:
    """List all deployments in a namespace."""
    try:
        deps = apps_v1.list_namespaced_deployment(namespace=namespace)
        return {
            "deployments": [
                {
                    "name": d.metadata.name,
                    "replicas": d.spec.replicas,
                    "available": d.status.available_replicas or 0,
                    "image": d.spec.template.spec.containers[0].image,
                }
                for d in deps.items
            ]
        }
    except client.exceptions.ApiException as e:
        return {"error": str(e)}


# Map function names to actual functions
TOOL_FUNCTIONS = {
    "get_deployment_info": get_deployment_info,
    "list_pods": list_pods,
    "scale_deployment": scale_deployment,
    "get_pod_logs": get_pod_logs,
    "get_pod_events": get_pod_events,
    "list_deployments": list_deployments,
}
```

Step 2: Define Tools for the LLM
```python
# tool_definitions.py — what we tell Claude about our tools
TOOLS = [
    {
        "name": "get_deployment_info",
        "description": "Get details about a specific Kubernetes deployment including replica count, available replicas, and current image.",
        "input_schema": {
            "type": "object",
            "properties": {
                "namespace": {"type": "string", "description": "Kubernetes namespace"},
                "name": {"type": "string", "description": "Deployment name"}
            },
            "required": ["namespace", "name"]
        }
    },
    {
        "name": "list_pods",
        "description": "List pods in a Kubernetes namespace. Use label_selector to filter (e.g. 'app=payments').",
        "input_schema": {
            "type": "object",
            "properties": {
                "namespace": {"type": "string"},
                "label_selector": {"type": "string", "description": "Optional label selector like 'app=myapp'"}
            },
            "required": ["namespace"]
        }
    },
    {
        "name": "scale_deployment",
        "description": "Scale a Kubernetes deployment to a specified number of replicas.",
        "input_schema": {
            "type": "object",
            "properties": {
                "namespace": {"type": "string"},
                "name": {"type": "string"},
                "replicas": {"type": "integer", "description": "Target replica count (0-50)"}
            },
            "required": ["namespace", "name", "replicas"]
        }
    },
    {
        "name": "get_pod_logs",
        "description": "Get recent logs from a specific pod. Use when debugging errors or crashes.",
        "input_schema": {
            "type": "object",
            "properties": {
                "namespace": {"type": "string"},
                "pod_name": {"type": "string"},
                "tail_lines": {"type": "integer", "default": 50, "description": "Number of log lines to return"}
            },
            "required": ["namespace", "pod_name"]
        }
    },
    {
        "name": "get_pod_events",
        "description": "Get Kubernetes events for a pod. Essential for diagnosing CrashLoopBackOff, OOMKilled, and scheduling failures.",
        "input_schema": {
            "type": "object",
            "properties": {
                "namespace": {"type": "string"},
                "pod_name": {"type": "string"}
            },
            "required": ["namespace", "pod_name"]
        }
    },
    {
        "name": "list_deployments",
        "description": "List all deployments in a Kubernetes namespace with their current status.",
        "input_schema": {
            "type": "object",
            "properties": {
                "namespace": {"type": "string"}
            },
            "required": ["namespace"]
        }
    }
]
```

Step 3: The Bot Main Loop
```python
# bot.py
import json

import anthropic
from rich.console import Console
from rich.markdown import Markdown

from tools import TOOL_FUNCTIONS
from tool_definitions import TOOLS

console = Console()
claude = anthropic.Anthropic()

SYSTEM_PROMPT = """You are a DevOps assistant with access to Kubernetes cluster tools.
When users ask about deployments, pods, or logs — use your tools to get real data,
then explain it clearly. When scaling or modifying resources, confirm what you're
about to do before executing. Default namespace is 'default' unless specified."""


def run_tool(tool_name: str, tool_input: dict) -> str:
    """Execute a tool and return the result as a string."""
    if tool_name not in TOOL_FUNCTIONS:
        return json.dumps({"error": f"Unknown tool: {tool_name}"})
    console.print(f"[dim]→ Running: {tool_name}({tool_input})[/dim]")
    result = TOOL_FUNCTIONS[tool_name](**tool_input)
    return json.dumps(result, indent=2, default=str)


def chat(messages: list) -> str:
    """Send messages to Claude and handle tool use in a loop."""
    while True:
        response = claude.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=4096,
            system=SYSTEM_PROMPT,
            tools=TOOLS,
            messages=messages,
        )

        if response.stop_reason == "tool_use":
            # Record the assistant's tool-use turn in the history
            messages.append({"role": "assistant", "content": response.content})

            # Execute each requested tool call
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    result = run_tool(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": result,
                    })

            # Feed the results back and loop — Claude responds next turn
            messages.append({"role": "user", "content": tool_results})
            continue

        # No (more) tool calls — return the text response.
        # This also covers stop reasons like max_tokens, so the loop
        # can't spin forever.
        return "".join(
            block.text for block in response.content if hasattr(block, "text")
        )


def main():
    console.print("[bold cyan]DevOps Bot[/bold cyan] — Ask me about your Kubernetes cluster")
    console.print("Type 'exit' to quit\n")
    conversation = []
    while True:
        user_input = console.input("[bold green]You:[/bold green] ").strip()
        if user_input.lower() in ("exit", "quit"):
            break
        if not user_input:
            continue
        conversation.append({"role": "user", "content": user_input})
        response = chat(conversation)
        console.print("\n[bold blue]Bot:[/bold blue]")
        console.print(Markdown(response))
        console.print()
        conversation.append({"role": "assistant", "content": response})


if __name__ == "__main__":
    main()
```

Step 4: Run It
```shell
export ANTHROPIC_API_KEY=sk-ant-your-key
python bot.py
```

DevOps Bot — Ask me about your Kubernetes cluster
You: What's running in the production namespace?
→ Running: list_deployments({"namespace": "production"})
Bot:
Here's what's running in the **production** namespace:
| Deployment | Replicas | Available | Image |
|-----------|----------|-----------|-------|
| api-server | 3 | 3 | api:v2.4.1 |
| payments | 2 | 2 | payments:v1.9.0 |
| frontend | 4 | 3 | frontend:v3.1.2 |
⚠️ **frontend** has only 3/4 replicas available — one pod may be unhealthy.
You: Why is the frontend pod unhealthy?
→ Running: list_pods({"namespace": "production", "label_selector": "app=frontend"})
→ Running: get_pod_events({"namespace": "production", "pod_name": "frontend-7d9f8b-xk2p"})
Bot:
The pod **frontend-7d9f8b-xk2p** is in **CrashLoopBackOff** with 8 restarts.
Events show:
- `OOMKilled` — the container was killed because it exceeded its memory limit (256Mi)
**Recommendation:** The frontend container needs more memory. You can scale vertically
by increasing the memory limit in the deployment spec, or check for a memory leak in
the recent frontend:v3.1.2 release.
You: Scale payments to 4 replicas
→ Running: scale_deployment({"namespace": "production", "name": "payments", "replicas": 4})
Bot:
Done. The **payments** deployment has been scaled from 2 → **4 replicas** in the
production namespace. New pods will be scheduled momentarily.
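Acting on a recommendation like the one above is itself just another tool. A sketch of a hypothetical `set_memory_limit` tool you could add to `tools.py` — the container name and `512Mi` value are examples, and the function assumes the `apps_v1` client defined there:

```python
# Hypothetical extra tool: raise a container's memory limit with a
# strategic merge patch on the deployment. Assumes apps_v1 from tools.py.
def build_memory_patch(container: str, limit: str) -> dict:
    """Build the patch body; kept separate so it's easy to test."""
    return {
        "spec": {
            "template": {
                "spec": {
                    "containers": [
                        {"name": container,
                         "resources": {"limits": {"memory": limit}}}
                    ]
                }
            }
        }
    }

def set_memory_limit(namespace: str, name: str,
                     container: str, limit: str) -> dict:
    try:
        apps_v1.patch_namespaced_deployment(
            name=name, namespace=namespace,
            body=build_memory_patch(container, limit),
        )
        return {"success": True,
                "message": f"Set {namespace}/{name}:{container} memory limit to {limit}"}
    except Exception as e:  # client.exceptions.ApiException in practice
        return {"error": str(e)}
```

Register it in `TOOL_FUNCTIONS` and describe it in `TOOLS`, and the bot can fix the OOMKill it just diagnosed — gated behind the confirmation guardrail below, since it mutates the cluster.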
Add Safety Guardrails
For production use, add confirmation for destructive actions:
```python
DESTRUCTIVE_TOOLS = {"scale_deployment", "delete_pod", "restart_deployment"}

def run_tool_with_confirmation(tool_name: str, tool_input: dict) -> str:
    if tool_name in DESTRUCTIVE_TOOLS:
        console.print(f"\n[yellow]⚠ About to run: {tool_name}({tool_input})[/yellow]")
        confirm = console.input("Confirm? (yes/no): ").strip().lower()
        if confirm != "yes":
            return json.dumps({"cancelled": "User cancelled the action"})
    return run_tool(tool_name, tool_input)
```

Deploy as a Slack Bot
To expose this as a Slack bot, wrap it in a FastAPI endpoint:
```python
import os

from fastapi import FastAPI, Request
from slack_bolt.async_app import AsyncApp
from slack_bolt.adapter.fastapi.async_handler import AsyncSlackRequestHandler

# Bolt also reads SLACK_SIGNING_SECRET from the environment
slack_app = AsyncApp(token=os.environ["SLACK_BOT_TOKEN"])

@slack_app.message("")
async def handle_message(message, say):
    user_text = message.get("text", "")
    conversation = [{"role": "user", "content": user_text}]
    response = chat(conversation)
    await say(response)

# Expose the Slack events endpoint through FastAPI
api = FastAPI()
handler = AsyncSlackRequestHandler(slack_app)

@api.post("/slack/events")
async def slack_events(request: Request):
    return await handler.handle(request)
```

Now your team can ask Kubernetes questions directly in Slack's #devops channel.
Extend with More Tools
```python
# Easy to add more tools:
def trigger_github_actions_workflow(repo: str, workflow: str, ref: str = "main") -> dict:
    """Trigger a GitHub Actions workflow via API."""
    ...

def get_cloudwatch_metrics(service: str, metric: str, period_minutes: int = 30) -> dict:
    """Fetch AWS CloudWatch metrics for a service."""
    ...

def get_recent_deployments(namespace: str, count: int = 5) -> dict:
    """Get the last N deployments across all services."""
    ...
```

Each new function you add expands what your bot can do without changing any of the LLM logic.
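As one example, the GitHub Actions stub could be filled in with nothing but the standard library, using GitHub's workflow-dispatch endpoint. This sketch assumes a `GITHUB_TOKEN` environment variable with workflow scope; `repo` is `"owner/name"` and `workflow` is the workflow filename (e.g. `deploy.yml`):

```python
# Sketch of trigger_github_actions_workflow against GitHub's
# workflow_dispatch API, stdlib only. GITHUB_TOKEN is assumed to be set.
import json
import os
import urllib.error
import urllib.request

def build_dispatch_request(repo: str, workflow: str,
                           ref: str = "main") -> urllib.request.Request:
    """Build the POST request; split out so it can be tested offline."""
    url = (f"https://api.github.com/repos/{repo}"
           f"/actions/workflows/{workflow}/dispatches")
    return urllib.request.Request(
        url,
        data=json.dumps({"ref": ref}).encode(),
        headers={
            "Authorization": f"Bearer {os.environ.get('GITHUB_TOKEN', '')}",
            "Accept": "application/vnd.github+json",
        },
        method="POST",
    )

def trigger_github_actions_workflow(repo: str, workflow: str,
                                    ref: str = "main") -> dict:
    try:
        with urllib.request.urlopen(build_dispatch_request(repo, workflow, ref)) as resp:
            # GitHub returns 204 No Content on a successful dispatch
            return {"success": resp.status == 204, "repo": repo, "ref": ref}
    except urllib.error.HTTPError as e:
        return {"error": f"GitHub API returned {e.code}"}
```

Because it mutates CI state, a tool like this belongs in `DESTRUCTIVE_TOOLS` alongside `scale_deployment`.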
For deeper learning on building AI agents with tool use, the Anthropic documentation on tool use is excellent. For production-grade agent frameworks, LangChain and LlamaIndex build on the same patterns with more batteries included.
Function calling turns an LLM from a text generator into an actual operator that can read and modify your infrastructure — while keeping you in control of what actions are actually executed.