Build an AI Capacity Forecasting Tool with Prophet + Kubernetes Metrics
Reactive autoscaling fixes problems after they happen. Build a forecasting tool using Facebook's Prophet library on historical Prometheus metrics to predict capacity needs days ahead — before traffic spikes hit.
Karpenter and HPA scale reactively — they respond to load that's already arrived. For predictable patterns (weekday traffic, month-end batch jobs, seasonal spikes), you can do better: forecast the load and pre-scale before it hits, avoiding the cold-start lag of reactive scaling entirely.
Prophet, originally built at Meta for business forecasting, handles seasonality (daily, weekly, yearly patterns) well with minimal tuning — a good fit for infrastructure metrics that follow human usage patterns.
Setup
pip install prophet pandas prometheus-api-client
Step 1: Pull Historical Metrics from Prometheus
# fetch_metrics.py
from datetime import datetime, timedelta

import pandas as pd
from prometheus_api_client import PrometheusConnect

prom = PrometheusConnect(url="http://prometheus.monitoring:9090", disable_ssl=True)


def fetch_cpu_history(namespace: str, days: int = 60) -> pd.DataFrame:
    """Return hourly CPU usage (in cores) for a namespace as a Prophet-ready frame."""
    query = f'sum(rate(container_cpu_usage_seconds_total{{namespace="{namespace}"}}[5m]))'
    end_time = datetime.now()
    start_time = end_time - timedelta(days=days)
    result = prom.custom_query_range(
        query=query,
        start_time=start_time,
        end_time=end_time,
        step="1h",
    )
    # An empty list means the query matched no series in this window.
    if not result:
        raise ValueError(f"No CPU samples returned for namespace {namespace!r}")
    samples = result[0]["values"]  # list of [unix_timestamp, "value"] pairs
    df = pd.DataFrame({
        "ds": pd.to_datetime([float(ts) for ts, _ in samples], unit="s"),  # Prophet requires this exact column name
        "y": [float(value) for _, value in samples],  # and this one
    })
    return df

Four to six weeks of history is the practical minimum for Prophet to detect weekly seasonality reliably. With less than that, forecast quality drops noticeably.
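Whether a series clears that bar can be checked before fitting. The sketch below is a stdlib-only illustration, not part of the original tool: `history_coverage` and `enough_history` are hypothetical helpers that take the epoch-second timestamps returned by the range query and report the span in weeks and the fraction of expected hourly samples that are missing. The 3600-second step mirrors the `step="1h"` used above, and the 10% gap tolerance is an assumption.

```python
def history_coverage(timestamps: list[float], step_seconds: int = 3600) -> dict:
    """Summarise how much history a list of epoch-second timestamps covers."""
    if len(timestamps) < 2:
        return {"weeks": 0.0, "missing_fraction": 1.0}
    ordered = sorted(timestamps)
    span_seconds = ordered[-1] - ordered[0]
    # Number of samples a gap-free series would contain over the same span.
    expected = int(span_seconds // step_seconds) + 1
    missing = max(expected - len(set(ordered)), 0)
    return {
        "weeks": span_seconds / (7 * 24 * 3600),
        "missing_fraction": missing / expected,
    }


def enough_history(timestamps: list[float], min_weeks: float = 4.0,
                   max_missing: float = 0.1) -> bool:
    """True when the series spans min_weeks with few gaps (both thresholds are assumptions)."""
    cov = history_coverage(timestamps)
    return cov["weeks"] >= min_weeks and cov["missing_fraction"] <= max_missing
```

A check like this could run right after the fetch, so that a namespace with only a week or two of retained metrics is skipped rather than forecast badly.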
Step 2: Build and Run the Forecast
import pandas as pd
from prophet import Prophet


def forecast_capacity(df: pd.DataFrame, forecast_days: int = 7) -> pd.DataFrame:
    # Add known events that break normal patterns: sales, releases, etc.
    # Include past occurrences as well; Prophet can only learn an event's
    # effect from dates that fall inside the training history.
    holidays = pd.DataFrame({
        "holiday": ["diwali_sale", "year_end_release"],
        "ds": pd.to_datetime(["2026-10-20", "2026-12-28"]),
        "lower_window": 0,
        "upper_window": 3,  # effect extends 3 days past each date
    })
    model = Prophet(
        daily_seasonality=True,
        weekly_seasonality=True,
        yearly_seasonality=False,  # usually not enough history to trust this
        changepoint_prior_scale=0.05,  # lower = smoother trend, less overfit to noise
        holidays=holidays,  # pass at construction so Prophet validates the frame
    )
    model.fit(df)
    future = model.make_future_dataframe(periods=forecast_days * 24, freq="h")
    forecast = model.predict(future)
    return forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]]

yhat_upper matters more than yhat here. For capacity planning you want the upper bound of the uncertainty interval (80% wide by default), not the point forecast, because under-provisioning is more costly than slightly over-provisioning.
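The asymmetry can be made concrete with a toy cost model. The sketch below is illustrative only: `provisioning_cost`, `total_cost`, the three-hour sample data, and both cost weights are assumptions (a CPU-hour of shortfall is priced at ten times a CPU-hour of idle capacity), not figures from any real cluster.

```python
def provisioning_cost(capacity: float, demand: float,
                      idle_cost: float = 1.0, shortfall_cost: float = 10.0) -> float:
    """Cost of one hour at a given capacity: idle CPUs are cheap, missing CPUs are not."""
    if capacity >= demand:
        return (capacity - demand) * idle_cost
    return (demand - capacity) * shortfall_cost


def total_cost(plan: list[float], actual: list[float]) -> float:
    """Sum the hourly cost of a capacity plan against the demand that was observed."""
    return sum(provisioning_cost(c, d) for c, d in zip(plan, actual))


# Hypothetical three-hour window: point forecast, upper bound, observed demand (CPU cores).
yhat = [10.0, 12.0, 11.0]
yhat_upper = [12.0, 14.0, 13.0]
actual = [11.0, 13.5, 10.0]

cost_on_yhat = total_cost(yhat, actual)         # short in two of the three hours
cost_on_upper = total_cost(yhat_upper, actual)  # never short, some idle capacity
```

With these made-up weights, planning on yhat costs 26.0 while planning on yhat_upper costs 4.5. The gap narrows as the shortfall weight approaches the idle weight, which is the case where planning on the point forecast becomes defensible.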
Step 3: Convert Forecast Into Actionable Scaling Recommendations
def generate_scaling_plan(forecast: pd.DataFrame, current_node_capacity_cpu: float) -> list[dict]:
    plan = []
    # .copy() so adding the "date" column does not write into a slice of forecast.
    # "ds" values are UTC, so on a non-UTC host this cutoff is off by the UTC offset.
    future_only = forecast[forecast["ds"] > pd.Timestamp.now()].copy()
    # Group by day, find peak predicted usage
    future_only["date"] = future_only["ds"].dt.date
    daily_peaks = future_only.groupby("date")["yhat_upper"].max()
    for date, peak_cpu in daily_peaks.items():
        utilization_pct = (peak_cpu / current_node_capacity_cpu) * 100
        if utilization_pct > 80:
            plan.append({
                "date": str(date),
                "predicted_peak_cpu": round(peak_cpu, 2),
                "predicted_utilization_pct": round(utilization_pct, 1),
                "recommendation": "Pre-scale node pool before this date: predicted utilization exceeds 80%",
                # 20% headroom over the predicted peak, minus what is already provisioned
                "suggested_additional_capacity_cpu": round(peak_cpu * 1.2 - current_node_capacity_cpu, 2),
            })
    return plan

Step 4: Act on It by Pre-Scaling Karpenter NodePool Limits
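The plan from Step 3 is expressed in CPU cores, while a node pool grows in whole nodes. A minimal sketch of that conversion follows. The helper names are hypothetical, and `cpu_per_node` and the 80% target utilization are assumptions to be replaced with your instance type's allocatable CPU and your own headroom policy.

```python
import math


def nodes_needed(peak_cpu: float, cpu_per_node: float,
                 target_utilization: float = 0.8) -> int:
    """Smallest node count that keeps peak_cpu at or below the target utilization."""
    if cpu_per_node <= 0 or not 0 < target_utilization <= 1:
        raise ValueError("cpu_per_node must be positive and target_utilization in (0, 1]")
    usable_per_node = cpu_per_node * target_utilization
    return max(math.ceil(peak_cpu / usable_per_node), 0)


def additional_nodes(peak_cpu: float, current_nodes: int, cpu_per_node: float) -> int:
    """Extra nodes to add ahead of the predicted peak (never negative)."""
    return max(nodes_needed(peak_cpu, cpu_per_node) - current_nodes, 0)
```

For example, a predicted peak of 100 cores on 16-core nodes works out to 8 nodes at 80% utilization, so a pool currently running 6 nodes would need 2 more.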
import json
import math
import subprocess
from datetime import date, timedelta


def pre_scale_for_forecast(plan: list[dict], days_ahead: int = 1):
    """Run this daily via cron; checks whether tomorrow needs pre-scaling."""
    target_date = str(date.today() + timedelta(days=days_ahead))
    for entry in plan:
        if entry["date"] == target_date:
            print(f"Pre-scaling for predicted load on {target_date}: "
                  f"+{entry['suggested_additional_capacity_cpu']} CPU needed")
            # Temporarily raise Karpenter NodePool limits ahead of the predicted spike.
            # This lifts the ceiling only; Karpenter still adds nodes when pods are pending.
            cpu_limit = math.ceil(entry["predicted_peak_cpu"] * 1.3)  # whole cores, 30% headroom
            patch = json.dumps({"spec": {"limits": {"cpu": str(cpu_limit)}}})
            subprocess.run(
                ["kubectl", "patch", "nodepool", "default", "--type=merge", "-p", patch],
                check=True,  # fail loudly if kubectl rejects the patch
            )

Validating Forecast Accuracy Before You Trust It
Don't act on forecasts you haven't validated against your own historical data first.
from prophet import Prophet
from prophet.diagnostics import cross_validation, performance_metrics


def validate_model(df: pd.DataFrame, model: Prophet):
    # model must already be fitted; cross_validation reuses its training history,
    # so df is not read here.
    df_cv = cross_validation(model, initial="30 days", period="7 days", horizon="7 days")
    metrics = performance_metrics(df_cv)
    # performance_metrics omits "mape" when any actual value is close to zero.
    columns = [c for c in ("horizon", "mape", "coverage") if c in metrics.columns]
    print(metrics[columns].tail())
    # mape: mean absolute percentage error, reported as a fraction (0.20 = 20%); lower is better
    # coverage: share of actual values that fell within the uncertainty interval;
    # should be close to your interval width (default 80%)

If MAPE is above 20-25% on your data, the seasonality in your traffic may be too irregular for Prophet's defaults. Try tuning changepoint_prior_scale, or accept that the workload may not be predictable enough for this approach to add value over reactive autoscaling alone.
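To see what those two columns measure, here is a stdlib-only sketch of the same quantities computed by hand. It is not Prophet's implementation (which also aggregates over a rolling window of horizons), and the function names are illustrative.

```python
def mape(actual: list[float], predicted: list[float]) -> float:
    """Mean absolute percentage error as a fraction (0.10 means 10%)."""
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors)


def coverage(actual: list[float], lower: list[float], upper: list[float]) -> float:
    """Fraction of actual values that fall inside [lower, upper]."""
    inside = [lo <= a <= hi for a, lo, hi in zip(actual, lower, upper)]
    return sum(inside) / len(inside)
```

The division by the actual value is why MAPE becomes unreliable for workloads that idle near zero overnight: a small absolute miss against a tiny denominator produces a very large percentage error.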
Where This Pays Off Most
Forecasting-based pre-scaling is most valuable for:
- Batch/ETL jobs with known recurring schedules (month-end reports, daily reconciliation)
- E-commerce traffic with predictable daily/weekly cycles and known sale events
- B2B SaaS with strong business-hours seasonality (near-zero load nights/weekends)
It adds little value for workloads with genuinely random, event-driven spikes (breaking news traffic, viral social posts) — for those, fast reactive scaling (Karpenter's sub-minute provisioning) matters more than forecasting.
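One rough way to tell the two kinds of workload apart before investing in a forecast is to measure how strongly the series correlates with itself one day or one week earlier. The sketch below is a plain-Python illustration that assumes hourly samples. The function name, the synthetic data, and the idea of using autocorrelation as a screening test are illustrative additions, not something Prophet provides.

```python
import math


def autocorrelation(series: list[float], lag: int) -> float:
    """Sample autocorrelation of a series at the given lag (between -1 and 1)."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    if variance == 0:
        return 0.0  # a flat series has no pattern to correlate
    covariance = sum(
        (series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag)
    )
    return covariance / variance


# Synthetic hourly load with a clean 24-hour cycle, four weeks long.
hours = range(24 * 28)
daily_cycle = [10 + 5 * math.sin(2 * math.pi * h / 24) for h in hours]

daily_strength = autocorrelation(daily_cycle, lag=24)  # close to 1 for a daily pattern
half_day = autocorrelation(daily_cycle, lag=12)        # close to -1: peaks line up with troughs
```

On real metrics, a value near 1 at lag 24 or lag 168 suggests a daily or weekly rhythm that a seasonal model can exploit, while values near 0 at both lags suggest the spikes are not tied to the clock and reactive scaling is likely the better tool.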
Compare this against reactive scaling: Karpenter vs Cluster Autoscaler