Victoria Metrics vs Thanos vs Cortex: Which Long-Term Prometheus Storage?
Prometheus doesn't scale to multi-cluster, long-term storage on its own. Compare Victoria Metrics, Thanos, and Cortex to pick the right solution for your scale.
Prometheus is great for single-cluster metrics, but it runs into three problems at scale: retention (15 days by default), querying across clusters, and high availability. VictoriaMetrics, Thanos, and Cortex all solve these, in very different ways.
The Problem They're All Solving
Problem 1: Long-term retention
- Prometheus stores data on local disk (default retention: 15 days; there is no size cap unless you set --storage.tsdb.retention.size)
- Production needs 90 days, sometimes 1+ year for compliance
Problem 2: Multi-cluster
- 10 clusters = 10 Prometheus servers
- No unified query layer
- Can't correlate metrics across clusters
Problem 3: High availability
- Single Prometheus = single point of failure
- Deduplication needed when running Prometheus in pairs
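As a rough sketch of what running Prometheus in pairs looks like in practice: both replicas scrape the same targets, and each one tags its data with a distinguishing external label so a downstream query layer can tell the copies apart and merge them. The cluster name and replica values below are placeholders; the label name prometheus_replica is chosen to match the --query.replica-label value used in the Thanos Query example later in this article.

```yaml
# prometheus.yml for the first replica of an HA pair
global:
  external_labels:
    cluster: cluster1         # placeholder cluster name
    prometheus_replica: "a"   # set to "b" on the second replica; everything else is identical
```

Whatever storage backend you pick, it needs to know which label identifies the replica so it can drop the duplicate samples.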
Victoria Metrics
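VictoriaMetrics is a Prometheus-compatible time series database: it accepts Prometheus remote write and answers PromQL queries, so it can replace Prometheus as the storage and query backend while using significantly less RAM and disk. For distributed mode, use the cluster version (vmcluster), which splits into vminsert, vmstorage, and vmselect.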
Architecture:
Write path: Prometheus (scraper) → vminsert → vmstorage (replicated)
Read path: Grafana → vmselect → vmstorage
Backup: vmstorage → vmbackup → S3
Key advantages:
- Uses several times less RAM and disk than Prometheus for the same workload (5-10x is often cited; actual savings depend on cardinality and churn)
- Ingests 1M+ samples/second with vmcluster
- Prometheus-compatible query language (MetricsQL extends PromQL)
- Simple horizontal scaling (add more vminsert/vmstorage nodes)
Setup:
# victoria-metrics-cluster.yaml
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMCluster
metadata:
  name: production
spec:
  retentionPeriod: "90d"  # 90 days; a bare number such as "90" is read as months
  replicationFactor: 2
  vmstorage:
    replicaCount: 3
    resources:
      requests:
        memory: "8Gi"
        cpu: "2"
    storage:
      volumeClaimTemplate:
        spec:
          resources:
            requests:
              storage: 500Gi
  vmselect:
    replicaCount: 2
    resources:
      requests:
        memory: "2Gi"
  vminsert:
    replicaCount: 2
    resources:
      requests:
        memory: "1Gi"
Remote write from Prometheus:
# prometheus.yml
remote_write:
  - url: http://vminsert.monitoring:8480/insert/0/prometheus/api/v1/write
    queue_config:
      max_samples_per_send: 10000
      capacity: 30000
Best for:
- Single cluster or multi-cluster with a central store
- Teams that want simplicity over Thanos/Cortex's complexity
- Anyone hitting Prometheus memory limits
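The remote write config covers the write path; the read path goes through vmselect. A minimal sketch of a Grafana datasource provisioning file for that side is shown here. The service name vmselect.monitoring is an assumption that mirrors the vminsert host used above, 8481 is vmselect's default HTTP port, and the 0 in the path is the same tenant ID as in the remote write URL.

```yaml
# Grafana datasource provisioning file (file name and location are up to you)
apiVersion: 1
datasources:
  - name: VictoriaMetrics
    type: prometheus            # vmselect speaks the Prometheus query API
    access: proxy
    url: http://vmselect.monitoring:8481/select/0/prometheus  # assumed service name
```

Because the datasource type is plain prometheus, existing dashboards should keep working without changes to their PromQL.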
Thanos
Thanos adds a sidecar to each Prometheus instance, uploads blocks to S3, and provides a unified query layer.
Architecture:
Prometheus + Thanos Sidecar → S3 (long-term storage)
↓
Thanos Store → Thanos Query → Grafana
Thanos Compact (deduplication, downsampling)
Setup with Helm:
helm repo add bitnami https://charts.bitnami.com/bitnami
# The chart's store component is named storegateway
helm install thanos bitnami/thanos \
  --set query.enabled=true \
  --set storegateway.enabled=true \
  --set compactor.enabled=true \
  --set bucketweb.enabled=false \
  --set storegateway.persistence.size=20Gi \
  --set minio.enabled=false \
  --set objstoreConfig="type: S3
config:
  bucket: my-thanos-bucket
  region: us-east-1
  endpoint: s3.amazonaws.com"
Multi-cluster setup:
# Each cluster runs Prometheus + sidecar
# thanos-sidecar.yaml
containers:
  - name: thanos-sidecar
    image: quay.io/thanos/thanos:v0.35.0
    args:
      - sidecar
      - --tsdb.path=/data
      - --prometheus.url=http://localhost:9090
      - --grpc-address=0.0.0.0:10901
      - --http-address=0.0.0.0:10902
      - --objstore.config-file=/etc/thanos/objstore.yaml
# Central Thanos Query discovers all sidecars
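apiVersion: apps/v1
kind: Deployment
metadata:
  name: thanos-query
spec:
  selector:
    matchLabels:
      app: thanos-query
  template:
    metadata:
      labels:
        app: thanos-query
    spec:
      containers:
        - name: thanos-query
          image: quay.io/thanos/thanos:v0.35.0
          args:
            - query
            # --endpoint replaces the deprecated --store flag
            - --endpoint=thanos-sidecar-cluster1:10901
            - --endpoint=thanos-sidecar-cluster2:10901
            - --endpoint=thanos-store:10901
            - --query.replica-label=prometheus_replica
            - --query.auto-downsampling
Best for: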
- Multi-cluster environments where you keep Prometheus as the scraper
- Teams already using Prometheus and want minimal disruption
- When you need unlimited retention via S3
Downsides:
- More components to manage (sidecar, store, query, compactor)
- Compactor must run as a single instance per bucket (adding replicas does not scale it)
- Query performance can be slow for large time ranges
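The sidecar arguments above point at /etc/thanos/objstore.yaml without showing it. A minimal sketch of that file, reusing the bucket, region, and endpoint from the Helm command, might look like the following. It assumes credentials come from the environment (for example an IAM role), which is why no access keys appear.

```yaml
# /etc/thanos/objstore.yaml, mounted into the sidecar container
type: S3
config:
  bucket: my-thanos-bucket
  endpoint: s3.amazonaws.com
  region: us-east-1
  # access_key and secret_key are omitted on the assumption that
  # credentials are supplied by the environment (e.g. an IAM role)
```

The store gateway and compactor need the same bucket configuration, since all three components read or write the same set of blocks.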
Cortex
Cortex is a horizontally scalable, multi-tenant Prometheus backend. Grafana Cloud originally ran on it; Grafana Labs has since moved to Mimir, its own fork of Cortex.
Architecture:
Prometheus → Cortex Distributor → Cortex Ingester (WAL) → S3
↓
Cortex Store Gateway → Cortex Querier → Grafana
Key advantages:
- True multi-tenancy (different teams get isolated metric namespaces)
- Horizontal scaling for every component
- Per-tenant rate limiting and quotas
- Active-active HA by default
When to use Cortex:
- You're building an internal metrics platform for multiple teams/products
- You need per-tenant isolation (team A can't see team B's metrics)
- You're at SaaS scale (millions of active series)
Setup (simplified):
# cortex minimal config
target: all  # or run each component separately at scale
auth_enabled: false  # true for multi-tenant
distributor:
  ring:
    kvstore:
      store: consul
ingester:
  lifecycler:
    ring:
      kvstore:
        store: consul
      replication_factor: 3
storage:
  engine: blocks
blocks_storage:
  s3:
    bucket_name: my-cortex-blocks
    region: us-east-1
Best for:
- Platform engineering teams building internal monitoring-as-a-service
- Multi-tenant environments (SaaS products, enterprises with many teams)
- When you need Kubernetes-native horizontal scaling for everything
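To make the tenancy model concrete: once auth_enabled is set to true, Cortex identifies the tenant from the X-Scope-OrgID HTTP header on every write and query. A minimal sketch of one team's Prometheus remote write config follows. The distributor service name, port, and tenant ID are placeholders for illustration; /api/v1/push is the Cortex push endpoint.

```yaml
# prometheus.yml for team A's Prometheus
remote_write:
  - url: http://cortex-distributor.monitoring:8080/api/v1/push  # placeholder host and port
    headers:
      X-Scope-OrgID: team-a   # placeholder tenant ID; queries must send the same header
```

In a real deployment the header is usually injected by an authenticating proxy in front of Cortex rather than trusted from the client, since anyone who can set it can read or write as that tenant.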
Comparison Table
| | VictoriaMetrics | Thanos | Cortex |
|---|---|---|---|
| Complexity | Low | Medium | High |
| Multi-tenancy | Basic (vmcluster routes by tenant ID in the URL path, e.g. /insert/0/) | Limited (Thanos Receive only) | Yes (native, with per-tenant limits) |
| Multi-cluster | Yes (via remote write to a central store) | Yes (native) | Yes (native) |
| Prometheus-compatible | Yes | Yes | Yes |
| Long-term storage | Built-in (local disk; S3 for backups via vmbackup) | S3 required | S3 required |
| Resource efficiency | Excellent | Good | Good |
| Best at | Single/small multi-cluster | Multi-cluster, existing Prometheus | Large scale, multi-tenant |
| Managed option | Managed VictoriaMetrics | Grafana Cloud | Grafana Cloud |
What Should You Choose?
Choose Victoria Metrics if:
- You're scaling a single cluster or a handful of clusters
- You're hitting Prometheus memory limits
- You want the simplest possible setup with long retention
Choose Thanos if:
- You have 5+ Prometheus instances across clusters/regions
- You want to keep Prometheus as the scraper unchanged
- You're okay managing more components
Choose Cortex/Mimir if:
- You're building a shared metrics platform for multiple teams
- Multi-tenancy is a requirement
- You're at 1M+ samples/second scale
For most DevOps teams: Victoria Metrics gets you 90% of what Thanos offers with half the complexity. Start there.
Resources: VictoriaMetrics docs | Thanos | Grafana Mimir (Cortex fork, actively maintained)