Victoria Metrics vs Thanos vs Cortex: Which Long-Term Prometheus Storage?

Prometheus doesn't scale to multi-cluster, long-term storage on its own. Compare Victoria Metrics, Thanos, and Cortex to pick the right solution for your scale.

Prometheus is great for single-cluster metrics, but it runs into three problems at scale: retention (15 days by default), querying across clusters, and high availability. Victoria Metrics, Thanos, and Cortex all solve these, each very differently.

The Problem They're All Solving

Problem 1: Long-term retention
- Prometheus stores data on local disk (default retention: 15 days; there is no size cap unless you set one)
- Production setups often need 90 days, sometimes a year or more for compliance
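
Local retention is controlled by two Prometheus flags, one time-based and one size-based. A minimal sketch of a container spec fragment, assuming a Kubernetes deployment; the image tag, paths, and the 200GB value are illustrative, not recommendations:

```yaml
# Fragment of a Prometheus container spec (illustrative values)
containers:
  - name: prometheus
    image: prom/prometheus:v2.53.0            # assumed tag
    args:
      - --config.file=/etc/prometheus/prometheus.yml
      - --storage.tsdb.path=/prometheus
      - --storage.tsdb.retention.time=15d     # the default when the flag is omitted
      - --storage.tsdb.retention.size=200GB   # optional; unset means no size-based limit
```

Raising these values only helps as long as a single node's disk can hold the data, which is where the tools below come in.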

Problem 2: Multi-cluster
- 10 clusters = 10 Prometheus servers
- No unified query layer
- Can't correlate metrics across clusters

Problem 3: High availability
- Single Prometheus = single point of failure
- Deduplication needed when running Prometheus in pairs
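
Deduplication across an HA pair depends on how each replica labels its data. A minimal sketch of the approach Thanos deduplicates on, where the two replicas differ only in one external label. The values `cluster1` and `replica-a` are placeholders; the label name `prometheus_replica` is the one the Prometheus Operator uses by default:

```yaml
# prometheus.yml on replica A; replica B is identical except for prometheus_replica
global:
  external_labels:
    cluster: cluster1               # placeholder cluster name
    prometheus_replica: replica-a   # replica B would use replica-b
```

VictoriaMetrics takes the opposite approach: its deduplication expects both replicas to send identical label sets, so the replica label would be omitted there.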

Victoria Metrics

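VictoriaMetrics is a Prometheus-compatible time series database that is significantly more resource-efficient. It accepts Prometheus remote write and answers PromQL queries, so it can sit behind your existing scrapers and dashboards. Use the cluster version (VMCluster in the operator) for distributed mode.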

Architecture:

Write path:  Prometheus (scraper) → vminsert → vmstorage (replicated)
Read path:   Grafana → vmselect → vmstorage
Backups:     vmstorage → vmbackup → S3

Key advantages:

- Uses substantially less RAM and disk than Prometheus (up to 5-10x, depending on the workload)
- Handles 1M+ samples/second with the cluster version
- Prometheus-compatible query language (MetricsQL extends PromQL)
- Simple horizontal scaling (add more vminsert, vmselect, or vmstorage nodes)

Setup:

yaml

# victoria-metrics-cluster.yaml
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMCluster
metadata:
  name: production
spec:
  retentionPeriod: "90d"  # 90 days; a bare number would be read as months
  replicationFactor: 2
  vmstorage:
    replicaCount: 3
    resources:
      requests:
        memory: "8Gi"
        cpu: "2"
    storage:
      volumeClaimTemplate:
        spec:
          resources:
            requests:
              storage: 500Gi
  vmselect:
    replicaCount: 2
    resources:
      requests:
        memory: "2Gi"
  vminsert:
    replicaCount: 2
    resources:
      requests:
        memory: "1Gi"
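
If two Prometheus replicas write the same series into this cluster, VictoriaMetrics can collapse the duplicates with its `-dedup.minScrapeInterval` flag. A hedged sketch of how that might be passed through the operator's `extraArgs` map, assuming both replicas send identical label sets and scrape every 30 seconds (the interval is an assumption):

```yaml
# Fragment to merge into the VMCluster spec (illustrative interval)
spec:
  vmselect:
    extraArgs:
      dedup.minScrapeInterval: 30s   # should match the Prometheus scrape interval
  vmstorage:
    extraArgs:
      dedup.minScrapeInterval: 30s
```

Setting the flag on both components deduplicates at query time and during background merges on disk.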

Remote write from Prometheus:

yaml

# prometheus.yml
remote_write:
  - url: http://vminsert.monitoring:8480/insert/0/prometheus/api/v1/write
    queue_config:
      max_samples_per_send: 10000
      capacity: 30000

Best for:

- Single cluster or multi-cluster with a central store
- Teams that want simplicity over Thanos/Cortex's complexity
- Anyone hitting Prometheus memory limits

Thanos

Thanos adds a sidecar to each Prometheus instance, uploads blocks to S3, and provides a unified query layer.

Architecture:

Prometheus + Thanos Sidecar → S3 (long-term storage)
                   ↓
Thanos Store → Thanos Query → Grafana
Thanos Compact (compaction, downsampling, retention)

Setup with Helm:

bash

helm repo add bitnami https://charts.bitnami.com/bitnami
helm install thanos bitnami/thanos \
  --set query.enabled=true \
  --set storegateway.enabled=true \
  --set compactor.enabled=true \
  --set bucketweb.enabled=false \
  --set storegateway.persistence.size=20Gi \
  --set minio.enabled=false \
  --set objstoreConfig="type: S3
config:
  bucket: my-thanos-bucket
  region: us-east-1
  endpoint: s3.amazonaws.com"
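
Multi-line values passed through `--set` are easy to get wrong in a shell, so the same settings can live in a values file instead. A sketch assuming the chart's `storegateway.*` keys and a hypothetical file name `thanos-values.yaml`:

```yaml
# thanos-values.yaml (hypothetical file name)
query:
  enabled: true
storegateway:
  enabled: true
  persistence:
    size: 20Gi
compactor:
  enabled: true
bucketweb:
  enabled: false
minio:
  enabled: false
objstoreConfig: |-
  type: S3
  config:
    bucket: my-thanos-bucket
    region: us-east-1
    endpoint: s3.amazonaws.com
```

It would then be applied with `helm install thanos bitnami/thanos -f thanos-values.yaml`.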

Multi-cluster setup:

yaml

# Each cluster runs Prometheus + sidecar
# thanos-sidecar.yaml
containers:
  - name: thanos-sidecar
    image: quay.io/thanos/thanos:v0.35.0
    args:
      - sidecar
      - --tsdb.path=/data
      - --prometheus.url=http://localhost:9090
      - --grpc-address=0.0.0.0:10901
      - --http-address=0.0.0.0:10902
      - --objstore.config-file=/etc/thanos/objstore.yaml

yaml

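# Central Thanos Query fans out to every sidecar and the store gateway
apiVersion: apps/v1
kind: Deployment
metadata:
  name: thanos-query
spec:
  replicas: 1
  selector:
    matchLabels:
      app: thanos-query
  template:
    metadata:
      labels:
        app: thanos-query
    spec:
      containers:
        - name: thanos-query
          image: quay.io/thanos/thanos:v0.35.0
          args:
            - query
            # --endpoint replaces the deprecated --store flag
            - --endpoint=thanos-sidecar-cluster1:10901
            - --endpoint=thanos-sidecar-cluster2:10901
            - --endpoint=thanos-store:10901
            # must match the external label that distinguishes HA replicas
            - --query.replica-label=prometheus_replica
            - --query.auto-downsampling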

Best for:

- Multi-cluster environments where you keep Prometheus as the scraper
- Teams already using Prometheus and want minimal disruption
- When you need unlimited retention via S3

Downsides:

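- More components to manage (sidecar, store gateway, query, compactor)
- Compactor must run as a single instance per bucket (it can be sharded by block labels, but not scaled freely)
- Queries over large time ranges can be slow, since blocks are fetched from object storage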
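
Downsampling and per-resolution retention are the main levers for keeping long-range queries manageable, and both are configured on the compactor. A minimal sketch of a compactor container, with illustrative retention values and an assumed data directory:

```yaml
# thanos-compact container fragment (illustrative retention values)
containers:
  - name: thanos-compact
    image: quay.io/thanos/thanos:v0.35.0
    args:
      - compact
      - --wait                              # keep running instead of exiting after one pass
      - --data-dir=/var/thanos/compact      # assumed scratch path
      - --objstore.config-file=/etc/thanos/objstore.yaml
      - --retention.resolution-raw=30d      # raw samples
      - --retention.resolution-5m=180d      # 5-minute downsampled data
      - --retention.resolution-1h=0d        # 0d keeps 1-hour data indefinitely
```

Run exactly one such instance against a given bucket; two compactors working on the same blocks can corrupt them.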

Cortex

Cortex is a horizontally scalable, multi-tenant Prometheus backend. Grafana Cloud originally ran on it; Grafana Labs has since moved to Mimir, its fork of Cortex.

Architecture:

Prometheus → Cortex Distributor → Cortex Ingester (WAL) → S3
                                      ↓
Cortex Store Gateway → Cortex Querier → Grafana

Key advantages:

- True multi-tenancy (different teams get isolated metric namespaces)
- Horizontal scaling for every component
- Per-tenant rate limiting and quotas
- Active-active HA by default
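
Per-tenant limits are typically set in a runtime overrides file that Cortex reloads periodically. A hedged sketch, assuming two tenants named `team-a` and `team-b` and a file passed via `-runtime-config.file`; the tenant names, file name, and numbers are all illustrative:

```yaml
# runtime-config.yaml (hypothetical file name)
overrides:
  team-a:
    ingestion_rate: 50000          # samples per second
    ingestion_burst_size: 100000
    max_series_per_user: 1500000   # cap on active series for this tenant
  team-b:
    ingestion_rate: 10000
    max_series_per_user: 300000
```

Tenants without an entry fall back to the defaults in the main `limits` config block.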

When to use Cortex:

- You're building an internal metrics platform for multiple teams or products
- You need per-tenant isolation (team A can't see team B's metrics)
- You're at SaaS scale (millions of active series)

Setup (simplified):

yaml

# cortex minimal config
target: all  # or run each component separately at scale
auth_enabled: false  # true for multi-tenant
 
distributor:
  ring:
    kvstore:
      store: consul
 
ingester:
  lifecycler:
    ring:
      kvstore:
        store: consul
      replication_factor: 3
 
storage:
  engine: blocks
 
blocks_storage:
  s3:
    bucket_name: my-cortex-blocks
    region: us-east-1
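
With `auth_enabled: true`, Cortex identifies the tenant from the `X-Scope-OrgID` header on each request. A minimal sketch of the Prometheus side, assuming a hypothetical `cortex-distributor.monitoring` service on Cortex's default HTTP port and a tenant called `team-a`:

```yaml
# prometheus.yml for team A (hypothetical service name and tenant ID)
remote_write:
  - url: http://cortex-distributor.monitoring/api/v1/push
    headers:
      X-Scope-OrgID: team-a   # tenant ID; queries must send the same header
```

In practice an authenticating proxy usually sets this header, so that tenants cannot impersonate each other.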

Best for:

- Platform engineering teams building internal monitoring-as-a-service
- Multi-tenant environments (SaaS products, enterprises with many teams)
- When you need Kubernetes-native horizontal scaling for everything

Comparison Table

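| | Victoria Metrics | Thanos | Cortex |
|---|---|---|---|
| Complexity | Low | Medium | High |
| Multi-tenancy | Yes in the cluster version (tenant ID in the URL path, e.g. /insert/0/) | Limited (external labels; Thanos Receive adds tenant support) | Yes (natively) |
| Multi-cluster | Yes (remote write to a central store) | Yes (native) | Yes (native) |
| Prometheus-compatible | Yes | Yes | Yes |
| Long-term storage | Local disk (S3 for backups via vmbackup) | Object storage (S3) required | Object storage (S3) required |
| Resource efficiency | Excellent | Good | Good |
| Best at | Single/small multi-cluster | Multi-cluster, existing Prometheus | Large scale, multi-tenant |
| Managed option | VictoriaMetrics Cloud | No first-party offering | Grafana Cloud (now runs Mimir) |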

What Should You Choose?

Choose Victoria Metrics if:

- You're scaling a single cluster or a handful of clusters
- You're hitting Prometheus memory limits
- You want the simplest possible setup with long retention

Choose Thanos if:

- You have 5+ Prometheus instances across clusters/regions
- You want to keep Prometheus as the scraper unchanged
- You're okay managing more components

Choose Cortex/Mimir if:

- You're building a shared metrics platform for multiple teams
- Multi-tenancy is a requirement
- You're ingesting 1M+ samples/second

For most DevOps teams: Victoria Metrics covers most of what Thanos offers with far fewer components to run. Start there.

Resources: VictoriaMetrics docs | Thanos | Grafana Mimir (Cortex fork, actively maintained)
