Elasticsearch vs OpenSearch vs Loki for Log Storage 2026
Choosing a log storage backend? Elasticsearch, OpenSearch, and Grafana Loki have very different architectures, costs, and use cases. Here's the honest comparison with Kubernetes examples.
Three tools dominate log storage in 2026. They solve different problems, and picking the wrong one costs you months of pain and serious money.
Quick Decision Table
| If you... | Use... |
|---|---|
| Already on Elastic Cloud / ELK | Elasticsearch |
| Want open-source Elastic alternative | OpenSearch |
| Use Grafana stack (Prometheus + Grafana) | Loki |
| Need full-text search over logs | Elasticsearch or OpenSearch |
| Primary metric: storage cost | Loki (10-30x cheaper) |
| Kubernetes-native, label-based queries | Loki |
| Compliance: need field-level security | OpenSearch |
Architecture Differences
Elasticsearch
Full-text inverted index on every log field. Every log line gets parsed, indexed, stored.
- Storage: High (indexes everything)
- Query: Fastest for complex full-text search
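- License: AGPL-3.0, SSPL, or Elastic License 2.0 (releases from 7.11 until the AGPL option was added in 2024 were SSPL/ELv2 only, which is not OSI open source)
- Managed: Elastic Cloud; Amazon Elasticsearch Service is now Amazon OpenSearch Service and only runs legacy Elasticsearch versions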
OpenSearch
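AWS fork of Elasticsearch 7.10 (the last Apache 2.0 version). API-compatible with Elasticsearch 7.10, but newer official Elasticsearch clients check which product they are talking to and refuse to connect, so use the OpenSearch clients.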
- Storage: Similar to Elasticsearch
- Query: Near-identical to Elasticsearch
- License: Apache 2.0 (truly open source)
- Managed: AWS OpenSearch Service
Grafana Loki
Indexes only log labels (not log content). Stores compressed log chunks in object storage (S3/GCS).
- Storage: Very low (labels only indexed, content in object storage)
- Query: LogQL — fast for label-based queries, slower for full-text grep
- License: AGPL-3.0
- Managed: Grafana Cloud
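
The difference between the two indexing models can be sketched in a few lines of Python. This is a toy model for intuition only, not how either system is implemented: the sample log lines, the whitespace tokenizer, and every function name below are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample data: (labels, log line)
LOGS = [
    ({"app": "api", "env": "prod"}, "GET /health 200"),
    ({"app": "api", "env": "prod"}, "NullPointerException in handler"),
    ({"app": "worker", "env": "prod"}, "job 42 finished"),
]

def build_inverted_index(logs):
    """Elasticsearch/OpenSearch style: map every token to the lines containing it."""
    index = defaultdict(set)
    for line_id, (_labels, text) in enumerate(logs):
        for token in text.lower().split():
            index[token].add(line_id)
    return index

def build_label_index(logs):
    """Loki style: map each label set to its stream; line content is not indexed."""
    streams = defaultdict(list)
    for line_id, (labels, _text) in enumerate(logs):
        streams[frozenset(labels.items())].append(line_id)
    return streams

def search_inverted(index, word):
    """A single dictionary lookup; no log content is scanned."""
    return sorted(index.get(word.lower(), set()))

def search_labels_then_grep(streams, logs, selector, needle):
    """Select streams by label, then scan every line in those streams."""
    wanted = set(selector.items())
    hits = []
    for label_set, line_ids in streams.items():
        if wanted <= label_set:
            hits.extend(i for i in line_ids if needle in logs[i][1])
    return sorted(hits)

inverted = build_inverted_index(LOGS)
streams = build_label_index(LOGS)
print(search_inverted(inverted, "NullPointerException"))
print(search_labels_then_grep(streams, LOGS, {"app": "api"}, "NullPointerException"))
```

Both searches find the same line, but the inverted index pays for it at write time (one entry per token), while the label index stays tiny and pays at query time by scanning the selected streams.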
Storage Cost Comparison (real numbers)
For 100 GB/day log ingestion:
| Backend | Monthly Storage | Monthly Cost (approx) |
|---|---|---|
| Elasticsearch | ~500 GB (with index) | $300–500 (self-hosted) |
| OpenSearch | ~500 GB | $200–400 (AWS) |
| Loki | ~50–80 GB (compressed chunks) | $10–30 (S3) |
Loki's cost advantage comes from storing compressed raw logs in S3 and only indexing labels.
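
To run the same arithmetic for your own volumes, a rough estimator might look like the sketch below. Every input in the example calls (retention, stored-to-raw ratio, price per GB-month) is a hypothetical placeholder rather than a measured value, and the function covers storage only, not compute, replicas, or request charges.

```python
def monthly_storage_cost(ingest_gb_per_day, retention_days, stored_ratio, usd_per_gb_month):
    """Return (stored_gb, usd_per_month) at steady state.

    stored_ratio is stored bytes / raw bytes: above 1.0 when indexing
    inflates the data, below 1.0 when compression shrinks it.
    """
    stored_gb = ingest_gb_per_day * retention_days * stored_ratio
    return stored_gb, stored_gb * usd_per_gb_month

# Hypothetical inputs for illustration only; substitute your own measurements.
indexed = monthly_storage_cost(100, 7, 1.1, 0.10)    # block storage, index overhead
chunked = monthly_storage_cost(100, 7, 0.15, 0.023)  # object storage, compressed chunks

print(f"indexed: {indexed[0]:.0f} GB, ${indexed[1]:.2f}/month")
print(f"chunked: {chunked[0]:.0f} GB, ${chunked[1]:.2f}/month")
```

The two levers are visible in the formula: object storage lowers the price per GB, and skipping the content index lowers the number of GB stored.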
Kubernetes Setup: Loki Stack
# loki-values.yaml (Helm)
# Key paths differ between chart versions; compare with the chart's own values.yaml.
loki:
  auth_enabled: false
  storage:
    type: s3
    s3:
      endpoint: s3.amazonaws.com
      region: us-east-1
      bucketnames: my-loki-logs
      # ${...} is only expanded when Loki runs with -config.expand-env=true
      access_key_id: ${AWS_ACCESS_KEY_ID}
      secret_access_key: ${AWS_SECRET_ACCESS_KEY}
  schema_config:
    configs:
      - from: "2024-01-01"
        store: tsdb
        object_store: s3
        schema: v13
        index:
          prefix: loki_index_
          period: 24h
  limits_config:
    retention_period: 30d
    ingestion_rate_mb: 16
    ingestion_burst_size_mb: 32
promtail:
  config:
    clients:
      - url: http://loki:3100/loki/api/v1/push

helm repo add grafana https://grafana.github.io/helm-charts
helm install loki grafana/loki-stack \
  --namespace monitoring \
  --create-namespace \
  --values loki-values.yaml

Kubernetes Setup: OpenSearch
# opensearch-values.yaml
# The opensearch chart reads these keys at the top level of the values file.
replicas: 3
resources:
  requests:
    memory: "2Gi"
    cpu: "1000m"
  limits:
    memory: "4Gi"
persistence:
  size: 100Gi
config:
  opensearch.yml: |
    cluster.name: logs-cluster
    plugins.security.disabled: false
    indices.query.bool.max_clause_count: 4096

helm repo add opensearch https://opensearch-project.github.io/helm-charts/
helm install opensearch opensearch/opensearch \
  --namespace logging \
  --create-namespace \
  --values opensearch-values.yaml
# OpenSearch Dashboards ships as a separate chart
helm install opensearch-dashboards opensearch/opensearch-dashboards \
  --namespace logging \
  --set service.type=ClusterIP

Query Language Comparison
Loki (LogQL)
# All error logs from a specific app (set the time range to the last hour in Grafana or the API request)
{app="payment-service", env="production"} |= "ERROR"

# Per-second rate of error lines, averaged over a 1-minute window
rate({app="payment-service"} |= "ERROR" [1m])

# Parse JSON logs and filter by field
{app="api"} | json | status_code >= 500

# Count log lines by level
sum by (level) (count_over_time({app="api"} | json [5m]))

Elasticsearch/OpenSearch (Query DSL)
{
  "query": {
    "bool": {
      "must": [
        { "match": { "level": "ERROR" } },
        { "match": { "service.name": "payment-service" } }
      ],
      "filter": [
        { "range": { "@timestamp": { "gte": "now-1h" } } }
      ]
    }
  },
  "aggs": {
    "errors_over_time": {
      "date_histogram": {
        "field": "@timestamp",
        "fixed_interval": "1m"
      }
    }
  }
}

When Full-Text Search Matters
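Loki's biggest limitation: it cannot search log content efficiently without a narrow label selector. Every LogQL query needs a stream selector, and if you search for "NullPointerException" with only a broad one (for example {env="production"}), Loki has to fetch and scan every chunk in the matching streams, which is slow.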
Elasticsearch/OpenSearch index every word — searching "NullPointerException" across 1 billion logs is fast.
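
As a sketch of what such a content search looks like from a script, the helper below builds a Query DSL body for a phrase search limited to a time window. The function name and the message field are assumptions for illustration; use whichever field holds the log text in your mapping. The body would be sent to the search endpoint of your log index.

```python
import json

def build_fulltext_query(phrase, since="now-1h", text_field="message", size=50):
    """Build a search body: full-text match on phrase, filtered to a time window."""
    return {
        "size": size,
        "query": {
            "bool": {
                # Scored full-text clause; text_field is an assumed field name
                "must": [{"match_phrase": {text_field: phrase}}],
                # Filter context: no scoring, results are cacheable
                "filter": [{"range": {"@timestamp": {"gte": since}}}],
            }
        },
        "sort": [{"@timestamp": {"order": "desc"}}],
    }

body = build_fulltext_query("NullPointerException")
print(json.dumps(body, indent=2))
```

Note that the query carries no service or pod constraint at all. The inverted index makes that affordable, which is exactly the case where Loki struggles.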
Rule: If your team routinely searches log content (not just filters by service/pod), use Elasticsearch or OpenSearch. If you filter by labels first and then look at content, Loki is faster and cheaper.
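
The label-first pattern from the rule above can be expressed against Loki's HTTP API roughly as follows. This is a minimal sketch that only builds the request URL for the query_range endpoint; the helper name is made up, the timestamps are arbitrary nanosecond epoch values, and the base URL assumes the in-cluster service address used in the Helm example.

```python
from urllib.parse import urlencode

def build_loki_query_url(base_url, selector, needle, start_ns, end_ns, limit=100):
    """Build a query_range URL: label selector first, then a line filter."""
    # Escape backslashes and quotes so the needle is a valid LogQL string
    escaped = needle.replace("\\", "\\\\").replace('"', '\\"')
    logql = f'{selector} |= "{escaped}"'
    params = {"query": logql, "start": start_ns, "end": end_ns, "limit": limit}
    return f"{base_url.rstrip('/')}/loki/api/v1/query_range?{urlencode(params)}"

url = build_loki_query_url(
    "http://loki:3100",
    '{app="payment-service", env="production"}',
    "NullPointerException",
    start_ns=1_700_000_000_000_000_000,  # arbitrary example window (1 hour)
    end_ns=1_700_003_600_000_000_000,
)
print(url)
```

The narrower the selector and the time window, the fewer chunks Loki has to download and scan before applying the line filter.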
Data Retention Strategies
# Loki: compactor handles retention
loki:
  compactor:
    retention_enabled: true
    retention_delete_delay: 2h
    retention_delete_worker_count: 150
    # Required by Loki 3.x when retention is enabled
    delete_request_store: s3
  limits_config:
    retention_period: 30d
    # Per-tenant override
    per_tenant_override_config: /etc/loki/overrides.yaml

# OpenSearch: ISM policy for retention
{
  "policy": {
    "description": "30-day log retention",
    "default_state": "hot",
    "states": [
      {
        "name": "hot",
        "actions": [],
        "transitions": [
          {
            "state_name": "delete",
            "conditions": { "min_index_age": "30d" }
          }
        ]
      },
      {
        "name": "delete",
        "actions": [{ "delete": {} }]
      }
    ]
  }
}

Recommended Setup for Most Teams
Small teams / startups: Loki + Grafana. Low cost, and it integrates with the Prometheus stack you already run.
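Mid-size with existing ELK: Migrate from Elasticsearch to OpenSearch (compatible with the Elasticsearch 7.10 APIs, Apache 2.0 licensed). Clusters on Elasticsearch versions newer than 7.10 need a reindex rather than an in-place upgrade. Use Fluent Bit as shipper.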
Enterprise with compliance needs: OpenSearch with Security plugin (field-level security, audit logging, SAML/OIDC).
Hybrid: Use Loki for Kubernetes pod logs (high volume, label-based) + OpenSearch for application audit logs (need full-text search).
Loki wins on cost. OpenSearch/Elasticsearch win on query power. Match the tool to your actual query patterns, not hype.