What is an SLA (Service Level Agreement)?
A formal contract between a provider and customer defining expected service levels.
An SLA is a formal agreement between a service provider and a customer that defines the expected level of service, typically availability, response time, and support responsiveness. Breaching an SLA usually has financial consequences, such as service credits or penalties. SLAs are typically set at a looser target than your internal SLO to give you a buffer: you miss the internal target, and can react, before the customer-facing commitment is at risk. For example, you might have an internal SLO of 99.95% availability but an SLA commitment to customers of 99.9%.
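To make the buffer between the two targets concrete, here is a minimal sketch that converts an availability percentage into allowed downtime. The 30-day measurement window and the function name allowed_downtime_minutes are assumptions for illustration; real SLAs define their own measurement period and exclusions.

```python
def allowed_downtime_minutes(availability_pct: float, period_days: int = 30) -> float:
    """Return the downtime (in minutes) permitted by an availability target.

    period_days is an assumed measurement window, not something an SLA fixes by default.
    """
    total_minutes = period_days * 24 * 60
    return total_minutes * (100 - availability_pct) / 100


if __name__ == "__main__":
    sla = allowed_downtime_minutes(99.9)    # customer-facing commitment
    slo = allowed_downtime_minutes(99.95)   # stricter internal target
    print(f"SLA 99.9%:  {sla:.1f} min of downtime per 30 days")
    print(f"SLO 99.95%: {slo:.1f} min of downtime per 30 days")
    print(f"Buffer:     {sla - slo:.1f} min")
```

Under these assumptions the SLA allows about 43.2 minutes of downtime per 30 days and the SLO about 21.6 minutes, so roughly 21.6 minutes separate missing the internal target from breaching the customer commitment.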
More Monitoring Terms
AlertManager
Prometheus component that handles alert routing, grouping, and notification delivery.
Error Budget
The acceptable amount of downtime or errors before an SLO is breached.
Grafana
An open-source analytics and visualization platform for metrics, logs, and traces.
Loki
Grafana's horizontally scalable log aggregation system inspired by Prometheus.
Observability
The ability to understand the internal state of a system from its external outputs.
OpenTelemetry
An open-source observability framework for generating metrics, logs, and traces.