SLI / SLO / SLA

Key idea:

SLI: a measured metric (e.g. "p99 response time"). SLO: a target for the SLI (e.g. "p99 < 200 ms"). SLA: a contractual commitment to customers (e.g. "99.9% uptime, otherwise a refund"). Google's SRE book popularized this hierarchy. A typical SLO of 99.9% over a 30-day month leaves an error budget of 0.1%, which is about 43 minutes of downtime. When the budget is spent, pause feature work and focus on reliability.

Below: details, an example, and FAQ.

Details

  • SLI: a quantitative metric (uptime %, p99 latency, error rate)
  • SLO: the target value for an SLI; an internal goal
  • SLA: a customer-facing contract, usually with a penalty on breach
  • Error budget: 100% - SLO (e.g. a 99.9% SLO leaves a 0.1% budget, about 43 minutes in a 30-day month)
  • Multi-dimensional: set separate SLOs for availability, latency, and error rate
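
The error-budget arithmetic above can be checked with a short sketch. The function name `error_budget_minutes` and the 30-day default window are assumptions made for illustration; the 99.9% figure comes from the text.

```python
def error_budget_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Allowed downtime, in minutes, for an availability SLO over a window.

    Hypothetical helper: assumes the budget is spent only as downtime
    and that the window is a fixed number of days.
    """
    if not 0 < slo_percent <= 100:
        raise ValueError("slo_percent must be in (0, 100]")
    window_minutes = window_days * 24 * 60
    # Error budget = 100% - SLO, applied to the whole window.
    return window_minutes * (100 - slo_percent) / 100


print(round(error_budget_minutes(99.9), 1))  # 43.2
```

A stricter target shrinks the budget quickly: under the same assumptions, 99.99% leaves roughly 4.3 minutes per 30 days.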

Example

SLI: % of requests with status 2xx/3xx
SLO: 99.9% of requests succeed (monthly)
SLA: 10% refund if the SLI falls below 99.9% in a month
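
A minimal sketch of how this example could be evaluated for one month of traffic. The names `availability_sli` and `evaluate_month` are hypothetical, and the input is assumed to be a plain list of HTTP status codes; the 99.9% targets and the 10% refund are taken from the example.

```python
from typing import Optional, Sequence

SLO_TARGET = 99.9          # internal goal, from the example
SLA_THRESHOLD = 99.9       # contractual threshold, from the example
SLA_REFUND_PERCENT = 10    # refund on breach, from the example


def availability_sli(status_codes: Sequence[int]) -> Optional[float]:
    """Percent of requests that returned a 2xx or 3xx status."""
    if not status_codes:
        return None  # no traffic, so the SLI is undefined
    good = sum(1 for code in status_codes if 200 <= code < 400)
    return 100 * good / len(status_codes)


def evaluate_month(status_codes: Sequence[int]) -> dict:
    """Compare one month's SLI with the SLO and the SLA."""
    sli = availability_sli(status_codes)
    if sli is None:
        raise ValueError("no requests recorded for this month")
    return {
        "sli": sli,
        "slo_met": sli >= SLO_TARGET,
        "refund_percent": SLA_REFUND_PERCENT if sli < SLA_THRESHOLD else 0,
    }


# 20 failures in 10,000 requests: the SLO is missed and the refund applies.
print(evaluate_month([200] * 9980 + [503] * 20))
```

In practice the SLA threshold is often set looser than the internal SLO so that the team gets a warning before a contractual breach; the sketch keeps both at 99.9% only because the example does.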

Frequently Asked Questions

How do I measure an uptime SLI?

Run synthetic probes (Enterno monitors) every minute and evaluate the results over a 30-day window. A probe counts as a success when it returns HTTP 2xx/3xx and the response time is below a chosen threshold.
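
The success rule above can be sketched as a pure function over stored probe results. This does not call any monitoring service: `ProbeResult`, `uptime_sli`, and the 2000 ms default threshold are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, Optional


@dataclass
class ProbeResult:
    """One synthetic probe (hypothetical record shape)."""
    at: datetime             # when the probe ran
    status: Optional[int]    # HTTP status, or None on timeout/connection error
    latency_ms: float        # measured response time


def uptime_sli(
    results: Iterable[ProbeResult],
    now: datetime,
    window_days: int = 30,
    latency_threshold_ms: float = 2000.0,  # assumed threshold
) -> Optional[float]:
    """Percent of probes in the window that were successful.

    Success = HTTP 2xx/3xx and response time below the threshold.
    """
    cutoff = now - timedelta(days=window_days)
    in_window = [r for r in results if cutoff < r.at <= now]
    if not in_window:
        return None  # no probes in the window, so the SLI is undefined
    ok = sum(
        1
        for r in in_window
        if r.status is not None
        and 200 <= r.status < 400
        and r.latency_ms < latency_threshold_ms
    )
    return 100 * ok / len(in_window)


now = datetime(2024, 6, 30, tzinfo=timezone.utc)
probes = [
    ProbeResult(now - timedelta(minutes=1), 200, 120.0),   # success
    ProbeResult(now - timedelta(minutes=2), 500, 80.0),    # bad status
    ProbeResult(now - timedelta(minutes=3), 200, 5000.0),  # too slow
    ProbeResult(now - timedelta(days=45), 500, 80.0),      # outside window
]
print(uptime_sli(probes, now))
```

With one probe per minute, a 30-day window holds 43,200 samples, so each failed probe costs about 0.0023% of availability.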

Do I need an SLA for a small team?

Always set an internal SLO. Add an SLA only if a customer requires one (enterprise, compliance).

What should I do when the error budget is exhausted?

Freeze feature work, run a postmortem, and focus on reliability until the budget recovers. This is the main value of the error-budget approach.
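
One way to turn this policy into an automated check is sketched below, using a request-based budget. The function `budget_status` and its return fields are hypothetical, and it assumes the budget is counted in failed requests over the current SLO window.

```python
def budget_status(slo_percent: float, total_requests: int, failed_requests: int) -> dict:
    """Report error-budget usage and whether a feature freeze applies.

    Hypothetical helper: the budget is the number of requests allowed
    to fail in the current window under the given SLO.
    """
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    allowed_failures = total_requests * (100 - slo_percent) / 100
    remaining = allowed_failures - failed_requests
    return {
        "allowed_failures": allowed_failures,
        "remaining": remaining,
        # Freeze feature work once nothing is left of the budget.
        "freeze": remaining <= 0,
    }


# About 1,000 failures are allowed at 99.9% over 1,000,000 requests.
print(budget_status(99.9, 1_000_000, 400)["freeze"])    # False
print(budget_status(99.9, 1_000_000, 1_500)["freeze"])  # True
```

Teams often add an earlier warning level, for example when most of the budget is consumed, so that reliability work starts before a full freeze is needed.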