SLIs, SLOs, SLAs & Error Budget Calculations
Welcome to the foundational lesson on Site Reliability Engineering (SRE) metrics.
Service Level Indicators (SLIs)
An SLI is a carefully defined quantitative measure of some aspect of the level of service provided. Examples include request latency, error rate, and system throughput.
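As an illustration, an SLI is often computed as the ratio of good events to total events. The sketch below assumes that convention. The function names and the 300 ms latency threshold are hypothetical, chosen only for this example.

```python
from typing import Sequence


def availability_sli(good_events: int, total_events: int) -> float:
    """Return the fraction of events that were good, between 0.0 and 1.0."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    if not 0 <= good_events <= total_events:
        raise ValueError("good_events must be between 0 and total_events")
    return good_events / total_events


def latency_sli(latencies_ms: Sequence[float], threshold_ms: float) -> float:
    """Return the fraction of requests served within threshold_ms."""
    if not latencies_ms:
        raise ValueError("latencies_ms must not be empty")
    fast = sum(1 for latency in latencies_ms if latency <= threshold_ms)
    return fast / len(latencies_ms)


# 999 successful requests out of 1000 gives an availability SLI of 99.9%.
print(f"{availability_sli(999, 1000):.1%}")

# Three of these four requests finished within the assumed 300 ms threshold.
print(f"{latency_sli([120, 80, 450, 95], 300):.0%}")
```

Expressing both indicators as a fraction of good events keeps them on the same 0 to 1 scale, which makes them easy to compare against a percentage target.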
Service Level Objectives (SLOs)
An SLO is a target value or range of values for a service level, as measured by an SLI. A well-chosen SLO marks the threshold below which users start to become unhappy with the service.
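A minimal sketch of checking a measured SLI against an SLO follows. It assumes the SLO is a simple lower bound on a ratio-style SLI. The `SLO` class, its fields, and the 30-day window are hypothetical names and values used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SLO:
    """A lower-bound target for an SLI expressed as a fraction of good events."""

    name: str
    target: float          # e.g. 0.999 means 99.9% of events must be good
    window_days: int = 30  # assumed measurement window

    def is_met(self, sli: float) -> bool:
        """Return True when the measured SLI is at or above the target."""
        return sli >= self.target


availability_slo = SLO(name="availability", target=0.999)

print(availability_slo.is_met(0.9995))  # True: measured SLI is above target
print(availability_slo.is_met(0.9982))  # False: measured SLI is below target
```

An SLO defined as a range would need an upper bound as well. This sketch leaves that out to keep the comparison obvious.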
Service Level Agreements (SLAs)
An SLA is an explicit or implicit contract with your users that includes consequences of meeting (or missing) the SLOs it contains.
Error Budgets
An error budget is the difference between 100% reliability and your SLO. If your SLO is 99.9%, you have a 0.1% error budget. When the error budget is depleted, teams typically trigger a feature freeze or reliability-focused sprints until the service is back within its SLO.
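The arithmetic can be sketched in a few lines. This example assumes a 30-day window and a request-based budget. The function names and the sample traffic figures are hypothetical and serve only to show the calculation.

```python
def error_budget(slo: float) -> float:
    """Return the error budget as a fraction, i.e. 1 minus the SLO."""
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be strictly between 0 and 1")
    return 1.0 - slo


def allowed_downtime_minutes(slo: float, window_days: int = 30) -> float:
    """Return how many minutes of full downtime the budget permits."""
    return error_budget(slo) * window_days * 24 * 60


def budget_remaining(slo: float, total_events: int, failed_events: int) -> float:
    """Return the fraction of the budget left; negative means overspent."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    allowed_failures = error_budget(slo) * total_events
    return 1.0 - failed_events / allowed_failures


# A 99.9% SLO over 30 days permits roughly 43.2 minutes of downtime.
print(f"{allowed_downtime_minutes(0.999):.1f} minutes")

# With 1,000,000 requests, about 1,000 may fail. After 250 failures,
# roughly three quarters of the budget is still available.
print(f"{budget_remaining(0.999, 1_000_000, 250):.0%} of budget left")
```

A negative result from `budget_remaining` indicates the budget is overspent, which is the condition that would prompt a freeze under the policy described above.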