cyberneticlibrary

Design SLOs and error budgets

slo-architect plugin setup L217,464
alirezarezvani/claude-skills
What it does

Designs SLOs with error budgets and burn-rate alerts, following Google SRE practice

Best for

Structuring service reliability goals with defensible error budgets instead of ad-hoc alerting or targets disconnected from business risk

Inputs
  • service
  • SLI definition
  • error budget window
Outputs
  • SLO YAML
  • error budget policy
  • PromQL alert thresholds
  • SLO review audit
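
To make the "PromQL alert thresholds" output concrete, here is a minimal Python sketch of how multi-window burn-rate thresholds can be derived. It is not the plugin's own calculator. The window pairs and budget fractions are the ones recommended in the Google SRE Workbook for a 30-day SLO. The metric prefix `job:slo_errors_per_request:ratio_rate` is a hypothetical recording-rule name and is assumed to exist for each window; `BurnRateWindow`, `burn_rate`, and `alert_expr` are illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BurnRateWindow:
    long_window: str        # PromQL-style duration of the long window
    long_hours: float       # the same long window expressed in hours
    short_window: str       # short window confirming the burn is still ongoing
    budget_fraction: float  # share of the budget spent if the burn lasts the long window
    severity: str


# Window pairs from the Google SRE Workbook (30-day SLO window).
WORKBOOK_WINDOWS = [
    BurnRateWindow("1h", 1, "5m", 0.02, "page"),
    BurnRateWindow("6h", 6, "30m", 0.05, "page"),
    BurnRateWindow("3d", 72, "6h", 0.10, "ticket"),
]


def burn_rate(budget_fraction: float, long_hours: float, slo_window_hours: float) -> float:
    """Burn rate that consumes `budget_fraction` of the budget within `long_hours`."""
    return budget_fraction * slo_window_hours / long_hours


def alert_expr(
    w: BurnRateWindow,
    slo_target: float,
    slo_window_days: int = 30,
    metric: str = "job:slo_errors_per_request:ratio_rate",  # hypothetical recording rule
) -> str:
    """Build a PromQL-shaped condition: both windows must exceed the threshold."""
    rate = burn_rate(w.budget_fraction, w.long_hours, slo_window_days * 24)
    threshold = rate * (1 - slo_target)  # error ratio at which the alert fires
    return (
        f"{metric}{w.long_window} > {threshold:.6g} "
        f"and {metric}{w.short_window} > {threshold:.6g}"
    )


if __name__ == "__main__":
    for w in WORKBOOK_WINDOWS:
        print(w.severity, alert_expr(w, slo_target=0.999))
```

Requiring both the long and the short window to exceed the threshold is what keeps an alert from firing long after the burn has stopped.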
Requires
  • SLO designer
  • error-budget calculator (PromQL-shaped)
  • SLO reviewer (catches 7 common bugs)
  • references on SLI design + error budget math + composition
Preconditions

The service has a measurable SLI (latency, availability, etc.), and a monitoring/alerting system (Prometheus, Datadog, New Relic) is running

Failure modes

An SLO target set too high leaves an error budget too small to be meaningful; a window that is too short causes alert noise; a missing SLI definition makes the SLO circular; the reviewer misses composition hazards with feature flags
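
The "target too high" failure mode is easiest to see in the arithmetic. The sketch below, with the illustrative helper name `error_budget_minutes`, converts an availability target and window into allowed minutes of full unavailability. It assumes a time-based availability SLI; a request-based SLI would count failed requests instead.

```python
def error_budget_minutes(slo_target: float, window_days: int) -> float:
    """Minutes of total unavailability permitted by `slo_target` over the window."""
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be strictly between 0 and 1")
    return (1 - slo_target) * window_days * 24 * 60


if __name__ == "__main__":
    for target in (0.99, 0.999, 0.9999, 0.99999):
        minutes = error_budget_minutes(target, 30)
        print(f"{target:.3%} over 30 days -> {minutes:.2f} min")
```

A 99.9% target over 30 days leaves 43.2 minutes of budget. A 99.999% target leaves about 26 seconds, which a single slow deploy can exhaust, so the budget stops being a useful decision tool.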

Trust signals
  • Per Google SRE Workbook discipline
  • Error-budget calculator with multi-window burn-rate thresholds
  • 7-bug review checklist (target too high, window too short, no SLI, CPU-as-SLI, etc.)
  • 4 references on principles + SLI design + error budget math + composition
  • SLO YAML + policy templates