SLOs
The SLOController manages Service Level Objectives that define reliability targets for platform services. SLOs track error budgets, availability targets, and latency targets, providing a structured approach to reliability engineering.
ServiceLevelObjective Structure
| Field | Type | Description |
|---|---|---|
| id | String | SLO identifier |
| name | String | SLO name (e.g., "AI Service Availability") |
| description | String | Description of the objective |
| service | String | Target service name |
| sliType | String | availability, latency, error_rate, throughput |
| target | double | Target value (e.g., 99.9 for 99.9% availability) |
| window | String | Evaluation window (7d, 28d, 30d) |
| query | String | PromQL query for the SLI measurement |
| alertOnBudgetBurn | boolean | Alert when error budget burn rate is high |
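For client code, the fields above can be mirrored in a small typed structure. The sketch below is a hypothetical client-side model, not the service's own class. The allowed `sliType` and `window` values come from the table; treating them as exhaustive, and assuming the server assigns `id`, are assumptions.

```python
from dataclasses import dataclass, asdict
from typing import Optional

# Values listed in the table above. Whether the service accepts
# anything else is not documented, so this check is an assumption.
SLI_TYPES = {"availability", "latency", "error_rate", "throughput"}
WINDOWS = {"7d", "28d", "30d"}


@dataclass
class ServiceLevelObjective:
    name: str
    service: str
    sliType: str
    target: float
    window: str
    id: Optional[str] = None       # assumed to be assigned by the server
    description: str = ""
    query: str = ""                # PromQL query for the SLI measurement
    alertOnBudgetBurn: bool = False

    def __post_init__(self) -> None:
        if self.sliType not in SLI_TYPES:
            raise ValueError(f"unknown sliType: {self.sliType}")
        if self.window not in WINDOWS:
            raise ValueError(f"unsupported window: {self.window}")

    def to_payload(self) -> dict:
        """Return a JSON-ready dict, omitting unset optional fields."""
        return {k: v for k, v in asdict(self).items() if v not in (None, "")}
```

Calling `to_payload()` on an instance built from the create example's values yields a dict suitable for `json.dumps`.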
SLO Management
Create SLO
Endpoint: POST /api/v1/observability/slos
curl -X POST http://localhost:8088/api/v1/observability/slos \
-H "Content-Type: application/json" \
-H "Authorization: Bearer ${TOKEN}" \
-H "X-Tenant-ID: 550e8400" \
-d '{
"name": "AI Service Availability",
"service": "ai-service",
"sliType": "availability",
"target": 99.9,
"window": "30d",
"alertOnBudgetBurn": true
}'
List SLOs
Endpoint: GET /api/v1/observability/slos
Get SLO
Endpoint: GET /api/v1/observability/slos/:sloId
Update SLO
Endpoint: PUT /api/v1/observability/slos/:sloId
Delete SLO
Endpoint: DELETE /api/v1/observability/slos/:sloId
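The list, get, update, and delete endpoints can be called the same way as the create example. The sketch below builds (but does not send) the requests with Python's standard library. The helper name `slo_request` and the identifier `slo-123` are placeholders, and it is an assumption that these endpoints expect the same Authorization and X-Tenant-ID headers as the create call and that PUT takes a body shaped like the create payload.

```python
import json
import urllib.parse
import urllib.request

# Host and port taken from the curl example above.
BASE_URL = "http://localhost:8088/api/v1/observability/slos"


def slo_request(method, token, tenant_id, slo_id=None, body=None):
    """Build a urllib Request for one of the SLO endpoints without sending it."""
    url = BASE_URL
    if slo_id is not None:
        # Percent-encode the id so it is safe as a single path segment.
        url = f"{BASE_URL}/{urllib.parse.quote(slo_id, safe='')}"
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {
        "Authorization": f"Bearer {token}",
        "X-Tenant-ID": tenant_id,
    }
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


# "slo-123" is a placeholder id; "TOKEN" stands in for a real bearer token.
list_req = slo_request("GET", "TOKEN", "550e8400")
get_req = slo_request("GET", "TOKEN", "550e8400", slo_id="slo-123")
update_req = slo_request(
    "PUT", "TOKEN", "550e8400", slo_id="slo-123",
    body={
        "name": "AI Service Availability",
        "service": "ai-service",
        "sliType": "availability",
        "target": 99.95,
        "window": "30d",
        "alertOnBudgetBurn": True,
    },
)
delete_req = slo_request("DELETE", "TOKEN", "550e8400", slo_id="slo-123")
```

Each request can then be sent with `urllib.request.urlopen(req)`; the response format for these endpoints is not specified here.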
SLO Status
Endpoint: GET /api/v1/observability/slos/:sloId/status
Returns the current status of an SLO, with the fields listed below.
SLOStatus Structure
| Field | Type | Description |
|---|---|---|
| sloId | String | SLO identifier |
| currentValue | double | Current SLI value |
| target | double | SLO target |
| errorBudgetTotal | double | Total error budget for the window |
| errorBudgetRemaining | double | Remaining error budget |
| errorBudgetConsumed | double | Percentage of budget consumed |
| burnRate | double | Current error budget burn rate |
| status | String | healthy, warning, breaching |
| timeWindowStart | Instant | Start of evaluation window |
| timeWindowEnd | Instant | End of evaluation window |
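A client might decode a status response into a typed object along these lines. This is a sketch under assumptions: the JSON keys are taken to match the field names in the table, the Instant fields are assumed to be serialized as ISO-8601 strings (a server could also emit epoch numbers), and every value in the sample is made up for illustration.

```python
import json
from dataclasses import dataclass
from datetime import datetime


def parse_instant(value: str) -> datetime:
    """Parse an ISO-8601 UTC timestamp such as 2024-01-01T00:00:00Z."""
    # Older Python versions reject a trailing "Z", so normalise it first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


@dataclass
class SLOStatus:
    sloId: str
    currentValue: float
    target: float
    errorBudgetTotal: float
    errorBudgetRemaining: float
    errorBudgetConsumed: float
    burnRate: float
    status: str
    timeWindowStart: datetime
    timeWindowEnd: datetime

    @classmethod
    def from_json(cls, text: str) -> "SLOStatus":
        raw = json.loads(text)
        raw["timeWindowStart"] = parse_instant(raw["timeWindowStart"])
        raw["timeWindowEnd"] = parse_instant(raw["timeWindowEnd"])
        return cls(**raw)


# Hypothetical response body; none of these values come from a real service.
sample = """{
  "sloId": "slo-123",
  "currentValue": 99.95,
  "target": 99.9,
  "errorBudgetTotal": 0.1,
  "errorBudgetRemaining": 0.05,
  "errorBudgetConsumed": 50.0,
  "burnRate": 0.5,
  "status": "healthy",
  "timeWindowStart": "2024-01-01T00:00:00Z",
  "timeWindowEnd": "2024-01-31T00:00:00Z"
}"""

status = SLOStatus.from_json(sample)
window_days = (status.timeWindowEnd - status.timeWindowStart).days
```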
Error Budget
The error budget represents the acceptable amount of unreliability within the SLO window:
Error Budget = 100% - SLO Target
Example: 99.9% target = 0.1% error budget = ~43 minutes per 30 days
Budget Burn Rate Alerts
When alertOnBudgetBurn is enabled, alerts fire based on how fast the error budget is being consumed:
| Burn Rate | Window | Severity | Description (30-day SLO window) |
|---|---|---|---|
| 14.4x | 1 hour | Critical | Budget exhausted in about 2 days |
| 6x | 6 hours | Critical | Budget exhausted in 5 days |
| 3x | 1 day | Warning | Budget exhausted in 10 days |
| 1x | 3 days | Info | Budget exhausted exactly at the end of the window |
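The arithmetic behind error budgets and burn rates can be sketched as follows. A burn rate of 1x spends the budget over exactly one SLO window, so at a constant burn rate the time to exhaustion is the window length divided by the burn rate. The function names are illustrative, and the service may compute these values differently.

```python
def error_budget_fraction(target_percent: float) -> float:
    """Error budget as a fraction of the window, e.g. 99.9 -> 0.001."""
    return (100.0 - target_percent) / 100.0


def error_budget_minutes(target_percent: float, window_days: float) -> float:
    """Minutes of full unavailability the budget allows within the window."""
    return error_budget_fraction(target_percent) * window_days * 24 * 60


def days_to_exhaustion(burn_rate: float, window_days: float) -> float:
    """Days until a full budget is spent if the burn rate stays constant."""
    if burn_rate <= 0:
        return float("inf")
    return window_days / burn_rate


def budget_consumed_percent(burn_rate: float, alert_window_hours: float,
                            window_days: float) -> float:
    """Share of the total budget spent if burn_rate holds for the alert window."""
    return burn_rate * alert_window_hours / (window_days * 24) * 100


if __name__ == "__main__":
    print(f"budget: {error_budget_minutes(99.9, 30):.1f} minutes per 30 days")
    # (burn rate, alert window in hours) pairs from the table above.
    for rate, hours in [(14.4, 1), (6, 6), (3, 24), (1, 72)]:
        print(
            f"{rate}x over {hours}h: "
            f"{budget_consumed_percent(rate, hours, 30):.0f}% of budget used, "
            f"exhausted in {days_to_exhaustion(rate, 30):.1f} days"
        )
```

For a 30-day window, the four burn rates in the table correspond to roughly 2.1, 5, 10, and 30 days until the budget runs out.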
SLO Dashboard
Endpoint: GET /api/v1/observability/slos/dashboard
Returns all SLOs with their current status for a consolidated SLO dashboard view.
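A consumer of the dashboard endpoint might summarize the result as in the sketch below. The response shape is not documented here, so the code assumes a list of objects carrying at least the `sloId`, `status`, and `burnRate` fields from the SLOStatus structure; the sample entries are invented.

```python
from collections import Counter

# Sort order for the three documented status values, most urgent first.
SEVERITY_ORDER = {"breaching": 0, "warning": 1, "healthy": 2}


def summarize_dashboard(entries):
    """Count SLOs per status and rank ids by urgency, then by burn rate."""
    counts = Counter(e["status"] for e in entries)
    ranked = sorted(
        entries,
        key=lambda e: (SEVERITY_ORDER.get(e["status"], 3), -e["burnRate"]),
    )
    return dict(counts), [e["sloId"] for e in ranked]


# Hypothetical dashboard entries, for illustration only.
sample = [
    {"sloId": "slo-a", "status": "healthy", "burnRate": 0.4},
    {"sloId": "slo-b", "status": "breaching", "burnRate": 7.2},
    {"sloId": "slo-c", "status": "warning", "burnRate": 3.1},
    {"sloId": "slo-d", "status": "healthy", "burnRate": 0.9},
]

counts, order = summarize_dashboard(sample)
```

Ranking by status first and burn rate second puts the SLOs that need attention at the top of a consolidated view.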