Alerting Overview
MATIH uses Alertmanager for alert routing, grouping, silencing, and notification delivery. Alerts are generated by Prometheus rules and routed to the appropriate channels based on severity, category, and team ownership. The alerting system integrates with PagerDuty for critical pages and with Slack for warning and info alerts.
Subsections
| Page | Description |
|---|---|
| Alertmanager Setup | Installation, configuration, and routing |
| Alert Rules Library | Complete library of alerting rules |
| Notification Channels | PagerDuty, Slack, email, and webhook integration |
| Incident Response | Alert-driven incident response procedures |
Alert Flow
    Prometheus Rules --> Alertmanager --> Routing --> Notification Channels
                              |
                       Grouping + Dedup
                              |
                          Silencing
                              |
                          Inhibition

Severity Levels
| Severity | Response Time | Channel | Description |
|---|---|---|---|
| critical | Immediate | PagerDuty page | Service down, data loss risk |
| warning | Within 1 hour | Slack channel | Degradation, approaching thresholds |
| info | Next business day | Slack channel | Informational, capacity planning |
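
To make the flow and the severity table concrete, the following is a minimal Python sketch of the same ideas: deduplication, silencing, inhibition, and routing by severity. It is an illustrative model only, not Alertmanager's implementation or MATIH's configuration. The `Alert` class, the `service` label, the `process` function, and the inhibition rule (a critical alert suppresses lower-severity alerts for the same service) are all assumptions made for the example, and the order of the steps is simplified.

```python
from collections import defaultdict
from dataclasses import dataclass

# Severity -> channel, taken from the Severity Levels table.
SEVERITY_CHANNEL = {
    "critical": "pagerduty",
    "warning": "slack",
    "info": "slack",
}


@dataclass(frozen=True)
class Alert:
    name: str
    severity: str
    service: str  # hypothetical label used for grouping and inhibition


def process(alerts, silenced_names=frozenset()):
    """Return {(channel, service): [alerts]} after dedup, silencing,
    inhibition, and grouping."""
    # Dedup: identical alerts collapse to one; dict.fromkeys keeps
    # first-seen order.
    unique = list(dict.fromkeys(alerts))

    # Silencing: drop alerts whose name matches an active silence.
    active = [a for a in unique if a.name not in silenced_names]

    # Inhibition (assumed rule): a critical alert on a service suppresses
    # that service's warning and info alerts.
    critical_services = {a.service for a in active if a.severity == "critical"}
    active = [
        a for a in active
        if a.severity == "critical" or a.service not in critical_services
    ]

    # Routing + grouping: one notification group per (channel, service).
    groups = defaultdict(list)
    for a in active:
        groups[(SEVERITY_CHANNEL[a.severity], a.service)].append(a)
    return dict(groups)
```

With hypothetical alerts such as a duplicated critical `ServiceDown` and a warning `HighLatency` on the same service, `process` would return a single PagerDuty group for that service: the duplicate is collapsed and the warning is inhibited by the critical alert.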