MATIH Platform is in active MVP development. Documentation reflects current implementation status.

Alerting Overview

MATIH uses Alertmanager for alert routing, grouping, silencing, and notification delivery. Prometheus rules generate the alerts, and Alertmanager routes them to the appropriate channels based on severity, category, and team ownership. Critical alerts page through PagerDuty; warning and info alerts go to Slack.


Subsections

Page                   Description
---------------------  ------------------------------------------------
Alertmanager Setup     Installation, configuration, and routing
Alert Rules Library    Complete library of alerting rules
Notification Channels  PagerDuty, Slack, email, and webhook integration
Incident Response      Alert-driven incident response procedures

Alert Flow

Prometheus Rules --> Alertmanager --> Routing --> Notification Channels
                        |
                    Grouping + Dedup
                        |
                    Silencing
                        |
                    Inhibition
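
The stages in the flow can be illustrated with a small toy model. This is only a sketch of the concepts, not Alertmanager's implementation: the function names, the label names (`alertname`, `service`, `severity`), the single inhibition rule, and the order in which the stages run are all assumptions made for illustration.

```python
from collections import defaultdict


def matches(labels, matchers):
    """Return True if every matcher key/value pair appears in the alert labels."""
    return all(labels.get(key) == value for key, value in matchers.items())


def process(alerts, silences, group_by=("alertname", "service")):
    """Toy pipeline: dedup, silence, inhibit, then group the surviving alerts."""
    # Dedup: alerts with identical label sets collapse into one entry.
    unique = list({frozenset(a.items()): a for a in alerts}.values())

    # Silencing: drop any alert matched by at least one silence.
    active = [a for a in unique if not any(matches(a, s) for s in silences)]

    # Inhibition (hypothetical rule): a critical alert suppresses warning
    # alerts that carry the same `service` label.
    critical_services = {
        a.get("service") for a in unique if a.get("severity") == "critical"
    }
    active = [
        a for a in active
        if not (a.get("severity") == "warning"
                and a.get("service") in critical_services)
    ]

    # Grouping: one notification group per distinct group_by label tuple.
    groups = defaultdict(list)
    for alert in active:
        groups[tuple(alert.get(label) for label in group_by)].append(alert)
    return dict(groups)


alerts = [
    {"alertname": "ServiceDown", "service": "api", "severity": "critical"},
    {"alertname": "ServiceDown", "service": "api", "severity": "critical"},  # duplicate
    {"alertname": "HighLatency", "service": "api", "severity": "warning"},   # inhibited
    {"alertname": "DiskFilling", "service": "db", "severity": "warning"},    # silenced
    {"alertname": "HighLatency", "service": "web", "severity": "warning"},
]
silences = [{"alertname": "DiskFilling"}]

groups = process(alerts, silences)
print(sorted(groups))  # [('HighLatency', 'web'), ('ServiceDown', 'api')]
```

Of the five incoming alerts, only two notification groups remain: the duplicate page is collapsed, the silenced disk alert is dropped, and the latency warning for `api` is suppressed because that service already has a critical alert.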

Severity Levels

Severity  Response Time      Channel         Description
--------  -----------------  --------------  -----------------------------------
critical  Immediate          PagerDuty page  Service down, data loss risk
warning   Within 1 hour      Slack channel   Degradation, approaching thresholds
info      Next business day  Slack channel   Informational, capacity planning
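
The severity levels above can be expressed as a simple lookup keyed on an alert's `severity` label. This is a minimal sketch, not MATIH's actual configuration: the `Policy` class, the `route` function, the channel identifiers, and the fallback for unknown severities are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    response_time: str
    channel: str


# Mirrors the severity levels listed above; channel identifiers are illustrative.
POLICIES = {
    "critical": Policy("immediate", "pagerduty"),
    "warning": Policy("within 1 hour", "slack"),
    "info": Policy("next business day", "slack"),
}


def route(labels):
    """Return the policy for an alert based on its `severity` label."""
    # Assumption: a missing or unknown severity is treated as a warning
    # so the alert is not silently dropped.
    return POLICIES.get(labels.get("severity"), POLICIES["warning"])


print(route({"alertname": "ServiceDown", "severity": "critical"}).channel)  # pagerduty
print(route({"alertname": "Unlabeled"}).channel)  # slack
```

Treating unlabeled alerts as warnings is a design choice in this sketch; a stricter setup might reject alerts without a severity label instead.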