Monitoring Stack Overview
MATIH deploys an observability stack across dedicated namespaces (matih-monitoring-* and matih-observability), providing metrics collection, log aggregation, distributed tracing, and alerting.
Stack Components
| Component | Purpose | Namespace |
|---|---|---|
| Prometheus | Metrics collection and storage | matih-monitoring-* |
| Grafana | Visualization and dashboards | matih-observability |
| Alertmanager | Alert routing and notification | matih-monitoring-* |
| Loki | Log aggregation and querying | matih-observability |
| Tempo | Distributed trace storage | matih-observability |
| OTEL Collector | Telemetry pipeline | matih-observability |
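The component-to-namespace mapping in the table can be captured as data, which is handy when a script needs to know which components to expect in a given namespace. The following is a minimal sketch, not part of MATIH itself: the helper name `components_in` and the namespace `matih-monitoring-example` are made up for illustration, and the `*` in `matih-monitoring-*` is assumed to behave like a shell-style wildcard.

```python
from fnmatch import fnmatchcase

# Component -> namespace pattern, copied from the table above.
COMPONENT_NAMESPACES = {
    "Prometheus": "matih-monitoring-*",
    "Grafana": "matih-observability",
    "Alertmanager": "matih-monitoring-*",
    "Loki": "matih-observability",
    "Tempo": "matih-observability",
    "OTEL Collector": "matih-observability",
}


def components_in(namespace: str) -> list[str]:
    """Return the components whose namespace pattern matches `namespace`.

    Assumption: the trailing `*` is treated as a shell-style wildcard.
    """
    return sorted(
        name
        for name, pattern in COMPONENT_NAMESPACES.items()
        if fnmatchcase(namespace, pattern)
    )


if __name__ == "__main__":
    # "matih-monitoring-example" is a hypothetical namespace name.
    print(components_in("matih-monitoring-example"))
    print(components_in("matih-observability"))
```

Under these assumptions, the first call lists Alertmanager and Prometheus, and the second lists the four components that share matih-observability.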
Architecture
+------------------+     +------------------+     +------------------+
| Services         |     | Prometheus       |     | Grafana          |
| (metrics, logs,  |---->| (ServiceMonitor) |---->| (Dashboards)     |
| traces)          |     +------------------+     +------------------+
+------------------+              |
         |                        v
         |               +------------------+
         |               | Alertmanager     |
         |               | (Routing)        |
         |               +------------------+
         |
         +------------>  +------------------+
         |               | Loki             |
         |               | (Logs via        |
         |               | Fluent-bit)      |
         |               +------------------+
         |
         +------------>  +------------------+
                         | Tempo            |
                         | (Traces via      |
                         | OTEL Collector)  |
                         +------------------+
Section Contents
| Page | Description |
|---|---|
| Prometheus | Metrics scraping, recording rules, storage |
| Grafana | Dashboards, data sources, provisioning |
| Alertmanager | Alert routing, receivers, inhibition |
| Loki | Log aggregation and LogQL querying |
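For quick reference, the three signal paths drawn in the Architecture diagram (metrics, logs, traces) can be written down as a small lookup table. This is only an illustrative sketch: `SignalPath`, `SIGNAL_PATHS`, and `backend_for` are hypothetical names, and the `via` field simply repeats the label shown in the diagram without implying how each mechanism is configured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignalPath:
    signal: str   # kind of telemetry emitted by a service
    via: str      # mechanism named in the diagram
    backend: str  # component that stores the signal


# One entry per arrow leaving the "Services" box in the diagram.
SIGNAL_PATHS = [
    SignalPath("metrics", "ServiceMonitor", "Prometheus"),
    SignalPath("logs", "Fluent-bit", "Loki"),
    SignalPath("traces", "OTEL Collector", "Tempo"),
]


def backend_for(signal: str) -> str:
    """Return the storage backend for a signal type, per the diagram."""
    for path in SIGNAL_PATHS:
        if path.signal == signal:
            return path.backend
    raise KeyError(f"unknown signal type: {signal}")


if __name__ == "__main__":
    for path in SIGNAL_PATHS:
        print(f"{path.signal}: {path.via} -> {path.backend}")
```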