Monitoring Architecture
MATIH uses a Prometheus-based monitoring stack with Grafana for visualization, ServiceMonitor CRDs for automatic target discovery, and custom recording rules for SLO tracking. The monitoring infrastructure lives in the matih-monitoring namespace and collects metrics from all control plane and data plane services.
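As a rough illustration of what a recording rule for SLO tracking can look like, the sketch below defines a `PrometheusRule` resource that precomputes a 5-minute error ratio per job. The resource name, the `http_requests_total` metric, and its `status` label are assumptions made for the example and are not taken from the MATIH rule set. The `release: monitoring` label assumes the chart's default rule selector, which matches on the Helm release name.

```yaml
# Hypothetical recording rule; the metric name and labels are assumptions.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-slo-rules            # hypothetical name
  namespace: matih-monitoring
  labels:
    release: monitoring              # assumed to match the Helm release name
spec:
  groups:
    - name: example-slo.rules
      interval: 1m
      rules:
        # Fraction of requests that returned a 5xx status over the last 5 minutes.
        - record: job:http_requests:error_ratio_rate5m
          expr: |
            sum by (job) (rate(http_requests_total{status=~"5.."}[5m]))
            /
            sum by (job) (rate(http_requests_total[5m]))
```

Precomputing the ratio keeps dashboard and alert queries cheap, because Grafana and alerting rules can read the recorded series instead of re-evaluating the full expression.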
Components
| Component | Port | Description |
|---|---|---|
| Prometheus | 9090 | Metrics collection and storage |
| Grafana | 3000 | Dashboard visualization |
| Alertmanager | 9093 | Alert routing and notification |
| Node Exporter | 9100 | Host-level metrics |
| kube-state-metrics | 8080 | Kubernetes object metrics |
Subsections
| Page | Description |
|---|---|
| Prometheus Setup | Installation, configuration, and service discovery |
| Prometheus Rules | Recording rules and alerting rules |
| Grafana Dashboards | Pre-built dashboards for platform monitoring |
| Grafana Data Sources | Configuring Prometheus, Loki, and Tempo data sources |
| Custom Metrics | Application-level custom metrics instrumentation |
| SLO Monitoring | Service Level Objective tracking and error budgets |
Deployment
The monitoring stack is deployed using the kube-prometheus-stack Helm chart:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install monitoring prometheus-community/kube-prometheus-stack \
  -f prometheus/values.yaml \
  --namespace matih-monitoring \
  --create-namespace
Metric Collection Flow
Services (metrics endpoint) --> ServiceMonitor CRD --> Prometheus --> Grafana
                                                           |
                                                      Alertmanager --> PagerDuty / Slack
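The first hop in this flow can be sketched as a `ServiceMonitor` resource, which tells Prometheus which Services to scrape and on which port. Everything specific here is an assumption for illustration: the service name, its namespace, the `app.kubernetes.io/name` label, and the `metrics` port name are hypothetical. The `release: monitoring` label assumes the chart's default behavior of selecting only ServiceMonitors labeled with the Helm release name, which `prometheus/values.yaml` may override.

```yaml
# Hypothetical ServiceMonitor; names, labels, and port are illustrative only.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-service              # hypothetical name
  namespace: matih-monitoring
  labels:
    release: monitoring              # assumed to match the Helm release name
spec:
  namespaceSelector:
    matchNames:
      - example-namespace            # hypothetical namespace of the target Service
  selector:
    matchLabels:
      app.kubernetes.io/name: example-service
  endpoints:
    - port: metrics                  # name of the Service port, not a port number
      path: /metrics
      interval: 30s
```

Once a resource like this is applied, the Prometheus Operator regenerates the scrape configuration, so new services can be onboarded without editing the Prometheus configuration by hand.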