Grafana
Grafana visualizes all MATIH metrics, logs, and traces. Its data sources and a set of pre-built dashboards are provisioned with the deployment.
Data Sources
| Source | Type | Purpose |
|---|---|---|
| Prometheus (CP) | Prometheus | Control plane metrics |
| Prometheus (DP) | Prometheus | Data plane metrics |
| Loki | Loki | Log querying |
| Tempo | Tempo | Trace exploration |
| ClickHouse | ClickHouse | Business analytics |
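As an illustration of how sources like these are typically provisioned, the sketch below uses Grafana's data source provisioning file format. The service URLs, namespace, and the choice of default source are assumptions made for the example, not values taken from the MATIH deployment. The ClickHouse entry assumes the official `grafana-clickhouse-datasource` plugin and leaves out connection settings, since those depend on the plugin version.

```yaml
# Hypothetical provisioning file, e.g. provisioning/datasources/matih.yaml
apiVersion: 1

datasources:
  - name: Prometheus (CP)          # control plane metrics
    type: prometheus
    access: proxy
    url: http://prometheus-cp.monitoring.svc:9090   # assumed URL
    isDefault: true                                  # assumed default
  - name: Prometheus (DP)          # data plane metrics
    type: prometheus
    access: proxy
    url: http://prometheus-dp.monitoring.svc:9090   # assumed URL
  - name: Loki                     # log querying
    type: loki
    access: proxy
    url: http://loki.monitoring.svc:3100            # assumed URL
  - name: Tempo                    # trace exploration
    type: tempo
    access: proxy
    url: http://tempo.monitoring.svc:3200           # assumed URL
  - name: ClickHouse               # business analytics
    type: grafana-clickhouse-datasource
    # Connection settings (jsonData / secureJsonData) omitted:
    # field names vary between plugin versions.
```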
Pre-Built Dashboards
| Dashboard | Data Source | Key Panels |
|---|---|---|
| Platform Overview | Prometheus | Service health, request rate, error rate |
| AI Service | Prometheus | Inference latency, LLM costs, token usage |
| Trino | Prometheus | Query throughput, memory usage, queue depth |
| Kafka | Prometheus | Broker metrics, consumer lag, topic throughput |
| PostgreSQL | Prometheus | Connection pool, query duration, replication lag |
| Node Resources | Prometheus | CPU, memory, disk per node pool |
| Cost Attribution | ClickHouse | Per-tenant cost breakdown |
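Pre-built dashboards are usually loaded from JSON files through a dashboard provider. The following is a minimal sketch in Grafana's dashboard provisioning format. The provider name, folder, and file path are hypothetical and would need to match wherever the MATIH dashboard JSON files are actually mounted.

```yaml
# Hypothetical provisioning file, e.g. provisioning/dashboards/matih.yaml
apiVersion: 1

providers:
  - name: matih-dashboards         # assumed provider name
    orgId: 1
    folder: MATIH                  # assumed Grafana folder
    type: file
    disableDeletion: true          # keep dashboards even if a file disappears
    allowUiUpdates: false          # treat the JSON files as the source of truth
    options:
      path: /var/lib/grafana/dashboards/matih   # assumed mount path
```

Setting `allowUiUpdates: false` means edits made in the UI cannot be saved back to the provisioned dashboards, which keeps them in step with the files under version control.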
Authentication
Users sign in to Grafana through the MATIH IAM service using OAuth2/OIDC. The admin password is stored in a Kubernetes secret:
# Secret is populated by the External Secrets Operator
grafanaAdminPassword: "grafana-admin-password" # Key Vault reference, not the password itself
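For context, an External Secrets Operator resource that syncs such a Key Vault entry into a Kubernetes secret might look like the sketch below. Only the remote key `grafana-admin-password` comes from the configuration above. The resource names, namespace, store name, refresh interval, and the `external-secrets.io/v1beta1` API version are assumptions and may differ in the actual deployment.

```yaml
apiVersion: external-secrets.io/v1beta1   # assumed API version
kind: ExternalSecret
metadata:
  name: grafana-admin              # hypothetical name
  namespace: monitoring            # hypothetical namespace
spec:
  refreshInterval: 1h              # assumed sync interval
  secretStoreRef:
    name: key-vault-store          # hypothetical Key Vault-backed store
    kind: ClusterSecretStore
  target:
    name: grafana-admin-credentials   # hypothetical Kubernetes secret name
  data:
    - secretKey: admin-password       # key inside the Kubernetes secret
      remoteRef:
        key: grafana-admin-password   # Key Vault entry referenced above
```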