MATIH Platform is in active MVP development. Documentation reflects current implementation status.
Loki

Loki provides log aggregation for all MATIH services. A log shipper (Fluent-bit or Promtail) collects the services' structured JSON logs and forwards them to Loki, where they can be queried with LogQL.


Architecture

+------------------+     +------------------+     +------------------+
| Service Pods     |     | Fluent-bit       |     | Loki             |
| (JSON logs)      |---->| (DaemonSet)      |---->| (Log storage)    |
+------------------+     +------------------+     +------------------+
                                                         |
                                                         v
                                                  +------------------+
                                                  | Grafana          |
                                                  | (LogQL queries)  |
                                                  +------------------+
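
To make the Fluent-bit to Loki hop in the diagram more concrete, here is a minimal Python sketch of the request body that Loki's push endpoint (`/loki/api/v1/push`) accepts: a list of streams, each with a label set and `[timestamp_ns, line]` pairs. The function name `to_loki_push_payload` is hypothetical, and mapping the `service` field to an `app` label is an assumption made to match the labels used in the LogQL queries on this page; the actual Fluent-bit output configuration may differ.

```python
import json
from datetime import datetime, timezone


def to_loki_push_payload(record: dict, namespace: str) -> dict:
    """Wrap one JSON log record in the body shape of Loki's push endpoint."""
    # Parse the ISO-8601 UTC timestamp and convert it to Unix nanoseconds.
    ts = datetime.strptime(record["timestamp"], "%Y-%m-%dT%H:%M:%SZ")
    ts = ts.replace(tzinfo=timezone.utc)
    ts_ns = str(int(ts.timestamp()) * 1_000_000_000)

    # Assumption: only low-cardinality fields become stream labels;
    # the full record is kept as the log line so `| json` can parse it later.
    labels = {"namespace": namespace, "app": record["service"]}
    return {"streams": [{"stream": labels, "values": [[ts_ns, json.dumps(record)]]}]}
```

High-cardinality fields such as `trace_id` or `user_id` stay inside the log line in this sketch rather than becoming labels, since Loki indexes labels only.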

Log Format

All MATIH services emit structured JSON logs:

{
  "timestamp": "2026-02-12T10:30:00Z",
  "level": "INFO",
  "service": "ai-service",
  "tenant_id": "tenant-acme",
  "trace_id": "abc123def456",
  "message": "Query completed successfully",
  "duration_ms": 234,
  "user_id": "user-001"
}
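
As an illustration of how a service might produce this format, the following sketch uses only Python's standard `logging` module. The `JsonFormatter` class is hypothetical and is not taken from the MATIH codebase; it assumes context fields such as `tenant_id` are passed per call through `extra=`.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object."""

    def __init__(self, service: str):
        super().__init__()
        self.service = service

    def format(self, record: logging.LogRecord) -> str:
        ts = datetime.fromtimestamp(record.created, tz=timezone.utc)
        payload = {
            "timestamp": ts.strftime("%Y-%m-%dT%H:%M:%SZ"),
            "level": record.levelname,
            "service": self.service,
            "message": record.getMessage(),
        }
        # Values passed via `extra=` become attributes on the record;
        # copy the optional context fields only when they are present.
        for key in ("tenant_id", "trace_id", "duration_ms", "user_id"):
            if hasattr(record, key):
                payload[key] = getattr(record, key)
        return json.dumps(payload)


logger = logging.getLogger("ai-service")
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter(service="ai-service"))
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info(
    "Query completed successfully",
    extra={"tenant_id": "tenant-acme", "duration_ms": 234},
)
```

Writing one JSON object per line keeps each entry parseable by the `json` stage in LogQL.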

LogQL Examples

# All errors from AI service (the one-hour window comes from the query's time range, e.g. "Last 1 hour" in Grafana)
{namespace="matih-data-plane", app="ai-service"} |= "ERROR"

# Errors slower than 5 seconds, filtering on fields extracted by the json parser
{namespace="matih-data-plane"} | json | level="ERROR" | duration_ms > 5000

# Count errors per service over 5-minute windows
sum by (app) (count_over_time(
  {namespace="matih-data-plane"} |= "ERROR" [5m]
))
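
Outside Grafana, the same queries can be sent to Loki's HTTP API, where the time range is passed as explicit `start` and `end` parameters in Unix nanoseconds. The sketch below only builds the request URL for the `/loki/api/v1/query_range` endpoint; the helper name `build_query_range_url` and the base URL `http://loki:3100` in the usage note are assumptions, not values from the MATIH deployment.

```python
from urllib.parse import urlencode


def build_query_range_url(
    base_url: str,
    logql: str,
    end_ns: int,
    window_s: int = 3600,
    limit: int = 100,
) -> str:
    """Build a GET URL for Loki's query_range endpoint."""
    params = {
        "query": logql,
        # Start of the window: `window_s` seconds before `end_ns`.
        "start": end_ns - window_s * 1_000_000_000,
        "end": end_ns,
        "limit": limit,
        "direction": "backward",  # newest entries first
    }
    return f"{base_url.rstrip('/')}/loki/api/v1/query_range?{urlencode(params)}"
```

For example, `build_query_range_url("http://loki:3100", '{namespace="matih-data-plane", app="ai-service"} |= "ERROR"', time.time_ns())` would cover the last hour, which is what the first query above relies on the client to supply.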

Retention

Tier    Retention    Storage
Hot     7 days       SSD
Warm    30 days      HDD
Cold    90 days      S3/MinIO
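
The retention table can be read as a lookup from log age to storage tier. The sketch below assumes the retention values are cumulative ages measured from ingestion (so Warm covers logs older than 7 days and up to 30 days old); the table itself does not state this, and the names `TIERS` and `tier_for_age` are hypothetical.

```python
from datetime import timedelta
from typing import Optional

# (tier name, maximum log age) from the retention table, youngest tier first.
# Assumption: each limit is the total age of the log, not time spent in the tier.
TIERS = [
    ("Hot", timedelta(days=7)),
    ("Warm", timedelta(days=30)),
    ("Cold", timedelta(days=90)),
]


def tier_for_age(age: timedelta) -> Optional[str]:
    """Return the tier holding a log of the given age, or None once it has expired."""
    for name, max_age in TIERS:
        if age <= max_age:
            return name
    return None
```

Under this reading, a 10-day-old log entry would sit in the Warm tier, and anything older than 90 days would no longer be retained.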