MATIH Platform is in active MVP development. Documentation reflects current implementation status.

Observability API Architecture

The Observability API provides a unified interface for metrics, logs, traces, dashboards, alerting, SLOs, anomaly detection, capacity planning, cost allocation, custom metrics, and profiling across the MATIH platform. Running on port 8088, it aggregates data from Prometheus, Loki, and Jaeger into a tenant-scoped, multi-signal observability layer with reactive (WebFlux) endpoints.


Service Overview

| Property | Value |
|---|---|
| Service Name | observability-api |
| Port | 8088 |
| Technology | Spring Boot 3.2, Java 21, Spring WebFlux (reactive) |
| Metrics Backend | Prometheus |
| Logs Backend | Loki |
| Traces Backend | Jaeger |
| Database | PostgreSQL (benchmarks, reports) |
| API Documentation | OpenAPI 3.0 (Swagger) |

Controllers

| Controller | Base Path | Purpose |
|---|---|---|
| MetricsController | /api/v1/observability/metrics | PromQL queries, request/error rates, latencies |
| LogsController | /api/v1/observability/logs | Log querying and streaming |
| TracingController | /api/v1/observability/traces | Distributed trace search and analysis |
| DashboardController | /api/v1/observability/dashboards | Dashboard templates and customization |
| AlertingController | /api/v1/observability/alerts | Alert rules and notification management |
| SLOController | /api/v1/observability/slos | Service level objective management |
| AnomalyDetectionController | /api/v1/observability/anomalies | ML-based anomaly detection |
| CapacityPlanningController | /api/v1/observability/capacity | Resource utilization and forecasting |
| CostAllocationController | /api/v1/observability/costs | Observability cost allocation |
| CustomMetricsController | /api/v1/observability/custom-metrics | Custom metric registration and queries |
| ProfilingController | /api/v1/observability/profiling | Continuous profiling |
| RealtimeDashboardController | -- | WebSocket real-time dashboards |
| TopologyController | -- | Service dependency mapping |
| HealthScoreController | -- | Health scoring |
| HealthController | /health | Service health probes |
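
As a rough illustration of how a client might address the MetricsController, the sketch below builds (but does not send) a PromQL request with the JDK's built-in HTTP client types. Only the port and base path come from the table above. The `localhost` host, the `/query` sub-path, the `query` parameter name, and the `X-Tenant-ID` header are hypothetical placeholders; check the service's OpenAPI document for the actual contract.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

public class MetricsRequestSketch {

    // Port 8088 and the base path are documented; the host is an assumption.
    static final String BASE = "http://localhost:8088/api/v1/observability/metrics";

    // Builds a GET request carrying a PromQL expression.
    // "/query", "query=" and "X-Tenant-ID" are hypothetical names used for illustration.
    static HttpRequest promQlRequest(String promQl, String tenantId) {
        // Percent-encode the expression so brackets and parentheses are URI-safe.
        String encoded = URLEncoder.encode(promQl, StandardCharsets.UTF_8);
        URI uri = URI.create(BASE + "/query?query=" + encoded);
        return HttpRequest.newBuilder(uri)
                .header("X-Tenant-ID", tenantId)
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = promQlRequest("sum(rate(http_requests_total[5m]))", "tenant-a");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending the request would be a matter of passing it to `java.net.http.HttpClient#send` or `sendAsync` once the real endpoint shape is confirmed.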

Reactive Architecture

The Observability API uses Spring WebFlux for non-blocking, reactive request handling. Most endpoints return Mono or Flux types, allowing efficient concurrent querying of multiple backends (Prometheus, Loki, Jaeger) without thread blocking.
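
The concurrent fan-out described above can be sketched without any framework dependency. The example below uses the JDK's `CompletableFuture` as a stand-in for Reactor's `Mono` (in WebFlux code the combination step would typically be `Mono.zip`), so it is an analogy and not the service's actual implementation. The `Signals` record and the three `query...` methods are hypothetical stubs that return canned strings where real clients would make HTTP calls.

```java
import java.util.concurrent.CompletableFuture;

public class FanOutSketch {

    // Hypothetical combined result; not a type from the MATIH codebase.
    record Signals(String metrics, String logs, String traces) {}

    // Stub backend clients. Each returns immediately with a future,
    // so the caller's thread is never parked waiting on a backend.
    static CompletableFuture<String> queryPrometheus(String tenant) {
        return CompletableFuture.supplyAsync(() -> "metrics:" + tenant);
    }

    static CompletableFuture<String> queryLoki(String tenant) {
        return CompletableFuture.supplyAsync(() -> "logs:" + tenant);
    }

    static CompletableFuture<String> queryJaeger(String tenant) {
        return CompletableFuture.supplyAsync(() -> "traces:" + tenant);
    }

    // Starts all three queries before combining, so they run concurrently
    // and the result completes when the slowest backend has answered.
    static CompletableFuture<Signals> fetchAll(String tenant) {
        CompletableFuture<String> metrics = queryPrometheus(tenant);
        CompletableFuture<String> logs = queryLoki(tenant);
        CompletableFuture<String> traces = queryJaeger(tenant);
        return metrics
                .thenCombine(logs, (m, l) -> new String[] {m, l})
                .thenCombine(traces, (ml, t) -> new Signals(ml[0], ml[1], t));
    }

    public static void main(String[] args) {
        // join() blocks only here, at the edge of the demo program.
        Signals signals = fetchAll("tenant-a").join();
        System.out.println(signals);
    }
}
```

The point of the pattern is that total latency tracks the slowest backend, not the sum of all three, and no request thread is held while waiting.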


Next Steps