MATIH Platform is in active MVP development. Documentation reflects current implementation status.

Data Plane Architecture

Production - 14 polyglot services - Java, Python, Node.js

The Data Plane is the execution engine of the MATIH platform. It consists of 14 polyglot services that process tenant-specific workloads: executing queries, orchestrating AI agents, training ML models, managing data pipelines, rendering visualizations, and enforcing data governance. Unlike the homogeneous Control Plane, the Data Plane is deliberately polyglot so that each service's technology matches its problem domain.


2.4.1 Technology Distribution

| Stack | Count | Services | Rationale |
|---|---|---|---|
| Java / Spring Boot 3.2 | 6 | query-engine, catalog-service, semantic-layer, bi-service, pipeline-service, data-plane-agent | JDBC, Hibernate multi-tenancy, Trino integration |
| Python / FastAPI | 7 | ai-service, ml-service, data-quality-service, ontology-service, governance-service, ops-agent-service, auth-proxy | LangChain, PyTorch, pandas, LLM libraries |
| Node.js / Express | 1 | render-service | Puppeteer/Playwright for chart rendering |

2.4.2 Complete Service Registry

| Service | Stack | Port | Database | Key Dependencies |
|---|---|---|---|---|
| query-engine | Java | 8080 | query | Trino, PostgreSQL, Redis |
| catalog-service | Java | 8086 | catalog | PostgreSQL, OpenMetadata |
| semantic-layer | Java | 8086 | semantic | PostgreSQL, Redis, Trino |
| bi-service | Java | 8084 | bi | PostgreSQL, Redis, semantic-layer |
| pipeline-service | Java | 8092 | pipeline | PostgreSQL, Kafka, Temporal |
| data-plane-agent | Java | 8085 | none | Redis, Kafka |
| ai-service | Python | 8000 | ai | PostgreSQL, Redis, Qdrant, vLLM, Dgraph |
| ml-service | Python | 8000 | ml | PostgreSQL, Redis, Ray, MLflow |
| data-quality-service | Python | 8000 | quality | PostgreSQL, Trino |
| ontology-service | Python | 8101 | ontology | PostgreSQL, Elasticsearch |
| governance-service | Python | 8080 | governance | PostgreSQL, OpenMetadata, Polaris |
| ops-agent-service | Python | 8080 | ops_agent | PostgreSQL, Redis, Kafka, Prometheus, ChromaDB |
| auth-proxy | Python | 5000 | none | IAM service |
| render-service | Node.js | 8098 | none | Redis |
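
The registry is easier to cross-check once it is held as data. The sketch below transcribes the rows above into a Python list and derives the per-stack counts and the services that share an infrastructure dependency. The `Service` type and the helper names `count_by_stack` and `dependents_of` are illustrative assumptions, not part of the MATIH codebase.

```python
from collections import Counter
from typing import List, NamedTuple, Optional, Tuple


class Service(NamedTuple):
    """One row of the service registry (hypothetical type for illustration)."""
    name: str
    stack: str
    port: int
    database: Optional[str]  # None where the registry lists "none"
    dependencies: Tuple[str, ...]


# Rows transcribed from the service registry table.
REGISTRY: List[Service] = [
    Service("query-engine", "Java", 8080, "query", ("Trino", "PostgreSQL", "Redis")),
    Service("catalog-service", "Java", 8086, "catalog", ("PostgreSQL", "OpenMetadata")),
    Service("semantic-layer", "Java", 8086, "semantic", ("PostgreSQL", "Redis", "Trino")),
    Service("bi-service", "Java", 8084, "bi", ("PostgreSQL", "Redis", "semantic-layer")),
    Service("pipeline-service", "Java", 8092, "pipeline", ("PostgreSQL", "Kafka", "Temporal")),
    Service("data-plane-agent", "Java", 8085, None, ("Redis", "Kafka")),
    Service("ai-service", "Python", 8000, "ai",
            ("PostgreSQL", "Redis", "Qdrant", "vLLM", "Dgraph")),
    Service("ml-service", "Python", 8000, "ml", ("PostgreSQL", "Redis", "Ray", "MLflow")),
    Service("data-quality-service", "Python", 8000, "quality", ("PostgreSQL", "Trino")),
    Service("ontology-service", "Python", 8101, "ontology", ("PostgreSQL", "Elasticsearch")),
    Service("governance-service", "Python", 8080, "governance",
            ("PostgreSQL", "OpenMetadata", "Polaris")),
    Service("ops-agent-service", "Python", 8080, "ops_agent",
            ("PostgreSQL", "Redis", "Kafka", "Prometheus", "ChromaDB")),
    Service("auth-proxy", "Python", 5000, None, ("IAM service",)),
    Service("render-service", "Node.js", 8098, None, ("Redis",)),
]


def count_by_stack(registry: List[Service]) -> Counter:
    """Count services per technology stack."""
    return Counter(service.stack for service in registry)


def dependents_of(registry: List[Service], dependency: str) -> List[str]:
    """Return the names of services that list `dependency`, in registry order."""
    return [s.name for s in registry if dependency in s.dependencies]


if __name__ == "__main__":
    print(dict(count_by_stack(REGISTRY)))
    print(dependents_of(REGISTRY, "Trino"))
```

The per-stack counts agree with the technology distribution (6 Java, 7 Python, 1 Node.js), and a lookup such as `dependents_of(REGISTRY, "Kafka")` shows which services would be affected by an outage of a shared component.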

2.4.3 Resource Allocation

| Service | CPU Request | CPU Limit | Memory Request | Memory Limit |
|---|---|---|---|---|
| query-engine | 200m | 1000m | 512Mi | 1Gi |
| catalog-service | 100m | 500m | 256Mi | 512Mi |
| semantic-layer | 200m | 1000m | 512Mi | 1Gi |
| bi-service | 100m | 500m | 256Mi | 512Mi |
| ai-service | 200m | 1000m | 512Mi | 2Gi |
| ml-service | 200m | 1000m | 512Mi | 2Gi |
| pipeline-service | 100m | 500m | 256Mi | 512Mi |
| data-quality-service | 100m | 500m | 256Mi | 512Mi |
| data-plane-agent | 100m | 500m | 256Mi | 512Mi |
| render-service | 100m | 500m | 256Mi | 512Mi |
| ontology-service | 100m | 500m | 256Mi | 512Mi |
| governance-service | 100m | 500m | 256Mi | 512Mi |
| ops-agent-service | 500m | 2000m | 1Gi | 4Gi |

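The ops-agent-service has the largest allocation (500m CPU and 1Gi memory requested, with limits of 2000m and 4Gi) because it runs AI workloads and processes observability data. The ai-service and ml-service use the same 200m / 512Mi requests as query-engine and semantic-layer, but their memory limits are raised to 2Gi to accommodate LLM context windows and model loading.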
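
For capacity planning it can help to total these figures. The following is a minimal sketch that parses the quantities in the table and sums them, assuming one replica per service (replica counts are not stated here). The helper names `cpu_millicores`, `memory_mib`, and `totals` are hypothetical, and the parsers handle only the suffixes that appear in the table (`m`, `Mi`, `Gi`).

```python
from typing import Dict, Tuple

# (cpu_request, cpu_limit, memory_request, memory_limit), transcribed from the table.
ALLOCATIONS: Dict[str, Tuple[str, str, str, str]] = {
    "query-engine": ("200m", "1000m", "512Mi", "1Gi"),
    "catalog-service": ("100m", "500m", "256Mi", "512Mi"),
    "semantic-layer": ("200m", "1000m", "512Mi", "1Gi"),
    "bi-service": ("100m", "500m", "256Mi", "512Mi"),
    "ai-service": ("200m", "1000m", "512Mi", "2Gi"),
    "ml-service": ("200m", "1000m", "512Mi", "2Gi"),
    "pipeline-service": ("100m", "500m", "256Mi", "512Mi"),
    "data-quality-service": ("100m", "500m", "256Mi", "512Mi"),
    "data-plane-agent": ("100m", "500m", "256Mi", "512Mi"),
    "render-service": ("100m", "500m", "256Mi", "512Mi"),
    "ontology-service": ("100m", "500m", "256Mi", "512Mi"),
    "governance-service": ("100m", "500m", "256Mi", "512Mi"),
    "ops-agent-service": ("500m", "2000m", "1Gi", "4Gi"),
}

# Binary memory suffixes used in the table, expressed in MiB.
_MEMORY_UNITS_MIB = {"Mi": 1, "Gi": 1024}


def cpu_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as '200m' (or whole cores like '2') to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(float(quantity) * 1000)


def memory_mib(quantity: str) -> int:
    """Convert a memory quantity such as '512Mi' or '1Gi' to MiB."""
    for suffix, factor in _MEMORY_UNITS_MIB.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    raise ValueError(f"unsupported memory quantity: {quantity!r}")


def totals(allocations: Dict[str, Tuple[str, str, str, str]]) -> Dict[str, int]:
    """Sum requests and limits across all services (one replica each, by assumption)."""
    rows = list(allocations.values())
    return {
        "cpu_request_m": sum(cpu_millicores(r[0]) for r in rows),
        "cpu_limit_m": sum(cpu_millicores(r[1]) for r in rows),
        "memory_request_mib": sum(memory_mib(r[2]) for r in rows),
        "memory_limit_mib": sum(memory_mib(r[3]) for r in rows),
    }


if __name__ == "__main__":
    for key, value in totals(ALLOCATIONS).items():
        print(f"{key}: {value}")
```

Under the single-replica assumption, the 13 services listed request 2100m CPU and 5120Mi (5Gi) of memory in total, with combined limits of 10000m CPU and 14336Mi (14Gi).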


2.4.4 Health Check Patterns

Data Plane services use different health check patterns based on their technology stack:

| Stack | Health Path | Framework | Kubernetes Probe |
|---|---|---|---|
| Java / Spring Boot | /api/v1/actuator/health | Spring Boot Actuator | httpGet |
| Python / FastAPI | /health | Custom endpoint | httpGet |
| Node.js | /health | Custom endpoint | httpGet |

Java services provide deep health checks via Spring Boot Actuator that verify database connectivity, Redis connectivity, and Kafka broker availability. Python services implement simpler health endpoints that verify only that the application is running and can handle requests.
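
As an illustration of how the per-stack paths map onto Kubernetes probes, the sketch below builds the `httpGet` section of a probe from a stack name and a port. The function name `http_get_probe` is hypothetical, and timing fields such as `initialDelaySeconds` or `periodSeconds` are left out because this page does not specify them.

```python
import json

# Health paths per stack, as listed in the health check table.
HEALTH_PATHS = {
    "Java": "/api/v1/actuator/health",
    "Python": "/health",
    "Node.js": "/health",
}


def http_get_probe(stack: str, port: int) -> dict:
    """Build the httpGet section of a Kubernetes probe for a service.

    Only `path` and `port` are set; probe timings are deliberately omitted.
    """
    try:
        path = HEALTH_PATHS[stack]
    except KeyError:
        raise ValueError(f"unknown stack: {stack!r}") from None
    return {"httpGet": {"path": path, "port": port}}


if __name__ == "__main__":
    # Ports taken from the service registry: query-engine and ai-service.
    print(json.dumps(http_get_probe("Java", 8080), indent=2))
    print(json.dumps(http_get_probe("Python", 8000), indent=2))
```

The resulting dictionary can be serialized under a container's `livenessProbe` or `readinessProbe` key. Which of the two each service uses is not stated here.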


2.4.5 Cross-Service Dependencies

The Data Plane services form an analytical pipeline where services chain together:

                              +-------------+
                              |  ai-service |
                              +------+------+
                                     |
                    +----------------+----------------+
                    |                |                |
           +--------v------+ +-------v------+ +-------v-------+
           | query-engine  | |semantic-layer| |catalog-service|
           +--------+------+ +-------+------+ +-------+-------+
                    |                |                |
                    +--------+-------+                |
                             |                        |
                    +--------v------+          +------v----------+
                    |     Trino     |          | ontology-service|
                    +---------------+          +-----------------+

The core "Intent to Insights" flow traverses: ai-service --> catalog-service (schema context) + semantic-layer (metric definitions) --> query-engine --> Trino (execution) --> back to ai-service (analysis) --> optionally render-service (visualization).
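
The diagram above can also be expressed as a small dependency graph. The sketch below is one way to do that, reading each arrow as "calls / depends on" (an interpretation of the diagram, not something the page states explicitly). It uses the standard-library `graphlib` module (Python 3.9+); the helper names are illustrative.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Edges read off the diagram: each key depends on the nodes in its set.
DEPENDS_ON: Dict[str, Set[str]] = {
    "ai-service": {"query-engine", "semantic-layer", "catalog-service"},
    "query-engine": {"Trino"},
    "semantic-layer": {"Trino"},
    "catalog-service": {"ontology-service"},
}


def transitive_dependencies(node: str, graph: Dict[str, Set[str]]) -> Set[str]:
    """Return every node reachable from `node` by following dependency edges."""
    seen: Set[str] = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for dependency in graph.get(current, ()):
            if dependency not in seen:
                seen.add(dependency)
                stack.append(dependency)
    return seen


def dependency_order(graph: Dict[str, Set[str]]) -> List[str]:
    """Return an order in which every node appears after all of its dependencies.

    TopologicalSorter expects a mapping of node -> predecessors, which is
    exactly the shape of DEPENDS_ON.
    """
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(sorted(transitive_dependencies("ai-service", DEPENDS_ON)))
    print(dependency_order(DEPENDS_ON))
```

An ordering like this is useful for bring-up or smoke-test sequencing: Trino and ontology-service come first, ai-service comes last. The exact position of the middle-tier services can vary between valid orderings.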

