MATIH Platform is in active MVP development. Documentation reflects current implementation status.
Part I: Platform Foundations

Chapter 2: Architecture Deep Dive

The MATIH Enterprise Platform is built on a two-plane architecture that cleanly separates platform management concerns from tenant workload execution. This chapter provides a comprehensive examination of every architectural layer, from high-level design philosophy down to individual service responsibilities, inter-service communication patterns, data store selection, and the decision rationale behind each structural choice.

Learning Objectives

  • Understand the two-plane architecture and why platform management is separated from tenant workloads
  • Map the complete service topology across 24 microservices and their dependencies
  • Trace request flows from browser through gateway, backend services, data stores, and back
  • Evaluate multi-tenancy isolation strategies at network, database, application, and event layers
  • Navigate the event-driven architecture including Kafka topology, CDC, and WebSocket patterns
  • Understand data store selection criteria across PostgreSQL, Redis, Kafka, Trino, and vector stores

Details

Estimated Read Time: 4-6 hours
Prerequisites:
  • Microservices architecture fundamentals
  • Kubernetes namespace and networking concepts
  • Event-driven architecture patterns
  • SQL and distributed query engines
Related Chapters:
  • Ch. 3: Security and Multi-Tenancy
  • Ch. 17: Kubernetes and Helm
  • Ch. 19: Observability
  • Ch. 18: CI/CD and Build System
Production - 24 microservices, 55+ Helm charts, 7 Kubernetes namespaces

What This Chapter Covers

This chapter is organized into eleven sections, each addressing a distinct architectural dimension of the platform. Read them in order for a complete picture, or jump to a single section for reference; each one is self-contained.

Section | Focus Area | Pages
--------|------------|------
Design Philosophy | Core principles, trade-offs, constraints, and the reasoning behind key architectural choices | 1
Control Plane | All 10 Java/Spring Boot 3.2 services that manage platform operations, with internal architecture details | 7
Data Plane | All 14 polyglot services (Java, Python, Node.js) that execute tenant workloads | 7
Service Topology | Service discovery, dependency graphs, communication patterns, and failure propagation | 4
Multi-Tenancy Architecture | Namespace isolation, TenantContext propagation, per-tenant databases, network policies | 6
Event-Driven Architecture | Kafka event streaming, Redis Pub/Sub, WebSocket, CDC, and event schemas | 6
API Gateway | Kong 3.5.0 gateway, custom Lua plugins, routing and rate limiting | 1
Request Lifecycle and Data Flow | End-to-end request tracing from browser to database and back across five key flows | 6
Data Stores | PostgreSQL, Redis, Kafka, Trino, Qdrant, Neo4j, MinIO, ClickHouse architecture | 9
API Design | REST conventions, error handling, authentication patterns, rate limiting | 5
Architecture Decision Records | ADRs documenting the rationale for major architectural decisions | 1

Architecture at a Glance

The MATIH platform consists of 24 microservices distributed across 7 Kubernetes namespaces, communicating through a combination of synchronous REST APIs and asynchronous event streams. The platform processes natural language questions through a multi-agent AI pipeline that generates SQL, executes queries via Trino, and renders visualizations -- the "Intent to Insights" workflow.

                          +------------------+
                          |   Browser / CLI  |
                          +--------+---------+
                                   |
                          +--------v---------+
                          | Kong API Gateway |
                          |   (Port 8080)    |
                          +--------+---------+
                                   |
                    +--------------+--------------+
                    |                             |
           +--------v---------+         +---------v--------+
           |  Control Plane   |         |   Data Plane     |
           |  (10 services)   |         |  (14 services)   |
           |  matih-control-  |         |  matih-data-     |
           |  plane namespace |         |  plane namespace |
           +--------+---------+         +---------+--------+
                    |                             |
           +--------v---------+         +---------v--------+
           | PostgreSQL, Redis|         | PostgreSQL, Redis|
           | Kafka, ES        |         | Trino, Kafka     |
           +------------------+         | Qdrant, Neo4j    |
                                        | ClickHouse, MinIO|
                                        +------------------+

Frontend Layer
bi-workbench:3000, ml-workbench:3001, data-workbench:3002, agentic-workbench:3003, control-plane-ui:3004, data-plane-ui:3005, onboarding-ui, ops-workbench

API Gateway
Kong 3.5.0:8080, JWT Plugin, Rate Limit Plugin, Validation Plugin

Control Plane (Java/Spring Boot 3.2)
iam-service:8081, tenant-service:8082, config-service:8888, notification-service:8085, audit-service:8086, billing-service:8087, observability-api:8088, infrastructure-service:8089, api-gateway:8080, platform-registry:8084

Data Plane (Polyglot)
query-engine:8080, catalog-service:8086, semantic-layer:8086, bi-service:8084, pipeline-service:8092, ai-service:8000, ml-service:8000, data-quality-service:8000, render-service:8098, data-plane-agent:8085, ontology-service:8101, governance-service:8080, ops-agent-service:8080

Data Infrastructure
PostgreSQL 16, Redis 7, Kafka (Strimzi), Trino, ClickHouse, Qdrant, Neo4j/Dgraph, MinIO, Elasticsearch

ML/AI Infrastructure
MLflow, Ray, vLLM, Triton, Feast, JupyterHub, Flink, Spark

Key Numbers

Metric | Value
-------|------
Total microservices | 24
Control Plane services | 10 (all Java/Spring Boot 3.2)
Data Plane services | 14 (Java, Python, Node.js)
Frontend applications | 8 (React/Vite)
Kubernetes namespaces | 7
Helm charts | 55+
Kafka topics | 20+ event categories
Commons libraries | 4 (Java, Python, TypeScript, AI)
Data stores | 9 distinct technologies

Two-Plane Architecture

The platform is divided into two distinct operational planes, each with its own deployment model, scaling strategy, and failure domain.

Control Plane

The Control Plane manages platform-level concerns that are shared across all tenants. It handles identity and access management, tenant provisioning, configuration distribution, billing, auditing, and infrastructure orchestration. All 10 Control Plane services are built with Java 21 and Spring Boot 3.2, deployed in the matih-control-plane namespace.

The Control Plane is tenant-aware but not tenant-specific -- it operates on metadata about tenants rather than on tenant data itself. When the Control Plane writes to its database, it stores tenant configuration, user profiles, and billing records -- never customer business data.

Data Plane

The Data Plane executes tenant-specific workloads including query execution, AI/ML inference, data pipeline orchestration, dashboard rendering, and data governance. Its 14 services span three technology stacks: Java/Spring Boot for data-intensive services, Python/FastAPI for AI/ML workloads, and Node.js for rendering. Data Plane services are deployed into per-tenant namespaces, providing namespace-level isolation.

The Data Plane processes, transforms, and analyzes actual customer data. Every query, every AI conversation, every ML training job runs within the tenant's Data Plane namespace, isolated from other tenants by Kubernetes NetworkPolicies and ResourceQuotas.
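
The isolation described above rests on two standard Kubernetes objects, ResourceQuota and NetworkPolicy. As a rough sketch, the manifests for one tenant namespace could be assembled as plain dictionaries like this. The helper name build_isolation_manifests, the object names, and the quota figures are illustrative assumptions rather than values taken from the MATIH Helm charts; only the apiVersion, kind, and field names come from the Kubernetes API.

```python
import json


def build_isolation_manifests(namespace: str, cpu: str = "8", memory: str = "16Gi") -> list:
    """Return a ResourceQuota and a NetworkPolicy for one tenant namespace.

    The default quota values are placeholders chosen for illustration only.
    """
    resource_quota = {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": "tenant-quota", "namespace": namespace},
        # Caps the total CPU and memory that pods in the namespace may request.
        "spec": {"hard": {"requests.cpu": cpu, "requests.memory": memory}},
    }
    network_policy = {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "same-namespace-only", "namespace": namespace},
        "spec": {
            # An empty podSelector selects every pod in the namespace.
            "podSelector": {},
            "policyTypes": ["Ingress"],
            # A bare podSelector in "from" matches pods in this namespace only,
            # so ingress from other tenant namespaces is denied.
            "ingress": [{"from": [{"podSelector": {}}]}],
        },
    }
    return [resource_quota, network_policy]


if __name__ == "__main__":
    # "matih-data-plane-acme" is a made-up tenant namespace.
    for manifest in build_isolation_manifests("matih-data-plane-acme"):
        print(json.dumps(manifest, indent=2))
```

This sketch admits only same-namespace traffic. A working policy would also have to allow ingress from the API gateway and any shared infrastructure the tenant services depend on, which is omitted here.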


Namespace Organization

The platform uses seven Kubernetes namespaces to enforce logical and security boundaries:

Namespace | Purpose | Services
----------|---------|---------
matih-system | Core platform infrastructure | Operators, CRDs, shared controllers, Strimzi, cert-manager
matih-control-plane | Platform management services | All 10 Control Plane services
matih-data-plane | Default tenant workload services | All 14 Data Plane services (per-tenant namespaces in production)
matih-observability | Monitoring and tracing | Prometheus, Grafana, Tempo, Loki
matih-monitoring-control-plane | Control Plane monitoring | Service-specific monitors and alerts
matih-monitoring-data-plane | Data Plane monitoring | Service-specific monitors and alerts
matih-frontend | Frontend applications | React workbench applications

In production, each tenant receives a dedicated namespace: matih-data-plane-{tenant-slug}. This namespace contains the tenant's Data Plane services, secrets, and resource quotas.
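
A minimal sketch of how the matih-data-plane-{tenant-slug} pattern could be applied safely follows. Kubernetes requires namespace names to be valid DNS labels (lowercase alphanumerics and hyphens, at most 63 characters, starting and ending with an alphanumeric), so the slug has to be validated before it is appended. The function name tenant_namespace and the exact validation rule are assumptions for illustration; the source does not describe how the platform derives or validates slugs.

```python
import re

NAMESPACE_PREFIX = "matih-data-plane-"
MAX_LABEL_LENGTH = 63  # upper bound for a Kubernetes DNS label

# Lowercase alphanumerics and hyphens, no leading or trailing hyphen.
_SLUG_PATTERN = re.compile(r"[a-z0-9]([a-z0-9-]*[a-z0-9])?")


def tenant_namespace(tenant_slug: str) -> str:
    """Build the per-tenant namespace name, rejecting slugs Kubernetes would refuse."""
    if not _SLUG_PATTERN.fullmatch(tenant_slug):
        raise ValueError(f"invalid tenant slug: {tenant_slug!r}")
    name = NAMESPACE_PREFIX + tenant_slug
    if len(name) > MAX_LABEL_LENGTH:
        raise ValueError(f"namespace name exceeds {MAX_LABEL_LENGTH} characters: {name!r}")
    return name


if __name__ == "__main__":
    # "acme" is a made-up tenant slug.
    print(tenant_namespace("acme"))
```

Because the prefix itself takes 17 characters, a slug can be at most 46 characters long under this scheme.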


Commons Libraries

Shared functionality is extracted into four commons libraries that enforce consistency across service boundaries:

Library | Language | Key Modules
--------|----------|------------
commons-java | Java | Security (JWT, RBAC), multi-tenancy (TenantContext), persistence, caching, observability, event streaming
commons-python | Python | Authentication middleware, tenant context, structured logging, health checks
commons-typescript | TypeScript | API client utilities, authentication hooks, shared UI components
commons-ai | Python | LLM abstractions, prompt management, RAG utilities, agent framework

The commons-java library alone provides over 100 classes spanning API versioning, billing context, cache management, CDN integration, circuit breakers, database optimization, event streaming, exception handling, Kafka messaging, observability, and security -- all designed with multi-tenancy as a first-class concern.
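
To make "multi-tenancy as a first-class concern" concrete, here is a minimal sketch of request-scoped tenant context in Python, built on the standard-library contextvars module. It is not the commons-python or commons-java API, which the source does not show; the names tenant_scope, require_tenant, and cache_key are hypothetical. The idea is that a tenant identifier is bound once at the request boundary and then read implicitly by lower layers, for example to namespace cache keys.

```python
from contextlib import contextmanager
from contextvars import ContextVar
from typing import Iterator, Optional

# Holds the tenant bound to the current request (thread or asyncio task).
_current_tenant: ContextVar[Optional[str]] = ContextVar("current_tenant", default=None)


@contextmanager
def tenant_scope(tenant_id: str) -> Iterator[None]:
    """Bind a tenant for the duration of a block, restoring the previous value after."""
    token = _current_tenant.set(tenant_id)
    try:
        yield
    finally:
        _current_tenant.reset(token)


def require_tenant() -> str:
    """Return the bound tenant, failing loudly if code runs outside any tenant scope."""
    tenant_id = _current_tenant.get()
    if tenant_id is None:
        raise RuntimeError("no tenant bound to the current context")
    return tenant_id


def cache_key(key: str) -> str:
    """Prefix a cache key with the current tenant so tenants never share entries."""
    return f"{require_tenant()}:{key}"


if __name__ == "__main__":
    # "acme" and "dashboard:42" are made-up identifiers.
    with tenant_scope("acme"):
        print(cache_key("dashboard:42"))
```

Failing when no tenant is bound, rather than falling back to a default, is a deliberate choice in this sketch: an unscoped call then surfaces as an error instead of silently reading or writing shared data.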


Technology Stack Summary

Backend Technologies

Layer | Technology | Version
------|------------|--------
Control Plane | Java 21 + Spring Boot 3.2 | LTS
Data Plane (Java services) | Java 21 + Spring Boot 3.2 | LTS
Data Plane (AI/ML services) | Python 3.11 + FastAPI | Latest
Data Plane (Rendering) | Node.js 20 + Express | LTS

Data Infrastructure

Component | Technology | Purpose
----------|------------|--------
Primary database | PostgreSQL 16 | Transactional data, metadata, tenant schemas
Caching and sessions | Redis 7 | Session store, pub/sub, rate limiting
Event streaming | Kafka (Strimzi) | Asynchronous communication, event sourcing
Federated SQL | Trino | Distributed query execution across data sources
Full-text search | Elasticsearch 8.11 | Audit log search, ontology search
OLAP analytics | ClickHouse / StarRocks | Fast analytical queries on large datasets
Vector embeddings | Qdrant / LanceDB | RAG embeddings, semantic search
Knowledge graphs | Neo4j / Dgraph | Context graphs, data lineage
Object storage | MinIO | S3-compatible artifact storage

ML/AI Infrastructure

Component | Technology | Purpose
----------|------------|--------
Experiment tracking | MLflow | Model versioning, metrics, artifacts
Distributed compute | Ray | Model training, hyperparameter tuning
LLM inference | vLLM | High-throughput LLM serving
Model serving | Triton | GPU-optimized inference server
Feature store | Feast | Feature engineering and serving
Notebooks | JupyterHub | Interactive development environment
Stream processing | Apache Flink | Real-time data transformations
Batch processing | Apache Spark | Large-scale data processing

How to Read This Chapter

For architects and tech leads, start with Design Philosophy to understand the reasoning behind structural choices, then proceed to Service Topology for the interaction map, and Data Stores for storage architecture.

For backend developers, begin with Control Plane or Data Plane depending on which services you work with, then read Multi-Tenancy Architecture to understand context propagation and API Design for REST conventions.

For platform engineers, focus on API Gateway, Event-Driven Architecture, Data Stores, and Request Lifecycle to understand the infrastructure layer.

For anyone evaluating the platform, the Architecture Decision Records section provides the rationale behind every major technical decision, and Design Philosophy explains the trade-offs.

