MATIH Platform is in active MVP development. Documentation reflects current implementation status.

Data Infrastructure Overview

MATIH deploys 12+ data infrastructure components in the matih-data-plane namespace, covering SQL and OLAP engines, event streaming and stream processing, relational and object storage, caching, graph databases, and vector search. Each component is configured for production use, with persistence where the workload is stateful, plus replication, monitoring, and security.


Component Summary

| Component | Category | Deployment Type | Persistence |
|---|---|---|---|
| Trino | SQL Engine | Deployment (Coordinator + Workers) | Spill disk |
| Kafka (Strimzi) | Streaming | Strimzi CRD (KRaft mode) | 10Gi per broker |
| PostgreSQL | RDBMS | StatefulSet (Primary + Replicas) | 50Gi |
| Redis | Cache | StatefulSet (Master + Sentinel) | 10Gi |
| MinIO | Object Storage | StatefulSet | Configurable |
| ClickHouse | OLAP | StatefulSet | SSD |
| Flink | Stream Processing | Deployment (JM + TM) | Checkpoints |
| Spark | Batch/Interactive | Deployment + Spark Connect | None (stateless) |
| Dgraph | Graph Database | StatefulSet (Alpha + Zero) | 10Gi |
| Qdrant | Vector Database | StatefulSet | SSD |
| Neo4j | Graph Database | StatefulSet | 10Gi |
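
As a rough illustration of how the summary above can be used for capacity or rollout planning, the sketch below transcribes the table into plain Python data and derives two views: which components run as StatefulSets, and which components fall into each category. The Component class and the helper names stateful_components and by_category are hypothetical and exist only for this example; they are not part of the MATIH codebase or its Helm charts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """One row of the Component Summary table (hypothetical helper type)."""
    name: str
    category: str
    deployment_type: str
    persistence: str


# Rows transcribed from the Component Summary table; nothing here is read
# from a live cluster or from the actual Helm values.
COMPONENTS = [
    Component("Trino", "SQL Engine", "Deployment (Coordinator + Workers)", "Spill disk"),
    Component("Kafka (Strimzi)", "Streaming", "Strimzi CRD (KRaft mode)", "10Gi per broker"),
    Component("PostgreSQL", "RDBMS", "StatefulSet (Primary + Replicas)", "50Gi"),
    Component("Redis", "Cache", "StatefulSet (Master + Sentinel)", "10Gi"),
    Component("MinIO", "Object Storage", "StatefulSet", "Configurable"),
    Component("ClickHouse", "OLAP", "StatefulSet", "SSD"),
    Component("Flink", "Stream Processing", "Deployment (JM + TM)", "Checkpoints"),
    Component("Spark", "Batch/Interactive", "Deployment + Spark Connect", "None (stateless)"),
    Component("Dgraph", "Graph Database", "StatefulSet (Alpha + Zero)", "10Gi"),
    Component("Qdrant", "Vector Database", "StatefulSet", "SSD"),
    Component("Neo4j", "Graph Database", "StatefulSet", "10Gi"),
]


def stateful_components(components):
    """Return the names of components whose deployment type is a StatefulSet."""
    return [c.name for c in components if c.deployment_type.startswith("StatefulSet")]


def by_category(components):
    """Group component names by their category, preserving table order."""
    groups = {}
    for c in components:
        groups.setdefault(c.category, []).append(c.name)
    return groups


if __name__ == "__main__":
    print("StatefulSet-backed:", ", ".join(stateful_components(COMPONENTS)))
    for category, names in by_category(COMPONENTS).items():
        print(f"{category}: {', '.join(names)}")
```

Keeping the table as data makes it easy to spot, for example, that Kafka is managed through a Strimzi custom resource rather than a plain StatefulSet, so its storage is declared in a different place from the other persistent components.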

Section Contents

| Page | Description |
|---|---|
| Trino | Federated SQL engine with Polaris Iceberg catalog |
| Kafka / Strimzi | Event streaming with KRaft mode and TLS |
| PostgreSQL | Relational database with HA replication |
| Redis | Caching, sessions, and pub/sub |
| MinIO | S3-compatible object storage |
| ClickHouse | OLAP columnar analytics |
| Flink | Real-time stream processing |
| Spark | Batch and interactive computing |
| Dgraph | Graph database for ontologies |
| Qdrant | Vector database for AI embeddings |
| Neo4j | Graph database for lineage |