MATIH Platform is in active MVP development. Documentation reflects current implementation status.

Data Plane Namespace

The matih-data-plane namespace hosts all data processing, AI/ML, and analytics services, together with the data infrastructure they depend on. It is the most resource-intensive namespace, so it has its own resource quotas and limit ranges.


Services Deployed

| Service | Tech | Port | Replicas | Purpose |
|---|---|---|---|---|
| query-engine | Java | 8080 | 3 | SQL query routing and federation |
| catalog-service | Java | 8086 | 2 | Data catalog (OpenMetadata integration) |
| pipeline-service | Java | 8092 | 2 | ETL pipeline orchestration |
| semantic-layer | Java | 8086 | 2 | Semantic model management |
| bi-service | Java | 8084 | 2 | Dashboard and visualization |
| ai-service | Python | 8000 | 2 | NLP-to-SQL, conversational AI |
| ml-service | Python | 8000 | 1 | ML training and serving |
| data-quality-service | Python | 8000 | 2 | Data profiling and quality rules |
| data-plane-agent | Java | 8085 | 1 | Control plane communication |
| render-service | Node.js | 8098 | 2 | Chart/PDF rendering |
| ops-agent-service | Python | 8080 | 2 | Infrastructure optimization |

Resource Quotas

The data plane enforces resource quotas to prevent runaway workloads:

# From matih-data-plane/templates/resource-quotas.yaml
resourceQuotas:
  enabled: true
  requests:
    cpu: "80"
    memory: "160Gi"
  limits:
    cpu: "160"
    memory: "320Gi"
  pods: 300
  storage:
    persistentVolumeClaims: 50
    requestsStorage: "500Gi"

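Rendered Kubernetes resource. Besides the compute, pod, and storage values shown above, it also caps object counts (services, secrets, configmaps, NodePort and LoadBalancer services) that do not appear in the values excerpt: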

apiVersion: v1
kind: ResourceQuota
metadata:
  name: matih-data-plane-quota
  namespace: matih-data-plane
spec:
  hard:
    requests.cpu: "80"
    requests.memory: "160Gi"
    limits.cpu: "160"
    limits.memory: "320Gi"
    pods: "300"
    services: "100"
    secrets: "200"
    configmaps: "150"
    persistentvolumeclaims: "50"
    services.nodeports: "10"
    services.loadbalancers: "5"
    requests.storage: "500Gi"
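
The mapping from the values to the rendered resource can be sketched as a Helm template. This is an illustrative reconstruction under assumptions, not the chart's actual template: the value paths follow the excerpt above, and the object-count limits are left out because their source is not shown.

```yaml
# Hypothetical template sketch; the real resource-quotas.yaml may differ.
{{- if .Values.resourceQuotas.enabled }}
apiVersion: v1
kind: ResourceQuota
metadata:
  name: matih-data-plane-quota
  namespace: {{ .Values.global.namespace }}
spec:
  hard:
    # Aggregate compute across every pod in the namespace
    requests.cpu: {{ .Values.resourceQuotas.requests.cpu | quote }}
    requests.memory: {{ .Values.resourceQuotas.requests.memory | quote }}
    limits.cpu: {{ .Values.resourceQuotas.limits.cpu | quote }}
    limits.memory: {{ .Values.resourceQuotas.limits.memory | quote }}
    # Object and storage counts
    pods: {{ .Values.resourceQuotas.pods | quote }}
    persistentvolumeclaims: {{ .Values.resourceQuotas.storage.persistentVolumeClaims | quote }}
    requests.storage: {{ .Values.resourceQuotas.storage.requestsStorage | quote }}
{{- end }}
```

The `quote` filter turns numeric values such as `pods: 300` into the string form (`"300"`) seen in the rendered resource.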

Limit Ranges

Limit ranges set default requests and limits for containers that do not declare their own, and cap the maximum any single container can claim:

# From matih-data-plane/values.yaml
limitRanges:
  enabled: true
  default:
    cpu: "500m"
    memory: "512Mi"
  defaultRequest:
    cpu: "100m"
    memory: "128Mi"
  max:
    cpu: "4"
    memory: "8Gi"
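
By analogy with the rendered ResourceQuota, these values would plausibly render to a LimitRange along the following lines. This is a sketch rather than output taken from the chart: the resource name `matih-data-plane-limits` and the `type: Container` scope are assumptions.

```yaml
# Hypothetical rendering of the limitRanges values; name and type are assumed.
apiVersion: v1
kind: LimitRange
metadata:
  name: matih-data-plane-limits   # assumed name, not confirmed by the chart
  namespace: matih-data-plane
spec:
  limits:
    - type: Container             # assumed scope; the values file does not state it
      default:                    # limits applied when a container sets none
        cpu: "500m"
        memory: "512Mi"
      defaultRequest:             # requests applied when a container sets none
        cpu: "100m"
        memory: "128Mi"
      max:                        # upper bound on any single container's limits
        cpu: "4"
        memory: "8Gi"
```

The defaults matter in combination with the quota: once a namespace quota constrains CPU and memory requests and limits, Kubernetes rejects pods that omit those fields unless a LimitRange fills them in.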

Data Infrastructure

The data plane namespace also hosts these infrastructure components:

| Component | Type | Purpose |
|---|---|---|
| PostgreSQL | StatefulSet | Relational storage for all services |
| Redis | StatefulSet | Caching, sessions, pub/sub |
| Strimzi Kafka | Strimzi CRD | Event streaming, domain events |
| Trino | Deployment | Federated SQL engine |
| MinIO | StatefulSet | S3-compatible object storage |
| Qdrant | StatefulSet | Vector database for AI embeddings |
| ClickHouse | StatefulSet | OLAP analytics |
| Polaris | Deployment | Iceberg REST catalog |
| OpenMetadata | Deployment | Data catalog and governance |

Node Selection

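All data plane workloads are scheduled onto dedicated nodes: a node selector pins them to the data plane pool, and a toleration lets them run on nodes tainted for it: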

nodeSelector:
  agentpool: dataplane
 
tolerations:
  - key: "matih.ai/data-plane"
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"
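
For the selector and toleration above to take effect, the dedicated nodes need the matching label and taint. A minimal sketch of the node side, assuming the taint mirrors the toleration exactly; the node name is hypothetical, and in practice the label and taint are usually set at the node-pool level rather than per node.

```yaml
# Sketch of a node that the nodeSelector and toleration above would match.
apiVersion: v1
kind: Node
metadata:
  name: dataplane-node-0          # hypothetical node name
  labels:
    agentpool: dataplane          # matched by the nodeSelector
spec:
  taints:
    - key: "matih.ai/data-plane"  # assumed to mirror the toleration key
      value: "true"
      effect: "NoSchedule"        # repels pods that lack the matching toleration
```

The label attracts data plane pods to these nodes, while the taint keeps workloads without the toleration off them. Both are needed for the nodes to be truly dedicated.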

Global Configuration

Services share global configuration for the database, Redis, Kafka, Trino, and object storage:

global:
  namespace: matih-data-plane
  database:
    host: "postgresql.matih-data-plane.svc.cluster.local"
    port: 5432
  redis:
    host: "redis-master.matih-data-plane.svc.cluster.local"
    port: 6379
  kafka:
    bootstrapServers: "strimzi-kafka-kafka-bootstrap.matih-data-plane.svc.cluster.local:9093"
    securityProtocol: SSL
  trino:
    url: "jdbc:trino://trino.matih-data-plane.svc.cluster.local:8080"
  storage:
    type: "minio"
    s3:
      endpoint: "http://minio.matih-data-plane.svc.cluster.local:9000"
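
A service chart could consume these shared values through `.Values.global`, which Helm exposes to every subchart. The excerpt below is a sketch of a container `env` section in a Deployment template. The environment variable names are hypothetical and not taken from the MATIH charts.

```yaml
# Hypothetical Deployment template excerpt; env var names are illustrative only.
env:
  - name: DB_HOST
    value: {{ .Values.global.database.host | quote }}
  - name: DB_PORT
    value: {{ .Values.global.database.port | quote }}   # env values must be strings
  - name: REDIS_HOST
    value: {{ .Values.global.redis.host | quote }}
  - name: KAFKA_BOOTSTRAP_SERVERS
    value: {{ .Values.global.kafka.bootstrapServers | quote }}
  - name: KAFKA_SECURITY_PROTOCOL
    value: {{ .Values.global.kafka.securityProtocol | quote }}
  - name: S3_ENDPOINT
    value: {{ .Values.global.storage.s3.endpoint | quote }}
```

Keeping the endpoints under `global` means a hostname or port change is made once in the parent chart instead of in each service's values.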