MATIH Platform is in active MVP development. Documentation reflects current implementation status.

Namespace Topology

The MATIH platform organizes its Kubernetes resources across seven dedicated namespaces, each serving a distinct operational purpose. This isolation strategy provides security boundaries, resource quota enforcement, network segmentation, and clear ownership of workloads. This section details each namespace, its purpose, the resources it contains, RBAC policies, resource quotas, and inter-namespace communication patterns.


Namespace Overview

+---------------------------------------------------------------+
|                    MATIH Kubernetes Cluster                    |
|                                                                |
|  +-- matih-system -------------------------+                  |
|  | Cluster-wide infrastructure:            |                  |
|  | cert-manager, external-dns, ESO,        |                  |
|  | Strimzi operator, KEDA                  |                  |
|  +-----------------------------------------+                  |
|                                                                |
|  +-- matih-control-plane ------------------+                  |
|  | IAM, Tenant, Config, Audit,             |                  |
|  | Notification, Billing, API Gateway,     |                  |
|  | Platform Registry, Infrastructure Svc   |                  |
|  | PostgreSQL (shared), Redis, Kafka       |                  |
|  +-----------------------------------------+                  |
|                                                                |
|  +-- matih-data-plane ---------------------+                  |
|  | AI Service, BI Service, ML Service,     |                  |
|  | Query Engine, Catalog, Pipeline,        |                  |
|  | Semantic Layer, Data Plane Agent,       |                  |
|  | Data Quality, Render, Ops Agent,        |                  |
|  | Ontology, Governance                    |                  |
|  | + Trino, Kafka, PostgreSQL, Redis,      |                  |
|  |   Qdrant, Neo4j, Dgraph, StarRocks,     |                  |
|  |   Elasticsearch, MongoDB, ChromaDB      |                  |
|  +-----------------------------------------+                  |
|                                                                |
|  +-- matih-observability ------------------+                  |
|  | Prometheus, Grafana, Loki, Promtail,    |                  |
|  | Tempo, Observability API                |                  |
|  +-----------------------------------------+                  |
|                                                                |
|  +-- matih-monitoring-control-plane -------+                  |
|  | ServiceMonitors for control plane       |                  |
|  | services, alert rules                   |                  |
|  +-----------------------------------------+                  |
|                                                                |
|  +-- matih-monitoring-data-plane ----------+                  |
|  | ServiceMonitors for data plane          |                  |
|  | services, alert rules                   |                  |
|  +-----------------------------------------+                  |
|                                                                |
|  +-- matih-frontend -----------------------+                  |
|  | BI Workbench, ML Workbench,             |                  |
|  | Data Workbench, Agentic Workbench,      |                  |
|  | Control Plane UI, Data Plane UI         |                  |
|  +-----------------------------------------+                  |
+---------------------------------------------------------------+

Namespace Labels

Every MATIH namespace carries standard labels that are referenced by NetworkPolicies, RBAC policies, and monitoring configurations:

# Standard namespace labels
apiVersion: v1
kind: Namespace
metadata:
  name: matih-data-plane
  labels:
    name: matih-data-plane
    app.kubernetes.io/part-of: matih-platform
    matih.ai/tier: data-plane
    matih.ai/environment: production
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted

The name label is particularly important because it is referenced in NetworkPolicy namespaceSelector rules to control inter-namespace traffic.

| Label | Purpose |
| --- | --- |
| name | Namespace identification for NetworkPolicy selectors |
| app.kubernetes.io/part-of | Platform grouping |
| matih.ai/tier | Architectural tier (system, control-plane, data-plane, observability, frontend) |
| matih.ai/environment | Deployment environment (dev, staging, production) |
| pod-security.kubernetes.io/* | Pod Security Standards enforcement level |
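As an illustration of how the name label can be used in a namespaceSelector, the following is a minimal sketch of a NetworkPolicy that admits traffic from the data plane to iam-service on port 8081. The policy name and the pod label used in podSelector are assumptions for the example, not values taken from the MATIH charts.

```yaml
# Sketch only: policy name and pod label are hypothetical.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-data-plane-to-iam          # hypothetical name
  namespace: matih-control-plane
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: iam-service   # assumed pod label
  policyTypes:
    - Ingress
  ingress:
    - from:
        # Matches any namespace carrying the standard "name" label
        - namespaceSelector:
            matchLabels:
              name: matih-data-plane
      ports:
        - protocol: TCP
          port: 8081
```

Because the selector matches on the label rather than on the namespace object's name, a namespace that is missing the name label would silently fall outside the rule.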

Detailed Namespace Descriptions

1. matih-system

Purpose: Cluster-wide infrastructure components and operators that serve multiple namespaces.

Resources:

| Resource | Type | Description |
| --- | --- | --- |
| cert-manager | Deployment | TLS certificate provisioning via Let's Encrypt |
| external-dns | Deployment | Automatic DNS record management |
| External Secrets Operator | Deployment | Secret synchronization from cloud vault |
| Strimzi Kafka Operator | Deployment | Kafka cluster lifecycle management |
| KEDA | Deployment | Event-driven pod autoscaling |
| matih-operator | Deployment | Custom platform operator for tenant lifecycle |

apiVersion: v1
kind: Namespace
metadata:
  name: matih-system
  labels:
    name: matih-system
    app.kubernetes.io/part-of: matih-platform
    matih.ai/tier: system

2. matih-control-plane

Purpose: All control plane microservices that manage the multi-tenant platform: identity, tenant lifecycle, configuration, auditing, and notifications.

Services Deployed:

| Service | Port | Technology | Replicas |
| --- | --- | --- | --- |
| iam-service | 8081 | Java/Spring Boot | 2 |
| tenant-service | 8082 | Java/Spring Boot | 2 |
| config-service | 8888 | Java/Spring Boot | 2 |
| notification-service | 8085 | Java/Spring Boot | 2 |
| audit-service | 8086 | Java/Spring Boot | 2 |
| billing-service | 8087 | Java/Spring Boot | 2 |
| observability-api | 8088 | Java/Spring Boot | 2 |
| infrastructure-service | 8089 | Java/Spring Boot | 2 |
| api-gateway | 8080 | Java/Spring Boot | 2 |
| platform-registry | 8084 | Java/Spring Boot | 2 |

Shared Infrastructure (within namespace):

| Component | Purpose | Notes |
| --- | --- | --- |
| PostgreSQL | Relational storage | Bitnami chart, shared by CP services |
| Redis | Caching, sessions | Bitnami chart, shared by CP services |
| Kafka | Event messaging | Bitnami chart for CP-internal events |

apiVersion: v1
kind: Namespace
metadata:
  name: matih-control-plane
  labels:
    name: matih-control-plane
    app.kubernetes.io/part-of: matih-platform
    matih.ai/tier: control-plane

3. matih-data-plane

Purpose: All data plane microservices and data infrastructure components that power analytics, AI, ML, and data processing.

Application Services:

| Service | Port | Technology | Replicas |
| --- | --- | --- | --- |
| ai-service | 8000 | Python/FastAPI | 2 |
| bi-service | 8084 | Java/Spring Boot | 2 |
| ml-service | 8000 | Python/FastAPI | 2 |
| query-engine | 8080 | Java/Spring Boot | 2 |
| catalog-service | 8086 | Java/Spring Boot | 2 |
| pipeline-service | 8092 | Java/Spring Boot | 2 |
| semantic-layer | 8086 | Java/Spring Boot | 2 |
| data-plane-agent | 8085 | Java/Spring Boot | 2 |
| data-quality-service | 8000 | Python/FastAPI | 2 |
| render-service | 8098 | Node.js | 2 |
| ontology-service | 8101 | Python/FastAPI | 2 |
| governance-service | 8080 | Python/FastAPI | 2 |
| ops-agent-service | 8080 | Python/FastAPI | 2 |

Data Infrastructure:

| Component | Ports | Technology | Deployment Type |
| --- | --- | --- | --- |
| Trino | 8080 | Distributed SQL | StatefulSet (coordinator + workers) |
| Kafka (Strimzi) | 9092, 9093 | Event streaming | Strimzi custom resource (kind: Kafka) |
| PostgreSQL | 5432 | Relational DB | StatefulSet (Bitnami) |
| Redis | 6379 | Cache/messaging | StatefulSet (Bitnami) |
| Qdrant | 6333, 6334 | Vector search | StatefulSet |
| Neo4j | 7474, 7687 | Graph database | StatefulSet |
| Dgraph | 8080, 9080 | Graph database | StatefulSet |
| StarRocks | 9030, 8030 | OLAP database | StatefulSet |
| Elasticsearch | 9200, 9300 | Full-text search | StatefulSet |
| MongoDB | 27017 | Document store | StatefulSet |
| ChromaDB | 8000 | Vector embeddings | Deployment |

4. matih-observability

Purpose: Centralized observability stack providing metrics, logging, tracing, and alerting.

| Component | Port | Purpose |
| --- | --- | --- |
| Prometheus Server | 9090 | Metrics collection and storage |
| Alertmanager | 9093 | Alert routing and deduplication |
| Grafana | 3000 | Dashboard visualization |
| Loki | 3100 | Log aggregation |
| Promtail | N/A | Log collection (DaemonSet) |
| Tempo | 3200 | Distributed trace storage |
| Observability API | 8086 | Unified observability query API |

5. matih-monitoring-control-plane

Purpose: Dedicated monitoring resources for control plane services. This namespace hosts ServiceMonitor CRDs that instruct Prometheus to scrape metrics from control plane pods.

| Resource Type | Count | Target |
| --- | --- | --- |
| ServiceMonitor | 10 | One per control plane service |
| PrometheusRule | 5+ | Alert rules for CP services |

Separating monitoring resources from the service namespace allows platform engineers to manage monitoring without granting access to application resources.
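A ServiceMonitor that lives in the monitoring namespace but targets a Service in matih-control-plane might look like the sketch below. The Service label, the port name http, and the /actuator/prometheus path are assumptions (the path is the usual Spring Boot Actuator default); the Prometheus instance must also be configured to pick up ServiceMonitors from this namespace.

```yaml
# Sketch only: selector labels, port name, and metrics path are assumptions.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: iam-service
  namespace: matih-monitoring-control-plane
  labels:
    app.kubernetes.io/part-of: matih-platform
spec:
  # Look for matching Services in the application namespace,
  # not in the namespace that holds this ServiceMonitor.
  namespaceSelector:
    matchNames:
      - matih-control-plane
  selector:
    matchLabels:
      app.kubernetes.io/name: iam-service   # assumed Service label
  endpoints:
    - port: http                   # assumed name of the Service port
      path: /actuator/prometheus   # assumed Spring Boot Actuator endpoint
      interval: 30s
```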

6. matih-monitoring-data-plane

Purpose: Dedicated monitoring resources for data plane services, following the same pattern as the control plane monitoring namespace.

| Resource Type | Count | Target |
| --- | --- | --- |
| ServiceMonitor | 14+ | One per data plane service |
| PrometheusRule | 10+ | Alert rules for DP services |
| PodMonitor | 3+ | Data infrastructure pods |
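For data infrastructure pods that are scraped directly rather than through a Service, a PodMonitor follows the same cross-namespace pattern. The sketch below targets Qdrant as an example; the pod label and the container port name are assumptions about how the chart labels its pods.

```yaml
# Sketch only: pod label and port name are assumptions.
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: qdrant
  namespace: matih-monitoring-data-plane
spec:
  namespaceSelector:
    matchNames:
      - matih-data-plane
  selector:
    matchLabels:
      app.kubernetes.io/name: qdrant   # assumed pod label
  podMetricsEndpoints:
    - port: http       # assumed name of the container port (6333)
      path: /metrics
      interval: 30s
```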

7. matih-frontend

Purpose: Frontend applications served as static files behind NGINX reverse proxies.

| Application | Port | Technology |
| --- | --- | --- |
| bi-workbench | 3000 | React/Vite |
| ml-workbench | 3001 | React/Vite |
| data-workbench | 3002 | React/Vite |
| agentic-workbench | 3003 | React/Vite |
| control-plane-ui | 3004 | React/Vite |
| data-plane-ui | 3005 | React/Vite |

RBAC Configuration

MATIH implements fine-grained RBAC using Kubernetes Roles, ClusterRoles, RoleBindings, and ClusterRoleBindings.

RBAC Hierarchy

ClusterRoles (platform-wide)
  |
  +-- matih-platform-admin      Full access to all namespaces
  +-- matih-platform-viewer     Read-only access to all namespaces
  +-- matih-namespace-admin     Admin within a specific namespace
  |
Roles (namespace-scoped)
  |
  +-- matih-service-deployer    Deploy/update services in namespace
  +-- matih-service-viewer      Read pods, services, configmaps
  +-- matih-secret-reader       Read secrets (for operators only)

ClusterRole Definitions

# Platform administrator - full access
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: matih-platform-admin
rules:
  - apiGroups: ["*"]
    resources: ["*"]
    verbs: ["*"]
 
---
# Platform viewer - read-only everywhere
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: matih-platform-viewer
rules:
  - apiGroups: ["", "apps", "batch", "networking.k8s.io"]
    resources: ["pods", "services", "deployments", "statefulsets",
                "jobs", "configmaps", "ingresses", "networkpolicies"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["monitoring.coreos.com"]
    resources: ["servicemonitors", "prometheusrules"]
    verbs: ["get", "list", "watch"]
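
A ClusterRole grants nothing until it is bound to a subject. As a sketch, the viewer role above could be bound cluster-wide to a group like this; the binding name and the group name are hypothetical and would come from whatever identity provider the cluster uses.

```yaml
# Sketch only: binding name and group name are hypothetical.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: matih-platform-viewer-binding   # hypothetical name
subjects:
  - kind: Group
    name: matih-platform-viewers        # hypothetical group
    apiGroup: rbac.authorization.k8s.io
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: matih-platform-viewer
```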

Namespace-Scoped Roles

# Service deployer - can deploy and update services in a namespace
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: matih-service-deployer
  namespace: matih-data-plane
rules:
  - apiGroups: ["apps"]
    resources: ["deployments", "statefulsets"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
  - apiGroups: [""]
    resources: ["services", "configmaps", "serviceaccounts"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
  - apiGroups: [""]
    resources: ["pods", "pods/log"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["autoscaling"]
    resources: ["horizontalpodautoscalers"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
  - apiGroups: ["policy"]
    resources: ["poddisruptionbudgets"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
  - apiGroups: ["networking.k8s.io"]
    resources: ["networkpolicies"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
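
The Role above takes effect through a RoleBinding in the same namespace. The following sketch binds it to a deployment pipeline's service account; the service account name and its home namespace are assumptions for illustration.

```yaml
# Sketch only: the subject service account is hypothetical.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: matih-service-deployer-binding   # hypothetical name
  namespace: matih-data-plane
subjects:
  - kind: ServiceAccount
    name: cd-deployer          # hypothetical pipeline service account
    namespace: matih-system    # hypothetical home namespace of that account
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: matih-service-deployer
```

Because the binding is namespace-scoped, the same subject needs a separate RoleBinding in each namespace it deploys to.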

Service Account Bindings

Each service has a dedicated Kubernetes service account:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: ai-service
  namespace: matih-data-plane
  labels:
    app.kubernetes.io/name: ai-service
    app.kubernetes.io/part-of: matih-platform
  annotations:
    # Azure Workload Identity
    azure.workload.identity/client-id: "<managed-identity-client-id>"
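
A workload uses this account by naming it in its pod template. With Azure Workload Identity, the pod also needs the azure.workload.identity/use label so that the token is injected. The excerpt below is a sketch of the relevant part of a Deployment spec, with all other fields omitted.

```yaml
# Excerpt of a Deployment spec (spec.template only); other fields omitted.
template:
  metadata:
    labels:
      app.kubernetes.io/name: ai-service
      azure.workload.identity/use: "true"   # opts the pod in to token injection
  spec:
    serviceAccountName: ai-service          # the ServiceAccount defined above
```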

Resource Quotas

Each namespace has resource quotas to prevent runaway resource consumption:

Control Plane Quotas

apiVersion: v1
kind: ResourceQuota
metadata:
  name: matih-control-plane-quota
  namespace: matih-control-plane
spec:
  hard:
    requests.cpu: "20"
    requests.memory: 40Gi
    limits.cpu: "40"
    limits.memory: 80Gi
    pods: "100"
    services: "20"
    persistentvolumeclaims: "20"
    secrets: "50"
    configmaps: "50"

Data Plane Quotas

apiVersion: v1
kind: ResourceQuota
metadata:
  name: matih-data-plane-quota
  namespace: matih-data-plane
spec:
  hard:
    requests.cpu: "80"
    requests.memory: 160Gi
    limits.cpu: "160"
    limits.memory: 320Gi
    pods: "300"
    services: "40"
    persistentvolumeclaims: "50"
    secrets: "100"
    configmaps: "100"
    # GPU quotas
    requests.nvidia.com/gpu: "4"
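
To show how a workload counts against this quota, here is a sketch of a container resources block. The CPU and memory figures are illustrative, not values from the MATIH charts. Each pod with this block consumes one of the four GPUs allowed by requests.nvidia.com/gpu, so a fifth such pod would be rejected at admission.

```yaml
# Excerpt of a container spec; values are illustrative.
resources:
  requests:
    cpu: "2"
    memory: 8Gi
  limits:
    cpu: "4"
    memory: 16Gi
    # Extended resources are set under limits; the request defaults to the
    # same value and is what the quota counts.
    nvidia.com/gpu: 1
```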

Quota Summary by Namespace

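| Namespace | CPU Requests | CPU Limits | Memory Requests | Memory Limits | Max Pods |
| --- | --- | --- | --- | --- | --- |
| matih-system | 10 | 20 | 20Gi | 40Gi | 50 |
| matih-control-plane | 20 | 40 | 40Gi | 80Gi | 100 |
| matih-data-plane | 80 | 160 | 160Gi | 320Gi | 300 |
| matih-observability | 20 | 40 | 40Gi | 80Gi | 50 |
| matih-monitoring-* | 5 | 10 | 10Gi | 20Gi | 30 |
| matih-frontend | 10 | 20 | 10Gi | 20Gi | 50 |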

LimitRange Defaults

LimitRanges ensure every container has sensible default resource settings:

apiVersion: v1
kind: LimitRange
metadata:
  name: matih-default-limits
  namespace: matih-data-plane
spec:
  limits:
    - type: Container
      default:
        cpu: 500m
        memory: 512Mi
      defaultRequest:
        cpu: 100m
        memory: 128Mi
      max:
        cpu: "8"
        memory: 32Gi
      min:
        cpu: 50m
        memory: 64Mi
    - type: PersistentVolumeClaim
      max:
        storage: 500Gi
      min:
        storage: 1Gi

Inter-Namespace Communication

Services communicate across namespace boundaries using Kubernetes DNS FQDN syntax:

<service-name>.<namespace>.svc.cluster.local:<port>

Communication Matrix

| Source Namespace | Target Namespace | Services | Ports | Purpose |
| --- | --- | --- | --- | --- |
| matih-data-plane | matih-control-plane | iam-service | 8081 | JWT validation, user lookup |
| matih-data-plane | matih-data-plane | query-engine, catalog-service, semantic-layer | 8080, 8086 | Query execution, metadata |
| matih-control-plane | matih-data-plane | data-plane-agent | 8085 | Tenant provisioning |
| matih-frontend | matih-control-plane | api-gateway | 8080 | API requests |
| matih-observability | matih-control-plane | All services | Various | Prometheus scraping |
| matih-observability | matih-data-plane | All services | Various | Prometheus scraping |
| matih-data-plane | External | LLM APIs | 443 | OpenAI, Anthropic, Azure |
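
Some rows of this matrix can be expressed as an egress NetworkPolicy. The sketch below covers the ai-service rows for iam-service and external LLM APIs, plus the DNS lookups that FQDN resolution depends on. The policy name and pod labels are assumptions, and the same-namespace rule for query-engine, catalog-service, and semantic-layer is left out for brevity.

```yaml
# Sketch only: policy name and pod labels are assumptions.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: ai-service-egress               # hypothetical name
  namespace: matih-data-plane
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: ai-service
  policyTypes:
    - Egress
  egress:
    # Cluster DNS, needed to resolve *.svc.cluster.local names
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
    # iam-service in the control plane (JWT validation, user lookup)
    - to:
        - namespaceSelector:
            matchLabels:
              name: matih-control-plane
          podSelector:
            matchLabels:
              app.kubernetes.io/name: iam-service   # assumed pod label
      ports:
        - protocol: TCP
          port: 8081
    # External LLM APIs over HTTPS
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
      ports:
        - protocol: TCP
          port: 443
```

Once a pod is selected by any policy with the Egress type, every destination not listed is blocked, which is why the DNS rule has to be explicit.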

Cross-Namespace Service Discovery Example

In the AI service values.yaml, cross-namespace references use FQDNs:

config:
  services:
    # Same namespace (matih-data-plane)
    queryEngineUrl: "http://query-engine.matih-data-plane.svc.cluster.local:8080"
    semanticLayerUrl: "http://semantic-layer.matih-data-plane.svc.cluster.local:8086"
    catalogServiceUrl: "http://catalog-service.matih-data-plane.svc.cluster.local:8086"
    # Cross-namespace (matih-control-plane)
    iamServiceUrl: "http://iam-service.matih-control-plane.svc.cluster.local:8081"

Deep Dive: Why FQDNs instead of short names? While services within the same namespace can use short names (e.g., query-engine:8080), MATIH always uses the fully qualified domain name for clarity and to avoid DNS resolution ambiguity. This is especially important when services reference resources across namespaces, as short names would resolve within the source namespace and fail.


Pod Security Standards

MATIH enforces the Kubernetes Pod Security Standards at the namespace level:

| Namespace | Enforce Level | Audit Level | Warn Level |
| --- | --- | --- | --- |
| matih-system | baseline | restricted | restricted |
| matih-control-plane | restricted | restricted | restricted |
| matih-data-plane | restricted | restricted | restricted |
| matih-observability | baseline | restricted | restricted |
| matih-monitoring-* | restricted | restricted | restricted |
| matih-frontend | restricted | restricted | restricted |

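The restricted profile enforces:

  • Containers must run as non-root (runAsNonRoot: true)
  • Containers must not allow privilege escalation (allowPrivilegeEscalation: false)
  • Containers must drop all capabilities; only NET_BIND_SERVICE may be added back
  • Seccomp profile must be RuntimeDefault or Localhost
  • Only a limited set of volume types is allowed (for example configMap, secret, emptyDir, persistentVolumeClaim)

A read-only root filesystem (with writable paths provided via emptyDir) is not checked by the upstream restricted profile. Where it is required, it has to be set in the container securityContext and enforced separately.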

The matih-system and matih-observability namespaces use baseline enforcement because some infrastructure components (cert-manager, Promtail DaemonSet) require capabilities that restricted does not permit.
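
For workloads in the namespaces that enforce restricted, a pod spec along the following lines should pass admission. This is a sketch: the container name and image are placeholders, and readOnlyRootFilesystem is included as optional hardening rather than something Pod Security admission checks.

```yaml
# Excerpt of a pod spec; container name and image are placeholders.
securityContext:
  runAsNonRoot: true
  seccompProfile:
    type: RuntimeDefault
containers:
  - name: app                                   # placeholder
    image: registry.example.com/app:0.0.0       # placeholder
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["ALL"]
      readOnlyRootFilesystem: true   # optional hardening, not part of the profile check
```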


Namespace Creation Script

Namespaces are created as part of the CD pipeline (stage 01-namespaces):

#!/usr/bin/env bash
# scripts/cd/stages/01-namespaces.sh
# Creates the MATIH namespaces, then applies their quotas and limit ranges.
set -euo pipefail  # stop on the first failed kubectl call or unset variable

NAMESPACES=(
  "matih-system"
  "matih-control-plane"
  "matih-data-plane"
  "matih-observability"
  "matih-monitoring-control-plane"
  "matih-monitoring-data-plane"
  "matih-frontend"
)

for ns in "${NAMESPACES[@]}"; do
  # Apply namespace with labels
  kubectl apply -f "infrastructure/k8s/namespaces/${ns}.yaml"

  # Apply resource quotas
  kubectl apply -f "infrastructure/k8s/quotas/${ns}-quota.yaml"

  # Apply limit ranges
  kubectl apply -f "infrastructure/k8s/limits/${ns}-limits.yaml"
done

Troubleshooting

Common Namespace Issues

| Issue | Symptom | Resolution |
| --- | --- | --- |
| Namespace stuck in Terminating | kubectl get ns shows Terminating | Check for finalizers on remaining resources; remove them only once the resources are cleaned up |
| Quota exceeded | Pods are not created; the ReplicaSet reports FailedCreate events containing "exceeded quota" | Increase quota or reduce resource requests |
| Cross-namespace DNS failure | Service calls fail with name-resolution errors or time out | Verify FQDN format; check NetworkPolicy allows egress to target namespace and to cluster DNS |
| RBAC permission denied | 403 errors in pod logs | Check RoleBinding; verify service account has correct Role |
| Pod Security violation | Pod rejected with "violates PodSecurity" | Adjust securityContext to meet restricted profile |

Next Steps