Chapter 17: Kubernetes and Helm Infrastructure

The MATIH Enterprise Platform runs on Kubernetes as its cloud-agnostic, production-grade deployment target. The infrastructure comprises more than 55 Helm charts and seven dedicated namespaces, layered across control plane, data plane, observability, and frontend workloads. This chapter covers the cluster architecture, namespace topology, Helm chart patterns, data infrastructure deployments, network policies, and autoscaling strategies that keep MATIH running at scale.

What You Will Learn

By the end of this chapter, you will understand:

Cluster architecture across Azure Kubernetes Service (AKS), Amazon Elastic Kubernetes Service (EKS), and Google Kubernetes Engine (GKE), including node pool design, networking models, and identity integration
Namespace topology for the seven MATIH namespaces, their isolation boundaries, RBAC policies, resource quotas, and inter-namespace communication patterns
Helm chart structure including the standard per-service chart template with deployment, service, configmap, secret, ingress, HPA, PDB, ServiceMonitor, NetworkPolicy, and helper templates
Umbrella charts for the matih-control-plane (10 services) and matih-data-plane (14 services), their dependency management, and value override strategies
Data infrastructure including Trino, Kafka/Strimzi, PostgreSQL, Redis, Neo4j, Qdrant, MongoDB, Elasticsearch, ChromaDB, Dgraph, and StarRocks deployments
Network policies enforcing namespace isolation, service-to-service communication rules, and external access controls
Autoscaling patterns with Horizontal Pod Autoscalers (HPA), Vertical Pod Autoscalers (VPA), custom Prometheus metrics, and Pod Disruption Budgets (PDB)

Chapter Structure

Section	Description	Audience
Cluster Architecture	AKS, EKS, and GKE cluster setup, node pools, networking, and identity configuration	Platform engineers, DevOps
Namespace Topology	All seven namespaces with isolation, RBAC, resource quotas, and communication patterns	Platform engineers, security teams
Helm Chart Structure	Standard chart template, helper functions, values patterns, and template authoring	DevOps engineers, developers
Umbrella Charts	Control plane and data plane umbrella charts, dependency management, and deep merge behavior	DevOps engineers, release managers
Data Infrastructure	Stateful data services: Trino, Kafka, PostgreSQL, Redis, Neo4j, Qdrant, and more	Data engineers, platform engineers
Network Policies	Network isolation, ingress/egress rules, and service mesh considerations	Security engineers, platform engineers
Autoscaling Patterns	HPA, VPA, custom metrics, PDB, and scaling behavior configuration	Platform engineers, SREs

Kubernetes at a Glance

The MATIH platform deploys across seven namespaces on a managed Kubernetes cluster, with each namespace serving a distinct operational purpose:

+--------------------------------------------------------+
|                   Kubernetes Cluster                   |
|                                                        |
|  +--------------------+  +------------------------+    |
|  | matih-system       |  | matih-observability    |    |
|  | (Platform infra)   |  | (Prometheus, Grafana,  |    |
|  |                    |  |  Loki, Tempo)          |    |
|  +--------------------+  +------------------------+    |
|                                                        |
|  +--------------------+  +------------------------+    |
|  | matih-control-     |  | matih-monitoring-      |    |
|  | plane              |  | control-plane          |    |
|  | (IAM, Tenant,      |  | (CP ServiceMonitors)   |    |
|  |  Config, Audit,    |  +------------------------+    |
|  |  Notification)     |                                |
|  +--------------------+  +------------------------+    |
|                          | matih-monitoring-      |    |
|  +--------------------+  | data-plane             |    |
|  | matih-data-plane   |  | (DP ServiceMonitors)   |    |
|  | (AI, BI, ML, Query |  +------------------------+    |
|  |  Catalog, Pipeline |                                |
|  |  + Data Infra)     |  +------------------------+    |
|  +--------------------+  | matih-frontend         |    |
|                          | (BI, ML, Data, Agent   |    |
|                          |  Workbenches)          |    |
|                          +------------------------+    |
+--------------------------------------------------------+
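
To make the namespace boundaries in the diagram concrete, the sketch below shows one way isolation between namespaces could be expressed with standard NetworkPolicy resources. It is an illustration only: the policy names and the choice to admit traffic from matih-control-plane are assumptions, and the platform's actual rules are described in the Network Policies section.

```yaml
# Deny all ingress to every pod in matih-data-plane unless another policy allows it.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress        # hypothetical policy name
  namespace: matih-data-plane
spec:
  podSelector: {}                   # empty selector matches all pods in the namespace
  policyTypes:
    - Ingress
---
# Allow ingress from pods in the same namespace and from matih-control-plane.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-control-plane    # hypothetical policy name
  namespace: matih-data-plane
spec:
  podSelector: {}
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector: {}           # any pod in matih-data-plane itself
        - namespaceSelector:
            matchLabels:
              # Label that Kubernetes sets automatically on each namespace
              kubernetes.io/metadata.name: matih-control-plane
```

NetworkPolicies are additive, so the second policy opens only the listed paths on top of the default deny. Enforcement also depends on the cluster's network plugin supporting NetworkPolicy.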

Key Design Principles

The MATIH Kubernetes infrastructure follows several foundational design principles:

1. Security by Default

Every service runs with a hardened security posture:

Non-root execution: All containers run as a non-root user (UID 1000 or 1001)
Read-only root filesystem: Containers cannot write to their root filesystem
Capability dropping: All Linux capabilities are dropped with capabilities.drop: [ALL]
Privilege escalation prevention: allowPrivilegeEscalation: false on every container
Network isolation: NetworkPolicies restrict traffic to explicitly allowed paths
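
The container-level settings above map onto standard Kubernetes securityContext fields. As a minimal sketch, a Pod applying all of them might look like the following. The service name, image, and the /tmp scratch volume are illustrative assumptions and are not taken from the MATIH charts.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-service                 # hypothetical service name
  namespace: matih-control-plane
spec:
  securityContext:
    runAsNonRoot: true                  # kubelet refuses to start containers running as UID 0
    runAsUser: 1000
    runAsGroup: 1000
  containers:
    - name: app
      image: registry.example.com/example-service:1.0.0   # placeholder image
      securityContext:
        readOnlyRootFilesystem: true    # root filesystem is mounted read-only
        allowPrivilegeEscalation: false
        capabilities:
          drop: ["ALL"]
      volumeMounts:
        - name: tmp
          mountPath: /tmp               # writable scratch space (assumption)
  volumes:
    - name: tmp
      emptyDir: {}
```

With a read-only root filesystem, any path the application writes to has to be backed by a volume, which is why the sketch mounts an emptyDir at /tmp.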

2. Consistent Chart Patterns

Every service chart follows an identical structure:

File	Purpose
`Chart.yaml`	Chart metadata and dependencies
`values.yaml`	Production defaults
`values-dev.yaml`	Development overrides
`templates/_helpers.tpl`	Reusable template functions
`templates/deployment.yaml`	Deployment specification
`templates/service.yaml`	ClusterIP Service
`templates/configmap.yaml`	Non-sensitive configuration
`templates/secret.yaml`	Sensitive data references
`templates/ingress.yaml`	Optional Ingress resource
`templates/hpa.yaml`	Horizontal Pod Autoscaler
`templates/pdb.yaml`	Pod Disruption Budget
`templates/servicemonitor.yaml`	Prometheus ServiceMonitor
`templates/networkpolicy.yaml`	Network isolation rules
`templates/NOTES.txt`	Post-install instructions
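
The relationship between `values.yaml` and `values-dev.yaml` can be sketched as follows. The keys shown are illustrative assumptions rather than the actual MATIH value schema. The point is that the development file lists only the settings that differ from the production defaults.

```yaml
# values.yaml: production defaults (illustrative keys)
replicaCount: 3
autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 10
ingress:
  enabled: true
---
# values-dev.yaml: development overrides, listing only what differs
replicaCount: 1
autoscaling:
  enabled: false
ingress:
  enabled: false
```

Helm loads the chart's `values.yaml` automatically and deep-merges any file passed with `-f` on top of it, so in this sketch `autoscaling.minReplicas` and `autoscaling.maxReplicas` keep their default values even though the development file does not repeat them.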

3. Cloud-Agnostic Design

MATIH runs on AKS, EKS, and GKE with identical Helm charts. Cloud-specific concerns (identity, storage classes, load balancers) are abstracted through:

Terraform modules per cloud provider
Values overlay files per environment
External Secrets Operator for secret management
cert-manager for TLS certificate provisioning
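
As an illustration of how the External Secrets Operator can keep cloud-specific secret backends out of the charts, the sketch below declares an ExternalSecret that syncs one value into an ordinary Kubernetes Secret. The store name, secret names, and remote key are hypothetical. The `apiVersion` depends on the installed operator release.

```yaml
apiVersion: external-secrets.io/v1beta1   # newer operator releases may serve external-secrets.io/v1
kind: ExternalSecret
metadata:
  name: example-service-db                # hypothetical name
  namespace: matih-control-plane
spec:
  refreshInterval: 1h                     # how often to re-read the backing store
  secretStoreRef:
    kind: ClusterSecretStore
    name: cloud-secret-store              # hypothetical store, configured per cloud provider
  target:
    name: example-service-db              # name of the Kubernetes Secret to create
  data:
    - secretKey: password                 # key inside the resulting Secret
      remoteRef:
        key: example-service/db-password  # hypothetical path in the backing secret manager
```

Under this arrangement only the secret store definition differs between AKS, EKS, and GKE, while the ExternalSecret and the workloads that consume the resulting Secret stay the same.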

4. Observable Everything

Every service exposes Prometheus metrics and emits structured logs and distributed traces:

Metrics: ServiceMonitor CRDs for automatic Prometheus scraping
Logs: Structured JSON logging collected by Fluent Bit/Promtail into Loki
Traces: OpenTelemetry instrumentation with Tempo as the backend
Health: Startup, liveness, and readiness probes on every container
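
For the metrics path, a ServiceMonitor tells the Prometheus Operator which Services to scrape. The following is a minimal sketch assuming a hypothetical service labelled `app.kubernetes.io/name: example-service` with a Service port named `http-metrics`. The port name, path, and interval are assumptions.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-service                   # hypothetical name
  namespace: matih-monitoring-control-plane
spec:
  namespaceSelector:
    matchNames:
      - matih-control-plane               # namespace where the target Service lives
  selector:
    matchLabels:
      app.kubernetes.io/name: example-service
  endpoints:
    - port: http-metrics                  # must match a named port on the Service
      path: /metrics
      interval: 30s
```

Placing the ServiceMonitor in matih-monitoring-control-plane while selecting Services in matih-control-plane mirrors the namespace split shown in the cluster diagram.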

Resource Summary

The following table summarizes the platform's Kubernetes resource footprint:

Metric	Count
Namespaces	7
Helm charts (total)	55+
Control plane services	10
Data plane services	14
Frontend applications	6
Data infrastructure components	12+
Custom Prometheus alert rules	20+
Network policies	25+
Horizontal Pod Autoscalers	20+
Pod Disruption Budgets	20+

Prerequisites

Before diving into this chapter, you should be familiar with:

Kubernetes fundamentals (Pods, Deployments, Services, ConfigMaps, Secrets)
Helm 3 chart structure and templating with Go templates
Basic networking concepts (DNS, TLS, network policies)
Container security fundamentals (Linux capabilities, user namespaces)
At least one managed Kubernetes provider (AKS, EKS, or GKE)

For installation and initial cluster provisioning, refer to Chapter 4: Installation and Deployment.

Navigation

Proceed to the first section to understand the cluster architecture across all three supported cloud providers:

Next: Cluster Architecture -- AKS, EKS, and GKE cluster setup
