MATIH Platform is in active MVP development. Documentation reflects current implementation status.

Google Kubernetes Engine (GKE)

GKE is a fully supported deployment target for MATIH. The cluster uses VPC-native networking, Workload Identity Federation for pod-level access to GCP services, and GCP Secret Manager for secrets.


Cluster Configuration

GKE clusters are provisioned through the Terraform module at infrastructure/terraform/modules/gcp/gke/:

resource "google_container_cluster" "primary" {
  name     = "matih-${var.environment}"
  location = var.region
 
  release_channel {
    channel = "REGULAR"
  }
 
  network    = var.network_id
  subnetwork = var.subnetwork_id
 
  ip_allocation_policy {
    cluster_secondary_range_name  = "pods"
    services_secondary_range_name = "services"
  }
 
  workload_identity_config {
    workload_pool = "${var.project_id}.svc.id.goog"
  }
 
  private_cluster_config {
    enable_private_nodes    = true
    enable_private_endpoint = false
    master_ipv4_cidr_block  = "172.16.0.0/28"
  }
 
  addons_config {
    gce_persistent_disk_csi_driver_config {
      enabled = true
    }
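    network_policy_config {
      disabled = false
    }
  }

  # The add-on above only takes effect together with a network_policy block.
  network_policy {
    enabled  = true
    provider = "CALICO"
  }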
}

Node Pools

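Workloads run on dedicated, autoscaled node pools. Every pool except system carries a taint, so pods need a matching toleration to schedule there:

| Node Pool | Machine Type | Min/Max Nodes | Purpose | Taint |
|-----------|--------------|---------------|---------|-------|
| system | e2-standard-4 | 3/3 | System components | None |
| ctrlplane | e2-standard-4 | 2/5 | Control plane services | matih.ai/control-plane=true:NoSchedule |
| dataplane | e2-standard-8 | 2/10 | Data plane services | matih.ai/data-plane=true:NoSchedule |
| compute | e2-highmem-16 | 2/10 | Trino, Spark workers | matih.ai/compute=true:NoSchedule |
| aicompute | e2-standard-8 | 1/8 | AI/ML workloads | matih.ai/ai-compute=true:NoSchedule |
| gpu | a2-highgpu-1g | 0/4 | GPU inference (A100) | nvidia.com/gpu=true:NoSchedule |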
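
As an illustration of how a workload lands on one of the tainted pools above, the following is a minimal sketch of a Deployment pinned to the dataplane pool. The Deployment name, namespace, and image path are hypothetical and not taken from the MATIH charts; the `cloud.google.com/gke-nodepool` label is the one GKE sets on every node, and the sketch assumes the pool is literally named `dataplane`.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-data-service        # hypothetical name
  namespace: matih-data-plane       # hypothetical namespace
spec:
  replicas: 2
  selector:
    matchLabels:
      app: example-data-service
  template:
    metadata:
      labels:
        app: example-data-service
    spec:
      # Attract the pod to the pool: GKE labels each node with its pool name.
      nodeSelector:
        cloud.google.com/gke-nodepool: dataplane
      # Permit scheduling despite the pool's NoSchedule taint.
      tolerations:
        - key: matih.ai/data-plane
          operator: Equal
          value: "true"
          effect: NoSchedule
      containers:
        - name: app
          # Hypothetical image path under the registry used elsewhere on this page.
          image: us-central1-docker.pkg.dev/matih-project/matih/example-data-service:latest
```

Both fields are needed: the toleration only allows the pod onto tainted nodes, while the node selector is what keeps it off the other pools.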

Workload Identity Federation

GKE Workload Identity Federation maps Kubernetes service accounts to GCP service accounts:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: external-secrets
  namespace: external-secrets
  annotations:
    iam.gke.io/gcp-service-account: "external-secrets@matih-project.iam.gserviceaccount.com"

| Service | GCP Service Account | Purpose |
|---------|---------------------|---------|
| external-secrets | external-secrets@project | Secret Manager access |
| cert-manager | cert-manager@project | Cloud DNS validation |
| ai-service | ai-service@project | Vertex AI inference |
| data-plane-agent | data-agent@project | GCS data lake access |
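
To show how a workload picks up one of these identities, here is a minimal sketch for the ai-service mapping. The namespace, pod name, and image are hypothetical, and `PROJECT_ID` is a placeholder because the table abbreviates the account as `ai-service@project`.

```yaml
# Kubernetes service account linked to a GCP service account.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ai-service
  namespace: matih-ai               # hypothetical namespace
  annotations:
    # PROJECT_ID is a placeholder, not a value from the source.
    iam.gke.io/gcp-service-account: "ai-service@PROJECT_ID.iam.gserviceaccount.com"
---
# Any pod running under that service account receives the mapped GCP identity.
apiVersion: v1
kind: Pod
metadata:
  name: ai-service-example          # hypothetical name
  namespace: matih-ai
spec:
  serviceAccountName: ai-service
  containers:
    - name: app
      image: us-central1-docker.pkg.dev/matih-project/matih/ai-service:latest  # hypothetical image
```

The annotation alone is not sufficient. The GCP service account must also grant `roles/iam.workloadIdentityUser` to the member `serviceAccount:PROJECT_ID.svc.id.goog[matih-ai/ai-service]`, which is typically configured on the IAM side rather than in Kubernetes manifests.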

Networking

GKE uses VPC-native networking with alias IP ranges:

| Setting | Value |
|---------|-------|
| Network mode | VPC-native |
| Pod CIDR | Secondary range "pods" |
| Service CIDR | Secondary range "services" |
| Network policy | Calico (GKE add-on) |
| Private cluster | Enabled |
| Master authorized networks | Configured per environment |
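
With network policy enforcement enabled, traffic between pods can be restricted using standard NetworkPolicy resources. The sketch below is an assumed baseline, not a policy shipped with MATIH: it denies all ingress in a hypothetical namespace and then re-allows traffic between pods within that namespace.

```yaml
# Deny all ingress to every pod in the namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: matih-data-plane       # hypothetical namespace
spec:
  podSelector: {}                   # empty selector matches all pods
  policyTypes:
    - Ingress
---
# Re-allow ingress from pods in the same namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace
  namespace: matih-data-plane
spec:
  podSelector: {}
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector: {}           # any pod in this namespace
```

Policies are additive, so the second policy opens only the paths it lists and everything else stays blocked by the first.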

Storage Classes

Storage classes are backed by Compute Engine Persistent Disk (PD) through the GKE PD CSI driver:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ssd
provisioner: pd.csi.storage.gke.io
parameters:
  type: pd-ssd
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true

| Storage Class | Disk Type | Use Case |
|---------------|-----------|----------|
| ssd | pd-ssd | Default for databases, stateful workloads |
| balanced | pd-balanced | General purpose |
| standard | pd-standard | Non-critical, archival storage |
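
A workload requests one of these classes through a PersistentVolumeClaim. The following is a minimal sketch; the claim name, namespace, and size are hypothetical.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-db-data             # hypothetical name
  namespace: matih-data-plane       # hypothetical namespace
spec:
  storageClassName: ssd             # one of the classes listed above
  accessModes:
    - ReadWriteOnce                 # a PD volume is mounted read-write by one node
  resources:
    requests:
      storage: 50Gi                 # illustrative size
```

Because the class uses `volumeBindingMode: WaitForFirstConsumer`, the claim stays in `Pending` until a pod that uses it is scheduled, so the disk is created in the same zone as that pod. With `allowVolumeExpansion: true`, the requested size can later be increased by editing the claim.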

Artifact Registry

For GKE deployments, images are stored in Google Artifact Registry:

global:
  imageRegistry: us-central1-docker.pkg.dev/matih-project/matih
  imagePullSecrets:
    - name: gar-secret
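
The values above reference a pull secret named `gar-secret` but the page does not show how it is created. One possible shape, assuming authentication with a base64-encoded service account key, is sketched below; the namespace is hypothetical and the key is a placeholder.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: gar-secret
  namespace: matih-data-plane       # hypothetical; pull secrets are namespace-scoped
type: kubernetes.io/dockerconfigjson
stringData:
  .dockerconfigjson: |
    {
      "auths": {
        "us-central1-docker.pkg.dev": {
          "username": "_json_key_base64",
          "password": "<BASE64_ENCODED_SERVICE_ACCOUNT_KEY>"
        }
      }
    }
```

A secret of this kind has to exist in every namespace that pulls images. On GKE it may be unnecessary if the node service account already has read access to the repository, in which case nodes can pull from Artifact Registry without a pull secret.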