Amazon Elastic Kubernetes Service (EKS)
Amazon EKS is a fully supported deployment target for MATIH. The cluster uses the Amazon VPC CNI for pod networking, IAM Roles for Service Accounts (IRSA) for pod-level AWS access, and AWS Secrets Manager for secret management.
Cluster Configuration
EKS clusters are provisioned through the Terraform module at infrastructure/terraform/modules/aws/eks/:
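module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 20.0"
  # Cluster identity and Kubernetes version
  cluster_name    = "matih-${var.environment}"
  cluster_version = "1.29"
  # Networking: the cluster is placed in the private subnets of the VPC
  vpc_id     = var.vpc_id
  subnet_ids = var.private_subnet_ids
  # The API server endpoint is reachable both publicly and from inside the VPC
  cluster_endpoint_public_access  = true
  cluster_endpoint_private_access = true
  # Create the OIDC provider that IRSA depends on
  enable_irsa = true
  # EKS-managed add-ons, each resolved to its most recent version
  cluster_addons = {
    coredns            = { most_recent = true }
    kube-proxy         = { most_recent = true }
    vpc-cni            = { most_recent = true }
    aws-ebs-csi-driver = { most_recent = true }
  }
}
Managed Node Groups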
The cluster runs on managed node groups, each with an instance type chosen for its workload:
| Node Group | Instance Type | Min/Max Nodes | Purpose | Taint |
|---|---|---|---|---|
| system | m5.xlarge | 3/3 | System components | None |
| ctrlplane | m5.xlarge | 2/5 | Control plane services | matih.ai/control-plane=true:NoSchedule |
| dataplane | m5.2xlarge | 2/10 | Data plane services | matih.ai/data-plane=true:NoSchedule |
| compute | r5.4xlarge | 2/10 | Trino, Spark workers | matih.ai/compute=true:NoSchedule |
| aicompute | m5.2xlarge | 1/8 | AI/ML workloads | matih.ai/ai-compute=true:NoSchedule |
| gpu | p3.2xlarge | 0/4 | GPU inference | nvidia.com/gpu=true:NoSchedule |
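
A workload lands on a tainted node group only if it tolerates that group's taint, and it is only steered there if it also selects those nodes. The following is a minimal sketch for the compute group, not the manifest MATIH ships: the pod name, namespace, image tag, and the node group label value are hypothetical. It assumes the managed node group is literally named `compute`; EKS sets the `eks.amazonaws.com/nodegroup` label on managed nodes to the node group name, which may carry a generated suffix depending on how the Terraform module names it.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: trino-worker-example      # hypothetical name
  namespace: matih-example        # hypothetical namespace
spec:
  # Attract the pod to the compute nodes.
  # Assumption: the node group name (and therefore this label value) is "compute".
  nodeSelector:
    eks.amazonaws.com/nodegroup: compute
  # Permit scheduling onto nodes tainted matih.ai/compute=true:NoSchedule.
  tolerations:
    - key: matih.ai/compute
      operator: Equal
      value: "true"
      effect: NoSchedule
  containers:
    - name: trino-worker
      image: trinodb/trino:latest   # placeholder image and tag
```

The toleration alone does not pin the pod to the compute group; it only removes the scheduling barrier. The node selector (or an equivalent node affinity rule) is what keeps the pod off the untainted system nodes.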
IAM Roles for Service Accounts (IRSA)
IRSA enables pods to assume IAM roles via annotated service accounts:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: external-secrets
  namespace: external-secrets
  annotations:
    # Placeholder account ID; AWS account IDs are 12 digits
    eks.amazonaws.com/role-arn: "arn:aws:iam::123456789012:role/matih-external-secrets"
The following services use IRSA:
| Service | IAM Role | Purpose |
|---|---|---|
| external-secrets | matih-external-secrets | AWS Secrets Manager access |
| cert-manager | matih-cert-manager | Route53 DNS validation |
| ai-service | matih-ai-bedrock | AWS Bedrock LLM inference |
| data-plane-agent | matih-s3-access | S3 data lake access |
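
To pick up one of these roles, a pod references the annotated service account by name. The sketch below illustrates this for the ai-service row under stated assumptions: the service account name, namespace, pod name, and image are hypothetical, and `<ACCOUNT_ID>` stands in for the real 12-digit AWS account ID.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ai-service                # assumption: named after the service
  namespace: matih-example        # hypothetical namespace
  annotations:
    eks.amazonaws.com/role-arn: "arn:aws:iam::<ACCOUNT_ID>:role/matih-ai-bedrock"
---
apiVersion: v1
kind: Pod
metadata:
  name: ai-service-example        # hypothetical name
  namespace: matih-example
spec:
  # Binding the pod to the annotated service account is what grants the role.
  serviceAccountName: ai-service
  containers:
    - name: ai-service
      image: example.com/matih/ai-service:latest   # placeholder image
```

When such a pod is created, EKS injects the `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE` environment variables along with a projected service account token, and the AWS SDKs use them to obtain temporary credentials for the role. The IAM role's trust policy must also allow this specific namespace and service account, which is configured on the AWS side rather than in the manifest.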
VPC CNI Configuration
EKS uses the Amazon VPC CNI plugin with the following settings:
| Setting | Value |
|---|---|
| Network plugin | amazon-vpc-cni |
| Pod networking | Native VPC IP allocation |
| Network policy | Calico (add-on) |
| Service CIDR | 172.20.0.0/16 |
| Max pods per node | Instance-dependent (set by the instance type's ENI and per-ENI IP limits) |
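
Because Calico enforces standard Kubernetes NetworkPolicy objects, traffic rules can be written with the upstream API. As an illustration only (the namespace is hypothetical and this policy is not taken from the MATIH charts), a default-deny ingress policy looks roughly like this:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: matih-example    # hypothetical namespace
spec:
  podSelector: {}             # empty selector: applies to every pod in the namespace
  policyTypes:
    - Ingress                 # no ingress rules are listed, so all inbound traffic is denied
```

Additional policies in the same namespace can then allow specific sources, since NetworkPolicy rules are additive.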
Storage Classes
EKS provides EBS-backed storage classes via the EBS CSI driver (the aws-ebs-csi-driver add-on):
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  fsType: ext4
  iops: "3000"
  throughput: "125"
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
| Storage Class | EBS Type | Use Case |
|---|---|---|
| gp3 | General Purpose SSD | Default for most workloads |
| io2 | Provisioned IOPS SSD | PostgreSQL, ClickHouse |
| st1 | Throughput Optimized HDD | Kafka log segments, archival |
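
Workloads request volumes from these classes through a PersistentVolumeClaim. A minimal sketch against the gp3 class follows; the claim name, namespace, and size are hypothetical.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-data            # hypothetical name
  namespace: matih-example      # hypothetical namespace
spec:
  accessModes:
    - ReadWriteOnce             # an EBS volume attaches to one node at a time
  storageClassName: gp3         # must match the StorageClass metadata.name
  resources:
    requests:
      storage: 50Gi             # hypothetical size
```

With `volumeBindingMode: WaitForFirstConsumer`, the claim stays Pending until a pod that uses it is scheduled, so the EBS volume is created in the same Availability Zone as that pod. Because the class sets `allowVolumeExpansion: true`, the volume can later be grown by raising `spec.resources.requests.storage` on the claim.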
ECR Integration
For EKS deployments, images are stored in Amazon Elastic Container Registry (ECR):
global:
  # Placeholder account ID; AWS account IDs are 12 digits
  imageRegistry: 123456789012.dkr.ecr.us-west-2.amazonaws.com/matih
  imagePullSecrets:
    - name: ecr-secret
The kubelet automatically refreshes ECR credentials via the ecr-credential-helper.