Trino
Trino v458 is the primary federated SQL engine for MATIH. It connects to Iceberg tables through the Polaris REST catalog, to Delta Lake tables through Hive Metastore, and to PostgreSQL for metadata queries.
Architecture
+----------------+     +------------------+
|     Trino      |     |  Trino Workers   |
|  Coordinator   |---->| (2-10 replicas)  |
|  (1 replica)   |     |    Autoscaled    |
+----------------+     +------------------+
        |                       |
        v                       v
+----------------+     +------------------+
|  Polaris REST  |     |    MinIO / S3    |
|    Catalog     |     | (Object Storage) |
+----------------+     +------------------+
Resource Configuration
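The values in this section pair a 10G JVM heap with a 14Gi container limit and a 12GB per-node query memory cap, so a quick cross-check of such numbers before deployment may be useful. The sketch below is an illustration only and is not part of the chart: `to_bytes` and `check_memory` are hypothetical helpers, every size suffix is assumed to be a binary unit, and the 30% heap headroom is assumed to match Trino's default for `memory.heap-headroom-per-node`. It also assumes that `maxHeap` and `queryMaxMemoryPerNode` apply to the same JVM, which the values file does not state.

```python
import re

# Assumption: all suffixes are binary (1 G = 2**30 bytes), as for JVM -Xmx,
# Kubernetes "Gi", and Trino data sizes.
_UNITS = {
    "M": 2**20, "MB": 2**20, "Mi": 2**20,
    "G": 2**30, "GB": 2**30, "Gi": 2**30,
}


def to_bytes(size: str) -> int:
    """Convert strings such as "10G", "12GB" or "14Gi" to bytes."""
    match = re.fullmatch(r"(\d+)([A-Za-z]+)", size.strip())
    if match is None or match.group(2) not in _UNITS:
        raise ValueError(f"unrecognised size: {size!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def check_memory(max_heap, per_node, container_limit, headroom_fraction=0.3):
    """Return a list of warnings; an empty list means no problem was found."""
    heap = to_bytes(max_heap)
    problems = []
    # The JVM heap has to fit inside the container memory limit.
    if heap > to_bytes(container_limit):
        problems.append("heap exceeds container memory limit")
    # Assumed rule: per-node query memory plus heap headroom must stay
    # below the maximum heap size.
    if to_bytes(per_node) + int(heap * headroom_fraction) >= heap:
        problems.append("queryMaxMemoryPerNode plus headroom does not fit in the heap")
    return problems


if __name__ == "__main__":
    for problem in check_memory("10G", "12GB", "14Gi"):
        print("WARNING:", problem)
```

Under these assumptions a 12GB per-node cap would not fit in a 10G heap, so it may be worth confirming which heap size the per-node limit is actually paired with.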
# From infrastructure/helm/trino/values.yaml
coordinator:
  resources:
    requests:
      memory: "10Gi"
      cpu: "5"
    limits:
      memory: "14Gi"
      cpu: "7"
  jvm:
    maxHeap: "10G"
  config:
    queryMaxMemory: "80GB"
    queryMaxMemoryPerNode: "12GB"
worker:
  resources:
    requests:
      memory: "10Gi"
      cpu: "5"
    limits:
      memory: "14Gi"
      cpu: "7"
  autoscaling:
    enabled: true
    minReplicas: 2
    maxReplicas: 10
    targetCPUUtilizationPercentage: 70
Catalog Configuration
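Each catalog block in this section presumably ends up as a Trino catalog properties file. As a rough sketch of that mapping, the functions below turn the Iceberg `rest` values and the Delta Lake metastore URI into connector properties. The property names follow the Trino Iceberg and Delta Lake connector settings as commonly documented. The function names, the nesting of the values, and the way the chart really renders them are assumptions, and the OAuth2 credential stored in the Kubernetes secret is deliberately left out.

```python
def iceberg_rest_properties(rest: dict) -> dict:
    """Map the assumed `rest` values block to Iceberg connector properties."""
    props = {
        "connector.name": "iceberg",
        "iceberg.catalog.type": "rest",
        "iceberg.rest-catalog.uri": rest["uri"],
        "iceberg.rest-catalog.warehouse": rest["warehouse"],
    }
    if rest.get("vendedCredentialsEnabled"):
        props["iceberg.rest-catalog.vended-credentials-enabled"] = "true"
    oauth2 = rest.get("oauth2", {})
    if oauth2.get("enabled"):
        props["iceberg.rest-catalog.security"] = "OAUTH2"
        props["iceberg.rest-catalog.oauth2.server-uri"] = oauth2["tokenEndpoint"]
        # The client credential would come from the referenced secret;
        # it is intentionally not handled in this sketch.
    return props


def delta_properties(hive_metastore_uri: str) -> dict:
    """Map the Delta Lake values to connector properties."""
    return {
        "connector.name": "delta_lake",
        "hive.metastore.uri": hive_metastore_uri,
    }


def render(props: dict) -> str:
    """Render properties in key=value form, one per line."""
    return "\n".join(f"{key}={value}" for key, value in props.items())


if __name__ == "__main__":
    rest_values = {
        "uri": "http://polaris:8181/api/catalog",
        "warehouse": "matih",
        "vendedCredentialsEnabled": True,
        "oauth2": {
            "enabled": True,
            "tokenEndpoint": "http://polaris:8181/api/catalog/v1/oauth/tokens",
        },
    }
    print(render(iceberg_rest_properties(rest_values)))
    print()
    print(render(delta_properties("thrift://hive-metastore:9083")))
```

Keeping the mapping in one small function makes it easier to compare what the values promise with what the connector would actually receive.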
Iceberg via Polaris REST Catalog
catalogs:
  iceberg:
    enabled: true
    connectorName: iceberg
    catalogType: rest
    rest:
      uri: "http://polaris:8181/api/catalog"
      warehouse: "matih"
      vendedCredentialsEnabled: true
      oauth2:
        enabled: true
        tokenEndpoint: "http://polaris:8181/api/catalog/v1/oauth/tokens"
        existingSecret: "polaris-trino-credentials"
Delta Lake via Hive Metastore
catalogs:
  delta:
    enabled: true
    connectorName: delta_lake
    hiveMetastoreUri: "thrift://hive-metastore:9083"
Authentication
Trino is configured with two authentication mechanisms, file-based passwords for service accounts and OAuth2/JWT for user access:
authentication:
  enabled: true
  type: PASSWORD,OAUTH2
  passwordAuthenticator:
    name: file
    existingSecret: "trino-password-db"
  oauth2:
    enabled: true
    jwksUrl: "http://iam-service.matih-control-plane.svc.cluster.local:8081/.well-known/jwks.json"
    issuer: "matih-iam-service"
Resource Groups for Tenant Isolation
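The configuration in this section routes each query to a per-tenant group whose name is built from the `matih_tenant_id` session value, with `default` as the fallback. A minimal sketch of that naming rule, and of how the two scheduling weights compare, might look as follows. The helper names and the tenant id `acme` are hypothetical, the fallback is assumed to behave like a shell-style `:-` default (used when the value is unset or empty), and the share calculation only applies if the parent group uses a weighted scheduling policy, which the values do not show.

```python
def resolve_group(session: dict) -> str:
    """Build the tenant group name, falling back to "default"."""
    tenant = session.get("matih_tenant_id") or "default"
    return f"tenant-{tenant}"


def weight_shares(weights: dict) -> dict:
    """Turn scheduling weights into relative shares that sum to 1."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


if __name__ == "__main__":
    print(resolve_group({"matih_tenant_id": "acme"}))
    print(resolve_group({}))
    for name, share in weight_shares({"interactive": 10, "batch": 3}).items():
        print(f"{name}: {share:.0%}")
```

With weights of 10 and 3, interactive queries would be picked roughly three times as often as batch queries when both subgroups have work queued.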
resourceGroups:
  enabled: true
  selectors:
    - group: "tenant-${SESSION.matih_tenant_id:-default}"
      session.matih_tenant_id: ".*"
  rootGroups:
    - name: "tenant-*"
      softMemoryLimit: "60%"
      hardConcurrencyLimit: 100
      subGroups:
        - name: interactive
          softMemoryLimit: "40%"
          schedulingWeight: 10
        - name: batch
          softMemoryLimit: "30%"
          schedulingWeight: 3
Scheduling
Trino runs on dedicated compute nodes:
coordinator:
  nodeSelector:
    agentpool: compute
  tolerations:
    - key: "matih.ai/compute"
      operator: "Equal"
      value: "true"
      effect: "NoSchedule"
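The toleration above only has an effect if the compute nodes carry a matching taint. The sketch below walks through the standard Kubernetes matching rule for a single toleration and taint. The `tolerates` helper is hypothetical, and the taint `matih.ai/compute=true:NoSchedule` is an assumption inferred from the toleration, since the node configuration is not shown here.

```python
def tolerates(toleration: dict, taint: dict) -> bool:
    """Return True if the toleration matches the taint."""
    # An empty effect on the toleration matches every taint effect.
    effect = toleration.get("effect", "")
    if effect and effect != taint["effect"]:
        return False
    operator = toleration.get("operator", "Equal")
    key = toleration.get("key", "")
    if operator == "Exists":
        # Exists ignores the value; an empty key tolerates every taint.
        return key == "" or key == taint["key"]
    # Equal: both key and value must match.
    return key == taint["key"] and toleration.get("value", "") == taint.get("value", "")


if __name__ == "__main__":
    toleration = {
        "key": "matih.ai/compute",
        "operator": "Equal",
        "value": "true",
        "effect": "NoSchedule",
    }
    # Assumed taint on the dedicated compute node pool.
    taint = {"key": "matih.ai/compute", "value": "true", "effect": "NoSchedule"}
    print(tolerates(toleration, taint))
```

A toleration only permits scheduling onto tainted nodes. It is the `nodeSelector` on `agentpool: compute` that actually steers the pod to that pool.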