MATIH Platform is in active MVP development. Documentation reflects current implementation status.

AI Service Chart

The AI Service is the core intelligence engine of MATIH, providing natural language to SQL conversion, conversational analytics, LLM orchestration, and agent-based workflows. It is the most complex service chart with GPU support, multi-provider LLM configuration, and extensive infrastructure connections.


Chart Configuration

# From infrastructure/helm/ai-service/values.yaml
billing:
  costCenter: "CC-ML"
  application: "data-plane"
  team: "ml-engineering"
  workloadType: "api"
  service: "ai-service"
 
replicaCount: 2
 
image:
  registry: matihlabsacr.azurecr.io
  repository: matih/ai-service
  tag: ""
  pullPolicy: Always
 
service:
  type: ClusterIP
  port: 8000
  targetPort: 8000
 
resources:
  requests:
    cpu: 500m
    memory: 1Gi
  limits:
    cpu: 2000m
    memory: 4Gi
    nvidia.com/gpu: 0  # Set to 1 for GPU inference
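
To enable GPU inference, a values override can request a GPU and steer the pod onto a GPU node pool. A minimal sketch; the node label and toleration below are assumptions and depend on how the cluster's GPU pool is labeled and tainted:

```yaml
# values-gpu.yaml -- hypothetical override for GPU inference
resources:
  limits:
    cpu: 2000m
    memory: 4Gi
    nvidia.com/gpu: 1          # extended resource; request defaults to the limit
nodeSelector:
  accelerator: nvidia          # assumed label on the GPU node pool
tolerations:
  - key: "nvidia.com/gpu"      # assumed taint on GPU nodes
    operator: "Exists"
    effect: "NoSchedule"
```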

Autoscaling Configuration

The AI service uses a conservative HPA profile with custom Prometheus metrics:

autoscaling:
  enabled: true
  profile: conservative
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 60
  targetMemoryUtilizationPercentage: 70
  prometheusMetrics:
    - name: ai_service_inference_requests_per_second
      targetAverageValue: "20"
    - name: ai_service_inference_latency_seconds_p95
      targetAverageValue: "2"
    - name: ai_service_active_requests
      targetAverageValue: "15"
    - name: ai_service_llm_token_usage_rate
      targetAverageValue: "5000"
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Pods
          value: 2
          periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 10
          periodSeconds: 120
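
Pod-level Prometheus metrics are not visible to the HPA out of the box; an adapter such as prometheus-adapter must expose them through the custom metrics API. A sketch of one adapter rule, assuming the metric is exported with `namespace` and `pod` labels:

```yaml
# prometheus-adapter rule sketch -- label names are assumptions about
# how ai-service exports its metrics
rules:
  - seriesQuery: 'ai_service_inference_requests_per_second{namespace!="",pod!=""}'
    resources:
      overrides:
        namespace: {resource: "namespace"}
        pod: {resource: "pod"}
    name:
      as: "ai_service_inference_requests_per_second"
    metricsQuery: 'avg(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)'
```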

LLM Provider Configuration

The AI service supports multiple LLM providers with cloud-native authentication:

providers:
  azure:
    enabled: true
    deploymentName: "gpt-4o"
    deploymentMini: "gpt-4o-mini"
    deploymentEmbedding: "text-embedding-3-large"
  openai:
    enabled: true
    defaultModel: "gpt-4-turbo-preview"
  anthropic:
    enabled: true
    defaultModel: "claude-3-5-sonnet-20241022"
  vertexai:
    enabled: false
    useWorkloadIdentity: true
  bedrock:
    enabled: false
    useIRSA: true
  vllm:
    enabled: false
    baseUrl: "http://vllm:8000"
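
When `vertexai.useWorkloadIdentity` or `bedrock.useIRSA` is set, no API key is needed; the pod's ServiceAccount carries the cloud identity instead. A hedged sketch of the corresponding ServiceAccount annotations (the project, account, and role ARN below are placeholders, not values from this chart):

```yaml
serviceAccount:
  create: true
  annotations:
    # GKE Workload Identity for Vertex AI -- hypothetical GCP service account
    iam.gke.io/gcp-service-account: "ai-service@my-project.iam.gserviceaccount.com"
    # EKS IRSA for Bedrock -- hypothetical IAM role ARN
    eks.amazonaws.com/role-arn: "arn:aws:iam::123456789012:role/ai-service"
```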

API keys are sourced from Kubernetes secrets, never hardcoded:

- name: OPENAI_API_KEY
  valueFrom:
    secretKeyRef:
      name: ai-service-secrets
      key: openai-api-key
- name: AZURE_OPENAI_API_KEY
  valueFrom:
    secretKeyRef:
      name: ai-service-secrets
      key: azure-api-key
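
The referenced Secret is created out-of-band (manually or via a secrets operator) rather than committed to values. A plain manifest sketch with placeholder values, assuming the data-plane namespace:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: ai-service-secrets
  namespace: matih-data-plane
type: Opaque
stringData:
  openai-api-key: "<openai-key>"   # placeholders -- never commit real keys
  azure-api-key: "<azure-key>"
```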

Infrastructure Connections

The AI service connects to numerous infrastructure components:

| Component | Host (FQDN) | Port | Purpose |
| --- | --- | --- | --- |
| PostgreSQL | postgresql.matih-data-plane.svc | 5432 | Persistent storage |
| Redis | redis-master.matih-data-plane.svc | 6379 | Cache, sessions |
| Kafka | strimzi-kafka-kafka-bootstrap...svc | 9093 | Event streaming (TLS) |
| Qdrant | qdrant.matih-data-plane.svc | 6333 | Vector embeddings |
| Trino | trino.matih-data-plane.svc | 8080 | SQL execution |
| ClickHouse | clickhouse.matih-data-plane.svc | 8123 | OLAP queries |
| Spark Connect | spark-connect.matih-data-plane.svc | 15002 | Complex analytics |
| Polaris | polaris.matih-data-plane.svc | 8181 | Iceberg catalog |
| OpenMetadata | openmetadata.matih-data-plane.svc | 8585 | Data catalog |
| Query Engine | query-engine.matih-data-plane.svc | 8080 | SQL routing |
| Semantic Layer | semantic-layer.matih-data-plane.svc | 8086 | Semantic models |
| IAM Service | iam-service.matih-control-plane.svc | 8081 | Auth (cross-namespace) |
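
If the cluster enforces NetworkPolicies, the cross-namespace call to the IAM service needs an explicit egress rule. A sketch, assuming the chart's pods carry an `app.kubernetes.io/name: ai-service` label and that namespaces have the automatic `kubernetes.io/metadata.name` label (Kubernetes ≥ 1.21):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: ai-service-egress-iam
  namespace: matih-data-plane
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: ai-service   # assumed pod label from the chart
  policyTypes: ["Egress"]
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: matih-control-plane
      ports:
        - protocol: TCP
          port: 8081
```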

Kafka Topics

The AI service creates Strimzi KafkaTopic custom resources for domain events:

kafkaTopics:
  enabled: true
  clusterName: "strimzi-kafka"
  topics:
    stateChanges:
      name: "matih.ai.state-changes"
      partitions: 12
      retentionMs: "2592000000"  # 30 days
    agentTraces:
      name: "matih.ai.agent-traces"
      partitions: 12
    evaluations:
      name: "matih.ai.evaluations"
      partitions: 6
      retentionMs: "7776000000"  # 90 days
    llmOps:
      name: "matih.ai.llm-ops"
      partitions: 12
      retentionMs: "604800000"   # 7 days
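
The values above render into one Strimzi KafkaTopic resource per entry, roughly as follows (the `strimzi.io/cluster` label must match the Kafka cluster name; replication settings are left to the operator defaults in this sketch):

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: matih.ai.state-changes
  labels:
    strimzi.io/cluster: strimzi-kafka   # must match the Kafka CR name
spec:
  partitions: 12
  config:
    retention.ms: 2592000000   # 30 days
```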

VPA Configuration

vpa:
  enabled: true
  updateMode: "Off"  # Recommendations only - never auto-restart AI services
  minAllowed:
    cpu: "500m"
    memory: "2Gi"
  maxAllowed:
    cpu: "8"
    memory: "32Gi"
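
Rendered, this produces a VerticalPodAutoscaler in recommendation-only mode, roughly as below (the target Deployment name is an assumption about the chart's naming):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: ai-service
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ai-service           # assumed release/deployment name
  updatePolicy:
    updateMode: "Off"          # recommendations only; pods are never evicted
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        minAllowed:
          cpu: "500m"
          memory: "2Gi"
        maxAllowed:
          cpu: "8"
          memory: "32Gi"
```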

Init Container: Database Migration

When using the PostgreSQL backend, an Alembic migration init container runs before the main application starts:

initContainers:
  - name: alembic-migrate
    image: "{{ image }}"
    command: ["python", "-m", "alembic", "upgrade", "head"]
    env:
      - name: DATABASE_URL
        value: "postgresql+asyncpg://..."
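
In practice the connection string would come from a Secret rather than being inlined in values. A hedged variant of the same init container (the secret name and key below are assumptions):

```yaml
initContainers:
  - name: alembic-migrate
    image: "{{ image }}"
    command: ["python", "-m", "alembic", "upgrade", "head"]
    env:
      - name: DATABASE_URL
        valueFrom:
          secretKeyRef:
            name: ai-service-secrets   # assumed secret name
            key: database-url          # assumed key
```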

Module Feature Flags

The AI service supports modular deployment with feature flags:

modules:
  core: true           # Agents, LLM, guardrails
  biPlatform: true     # BI analytics
  mlPlatform: true     # ML training/serving
  dataPlatform: true   # dbt, quality, pipeline
  contextGraph: true   # Context graph, ontology
  enterprise: true     # Security, performance
  supplementary: true  # FDME, search, DNN builder
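
Because the modules are independent flags, a slimmer deployment can switch off unused subsystems with a values override. For example, a hypothetical BI-only profile:

```yaml
# values-bi-only.yaml -- hypothetical profile keeping only core + BI
modules:
  core: true
  biPlatform: true
  mlPlatform: false
  dataPlatform: false
  contextGraph: false
  enterprise: false
  supplementary: false
```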