MATIH Platform is in active MVP development. Documentation reflects current implementation status.

Scaling Procedures

This runbook covers horizontal and vertical scaling of MATIH services in response to increased load, resource alerts, or tenant growth.


Symptoms

  • HighCPUUsage or HighMemoryUsage alerts
  • Elevated request latency
  • Request queue depth increasing
  • New tenant onboarding requiring additional capacity

Horizontal Scaling (Replicas)

Scale a Service

Horizontal scaling is managed through Helm values. Update the replica count in the appropriate values file and redeploy:

./scripts/tools/service-build-deploy.sh <service-name>
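
After the redeploy it can help to wait for the rollout to finish before checking anything else. The following is a minimal sketch, not part of the MATIH tooling: the function name `wait_for_rollout`, the namespace argument, and the 300-second timeout are assumptions for illustration, and it presumes the Deployment is named after the service.

```shell
# Hypothetical helper: block until the Deployment's rollout completes.
# Assumes the Deployment name equals the service name and that the
# namespace (second argument) is known; neither is stated in this runbook.
wait_for_rollout() {
  rollout_name="$1"
  rollout_ns="$2"
  # "kubectl rollout status" exits non-zero if the rollout does not
  # complete within the timeout.
  kubectl rollout status "deployment/$rollout_name" -n "$rollout_ns" --timeout=300s
}
```

Example usage, with a placeholder namespace: `wait_for_rollout ai-service <namespace>`.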

Recommended Replica Counts

Service          Dev   Staging   Production
AI Service       1     2         3-5
Query Engine     1     2         3-5
API Gateway      1     2         3
IAM Service      1     2         2
Tenant Service   1     1         2

Horizontal Pod Autoscaler

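In production, configure a HorizontalPodAutoscaler (HPA) so that replica counts adjust automatically. The example below keeps the AI Service between 2 and 10 replicas and targets 70% average CPU and 80% average memory utilization, both measured relative to the pods' resource requests: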

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ai-service-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ai-service
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
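
To reason about what the autoscaler will do with these targets, the core Kubernetes formula is desired = ceil(current replicas × current utilization / target utilization), clamped to minReplicas and maxReplicas. The sketch below reproduces only that calculation; the function name `hpa_desired_replicas` is hypothetical, and the real controller additionally applies a tolerance band (10% by default) and, when several metrics are configured, uses the largest result.

```shell
# Hypothetical helper: approximate the HPA replica calculation for one metric.
# Args: current_replicas current_utilization% target_utilization% min max
hpa_desired_replicas() {
  hpa_current="$1"
  hpa_util="$2"
  hpa_target="$3"
  hpa_min="$4"
  hpa_max="$5"
  # Integer ceiling of (current * utilization / target).
  hpa_desired=$(( (hpa_current * hpa_util + hpa_target - 1) / hpa_target ))
  # Clamp to the configured bounds.
  if [ "$hpa_desired" -lt "$hpa_min" ]; then hpa_desired="$hpa_min"; fi
  if [ "$hpa_desired" -gt "$hpa_max" ]; then hpa_desired="$hpa_max"; fi
  echo "$hpa_desired"
}
```

For example, `hpa_desired_replicas 3 90 70 2 10` models three replicas running at 90% CPU against the 70% target above and yields 4.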

Vertical Scaling (Resources)

Increase Resource Limits

Update resource requests and limits in the Helm values:

resources:
  requests:
    cpu: 500m
    memory: 1Gi
  limits:
    cpu: 2000m
    memory: 4Gi

Then redeploy:

./scripts/tools/service-build-deploy.sh <service-name>
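
To confirm that the new requests and limits actually reached the cluster, the Deployment spec can be inspected directly. This is a sketch under assumptions: the helper name `show_resources` and the namespace argument are hypothetical, and it reads only the first container in the pod template.

```shell
# Hypothetical helper: print the resources block of the first container
# in a Deployment's pod template. Assumes the Deployment name equals
# the service name.
show_resources() {
  res_name="$1"
  res_ns="$2"
  kubectl get deployment "$res_name" -n "$res_ns" \
    -o jsonpath='{.spec.template.spec.containers[0].resources}'
}
```

Example usage, with a placeholder namespace: `show_resources ai-service <namespace>`.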

Resource Guidelines

Service        CPU Request   CPU Limit   Memory Request   Memory Limit
AI Service     500m          2000m       1Gi              4Gi
Query Engine   250m          1000m       512Mi            2Gi
API Gateway    100m          500m        256Mi            1Gi

Database Scaling

PostgreSQL

  • Vertical: Increase pod resource limits
  • Read replicas: Add read replicas for read-heavy workloads
  • Connection pooling: Deploy PgBouncer for connection management
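
Horizontal scaling of services multiplies the number of database connections, which is often what makes pooling necessary. The sketch below is a back-of-the-envelope check only: the function name `db_connection_budget` and every number passed to it are illustrative assumptions, not MATIH settings.

```shell
# Hypothetical helper: compare the connections a service could open
# (replicas * per-replica pool size) with PostgreSQL's max_connections.
# Args: replicas pool_size_per_replica max_connections
db_connection_budget() {
  db_replicas="$1"
  db_pool="$2"
  db_max="$3"
  db_needed=$(( db_replicas * db_pool ))
  if [ "$db_needed" -gt "$db_max" ]; then
    echo "over: $db_needed connections needed, $db_max available"
    return 1
  fi
  echo "ok: $db_needed of $db_max connections"
}
```

For example, `db_connection_budget 10 20 100` reports an overrun, which suggests either a smaller per-replica pool or a pooler such as PgBouncer in front of the database.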

Redis

  • Vertical: Increase memory limits
  • Clustering: Enable Redis Cluster for horizontal scaling
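
Before raising Redis memory limits, it is worth checking how close the instance is to its configured `maxmemory`. The sketch below parses the output of `redis-cli INFO memory` from standard input; the function name `redis_memory_pct` is hypothetical, and connection options (host, port, authentication) are omitted because this runbook does not specify them.

```shell
# Hypothetical helper: read "INFO memory" output on stdin and print
# used_memory as a whole-number percentage of maxmemory.
# Prints "no-maxmemory" when maxmemory is 0 (no limit configured).
redis_memory_pct() {
  awk -F: '
    { gsub(/\r/, "") }                  # INFO lines end with CRLF
    $1 == "used_memory" { used = $2 }
    $1 == "maxmemory"   { max = $2 }
    END {
      if (max + 0 == 0) { print "no-maxmemory"; exit }
      printf "%d\n", (used * 100) / max
    }'
}
```

Example usage: `redis-cli INFO memory | redis_memory_pct`.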

Verification

After scaling:

  1. Run the platform status and health checks (commands below).
  2. Verify that all new replicas are healthy.
  3. Monitor resource utilization to confirm that scaling resolved the issue.
  4. Check that response latencies have improved.

./scripts/tools/platform-status.sh
./scripts/disaster-recovery/health-check.sh
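
The replica health check can also be done directly against the cluster by comparing ready replicas with the desired count. This is a minimal sketch: the function name `check_replicas` and the namespace argument are assumptions, and it presumes the Deployment is named after the service.

```shell
# Hypothetical helper: succeed only if every desired replica is ready.
check_replicas() {
  chk_name="$1"
  chk_ns="$2"
  chk_desired=$(kubectl get deployment "$chk_name" -n "$chk_ns" \
    -o jsonpath='{.spec.replicas}')
  chk_ready=$(kubectl get deployment "$chk_name" -n "$chk_ns" \
    -o jsonpath='{.status.readyReplicas}')
  # readyReplicas is absent from the status when no pod is ready yet.
  chk_ready="${chk_ready:-0}"
  if [ "$chk_ready" -eq "$chk_desired" ]; then
    echo "$chk_name: $chk_ready/$chk_desired replicas ready"
  else
    echo "$chk_name: only $chk_ready/$chk_desired replicas ready" >&2
    return 1
  fi
}
```

Example usage, with a placeholder namespace: `check_replicas ai-service <namespace>`.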