Scaling Procedures

This runbook covers horizontal and vertical scaling of MATIH services in response to increased load, resource alerts, or tenant growth.

Symptoms

HighCPUUsage or HighMemoryUsage alerts
Elevated request latency
Request queue depth increasing
New tenant onboarding requiring additional capacity

Horizontal Scaling (Replicas)

Scale a Service

Horizontal scaling is managed through Helm values. Update the replica count in the appropriate values file and redeploy:

./scripts/tools/service-build-deploy.sh <service-name>

Recommended Replica Counts

Service	Dev	Staging	Production
AI Service	1	2	3-5
Query Engine	1	2	3-5
API Gateway	1	2	3
IAM Service	1	2	2
Tenant Service	1	1	2
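The table above can be encoded as a small lookup so that a proposed replica count is checked before a values file is edited. This is a minimal sketch, not part of the MATIH tooling: the function names and the service keys other than ai-service are hypothetical, and the environment names are assumed to be dev, staging, and production.

```shell
# Print the recommended replica count (or range) for a service and environment.
# Values are copied from the table above; service keys are hypothetical.
recommended_replicas() {
  service=$1
  env=$2
  case "$service:$env" in
    ai-service:dev|query-engine:dev|api-gateway:dev|iam-service:dev|tenant-service:dev) echo 1 ;;
    ai-service:staging|query-engine:staging|api-gateway:staging|iam-service:staging) echo 2 ;;
    tenant-service:staging) echo 1 ;;
    ai-service:production|query-engine:production) echo 3-5 ;;
    api-gateway:production) echo 3 ;;
    iam-service:production|tenant-service:production) echo 2 ;;
    *) echo "unknown service or environment: $service $env" >&2; return 1 ;;
  esac
}

# Succeed when a proposed replica count equals the recommended value
# or falls inside the recommended range.
within_recommendation() {
  range=$(recommended_replicas "$1" "$2") || return 1
  min=${range%-*}   # "3-5" -> 3, "3" -> 3
  max=${range#*-}   # "3-5" -> 5, "3" -> 3
  [ "$3" -ge "$min" ] && [ "$3" -le "$max" ]
}
```

For example, `within_recommendation ai-service production 4` succeeds, while `within_recommendation api-gateway production 5` fails because the table lists exactly 3.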

Horizontal Pod Autoscaler

In production, configure a HorizontalPodAutoscaler (HPA) so that the replica count scales automatically with CPU and memory utilization:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ai-service-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ai-service
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
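To estimate how the HPA above would react to a given load, the core Kubernetes formula is desired = ceil(currentReplicas * currentUtilization / targetUtilization), clamped to minReplicas and maxReplicas. When several metrics are configured, the largest result is used. The sketch below is a simplification with a hypothetical function name: it ignores the controller's tolerance band and stabilization window, so treat its output as an approximation.

```shell
# Approximate the replica count an HPA would request for one metric.
# Arguments: current replicas, current utilization (%), target utilization (%),
# minReplicas, maxReplicas.
hpa_desired_replicas() {
  current=$1
  util=$2
  target=$3
  min=$4
  max=$5
  # Integer ceiling of current * util / target
  desired=$(( (current * util + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

# 3 replicas at 95% CPU against the 70% target, bounds 2..10
hpa_desired_replicas 3 95 70 2 10   # prints 5
# 3 replicas at 60% memory against the 80% target, bounds 2..10
hpa_desired_replicas 3 60 80 2 10   # prints 3
```

With both metrics configured as above, the HPA would follow the larger value, so this load would scale ai-service to 5 replicas.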

Vertical Scaling (Resources)

Increase Resource Limits

Update resource requests and limits in the Helm values:

resources:
  requests:
    cpu: 500m
    memory: 1Gi
  limits:
    cpu: 2000m
    memory: 4Gi

Then redeploy:

./scripts/tools/service-build-deploy.sh <service-name>

Resource Guidelines

Service	CPU Request	CPU Limit	Memory Request	Memory Limit
AI Service	500m	2000m	1Gi	4Gi
Query Engine	250m	1000m	512Mi	2Gi
API Gateway	100m	500m	256Mi	1Gi
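Because requests are reserved per pod, raising the replica count multiplies the capacity the cluster must provide. The sketch below totals the requests from the table for a given number of replicas. The function names are hypothetical, and the parsing covers only the quantity forms used in this runbook (whole cores or millicores such as 500m, and Mi or Gi memory).

```shell
# Convert a CPU quantity such as 500m or 2 into millicores.
cpu_to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;
    *)  echo $(( $1 * 1000 )) ;;
  esac
}

# Convert a memory quantity such as 512Mi or 4Gi into Mi.
memory_to_mi() {
  case "$1" in
    *Gi) echo $(( ${1%Gi} * 1024 )) ;;
    *Mi) echo "${1%Mi}" ;;
    *)   echo "unsupported memory unit: $1" >&2; return 1 ;;
  esac
}

# Print the total CPU and memory requested by N replicas of one service.
# Arguments: replicas, per-pod CPU request, per-pod memory request.
total_requests() {
  replicas=$1
  cpu=$(cpu_to_millicores "$2") || return 1
  mem=$(memory_to_mi "$3") || return 1
  echo "$(( replicas * cpu ))m $(( replicas * mem ))Mi"
}

# AI Service at 5 replicas, using the requests from the table
total_requests 5 500m 1Gi     # prints 2500m 5120Mi
```

Comparing this total with the allocatable capacity of the node pool before scaling helps avoid new pods stuck in Pending.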

Database Scaling

PostgreSQL

Vertical: Increase pod resource limits
Read replicas: Add read replicas for read-heavy workloads
Connection pooling: Deploy PgBouncer for connection management

Redis

Vertical: Increase memory limits
Clustering: Enable Redis Cluster for horizontal scaling

Verification

After scaling:

Run the platform status check and the health check (commands below)
Verify all new replicas are healthy
Monitor resource utilization to confirm the scaling resolved the issue
Check that response latencies have improved

./scripts/tools/platform-status.sh
./scripts/disaster-recovery/health-check.sh
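The replica check can also be done directly against the cluster. The sketch below assumes kubectl is configured for the target cluster. The function name is hypothetical, and the namespace in the usage comment is a placeholder, since this runbook does not state where the services run.

```shell
# Succeed when a Deployment reports at least the expected number of ready replicas.
# Arguments: deployment name, namespace, expected replica count.
replicas_ready() {
  name=$1
  namespace=$2
  expected=$3
  ready=$(kubectl get deployment "$name" -n "$namespace" \
    -o jsonpath='{.status.readyReplicas}') || return 1
  # The field is absent (empty output) when no replica is ready yet.
  ready=${ready:-0}
  echo "$name: $ready/$expected replicas ready"
  [ "$ready" -ge "$expected" ]
}

# Usage (namespace is a placeholder):
#   replicas_ready ai-service <namespace> 3
```

To block until a rollout completes rather than polling, `kubectl rollout status deployment/<service-name>` is an alternative.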
