MATIH Platform is in active MVP development. Documentation reflects current implementation status.
Stage 15: AI Infrastructure

Stage 15 deploys local AI inference components: vLLM for open-source model serving, Triton Inference Server, and the MATIH Copilot service. Cloud AI providers (Azure OpenAI, AWS Bedrock, GCP Vertex AI) are provisioned on-demand by the TenantService, not at deploy time.

Source file: scripts/stages/15-ai-infrastructure.sh


Components Deployed

| Component | Purpose |
|---|---|
| vLLM | Open-source LLM inference server |
| Triton Inference Server | Multi-framework model serving |
| MATIH Copilot | Code and query assistance service |

CPU Mode

In environments without GPU nodes, the stage automatically enables CPU mode:

| Setting | Default | Description |
|---|---|---|
| FORCE_CPU_MODE | true | Deploy vLLM/Triton on CPU nodes (slower but functional) |
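The GPU detection itself is not shown in this document. A minimal sketch of how such a decision could work, assuming the GPU node count is obtained separately (the `nvidia.com/gpu.present` label is an assumption that depends on the cluster's GPU feature discovery setup):

```shell
#!/usr/bin/env bash
# Sketch: choose FORCE_CPU_MODE from a GPU node count.
# The count would normally come from something like:
#   kubectl get nodes -l nvidia.com/gpu.present=true --no-headers | wc -l
decide_cpu_mode() {
    local gpu_node_count="$1"
    if (( gpu_node_count == 0 )); then
        echo "FORCE_CPU_MODE=true"    # no GPU nodes: slower but functional
    else
        echo "FORCE_CPU_MODE=false"   # GPU nodes available
    fi
}

decide_cpu_mode 0   # FORCE_CPU_MODE=true
decide_cpu_mode 2   # FORCE_CPU_MODE=false
```

`decide_cpu_mode` is a hypothetical helper, not a function from the stage script; it only illustrates the CPU-fallback behavior described above.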

Image Tag Resolution

The stage reads image tags from build metadata, matching the pattern used by Stage 08:

# Priority: metadata JSON > tag file > IMAGE_TAG env > "latest"
if [[ -f "$METADATA_FILE" ]]; then
    IMAGE_TAG=$(jq -r '.imageTag // "latest"' "$METADATA_FILE")
elif [[ -f "$TAG_FILE" ]]; then
    IMAGE_TAG=$(cat "$TAG_FILE")
else
    IMAGE_TAG="${IMAGE_TAG:-latest}"
fi
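As a self-contained sketch, the fallback chain can be wrapped in a function. `resolve_image_tag` is a hypothetical helper, not part of the stage script:

```shell
#!/usr/bin/env bash
# Sketch of the full fallback chain: metadata JSON > tag file > IMAGE_TAG env > "latest".
resolve_image_tag() {
    local metadata_file="$1" tag_file="$2"
    if [[ -f "$metadata_file" ]]; then
        jq -r '.imageTag // "latest"' "$metadata_file"
    elif [[ -f "$tag_file" ]]; then
        cat "$tag_file"
    else
        echo "${IMAGE_TAG:-latest}"
    fi
}

# Demo: only a tag file exists.
tmpdir=$(mktemp -d)
echo "v1.4.2" > "$tmpdir/image-tag"
resolve_image_tag "$tmpdir/missing.json" "$tmpdir/image-tag"   # v1.4.2

# Demo: neither file exists and no env override.
unset IMAGE_TAG
resolve_image_tag "$tmpdir/missing.json" "$tmpdir/missing"     # latest
rm -rf "$tmpdir"
```

The jq expression `.imageTag // "latest"` also covers a metadata file that exists but lacks an `imageTag` key.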

Cloud AI Provisioning

Cloud AI providers are not deployed in this stage. They are provisioned per-tenant by the TenantService:

| Provider | Provisioned By | When |
|---|---|---|
| Azure OpenAI | TenantService / InfrastructureService | Tenant creation |
| AWS Bedrock | TenantService | Tenant creation |
| GCP Vertex AI | TenantService | Tenant creation |

Libraries Used

| Library | Purpose |
|---|---|
| core/config.sh | Terraform output access |
| k8s/namespace.sh | Namespace management |
| k8s/secrets.sh | Secret management |
| helm/deploy.sh | Deployment functions |
| azure/aks.sh | AKS node pool operations |
| acr/deploy.sh | ACR image operations |

Dependencies

  • Requires: 11-compute-engines, 14-ml-infrastructure
  • Required by: 16-data-plane-services

Dependency Verification

Check that the stage's core pods are running:

kubectl get pods -n matih-data-plane -l app=vllm
kubectl get pods -n matih-data-plane -l app=matih-copilot
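To gate a pipeline on these checks, the pod listing can be inspected mechanically. A minimal sketch (`all_running` is a hypothetical helper, not part of the stage script) that treats any non-Running pod, or an empty listing, as a failure:

```shell
#!/usr/bin/env bash
# Sketch: succeed only if every pod line reports STATUS "Running".
# Expects the --no-headers output of `kubectl get pods`, e.g.:
#   vllm-7d9f8b6c4-x2kqp   1/1   Running   0   5m
all_running() {
    local pods="$1" line status
    [[ -n "$pods" ]] || return 1    # no matching pods counts as failure
    while IFS= read -r line; do
        [[ -z "$line" ]] && continue
        status=$(awk '{print $3}' <<<"$line")
        [[ "$status" == "Running" ]] || return 1
    done <<<"$pods"
    return 0
}

# Usage against the cluster (not run here):
#   all_running "$(kubectl get pods -n matih-data-plane -l app=vllm --no-headers)"
```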