MATIH Platform is in active MVP development. Documentation reflects current implementation status.

Service Build and Deploy

For rapid iteration during development and targeted hotfixes in production, MATIH provides scripts that build and deploy individual services without running the full CD pipeline. The two primary scripts, service-build-deploy.sh and full-service-rebuild.sh, handle the complete workflow from source code to a running Kubernetes pod.


Single Service Deployment

service-build-deploy.sh

./scripts/tools/service-build-deploy.sh <service-name> [options]

This script handles the complete lifecycle for a single service:

  1. Build the Docker image from source
  2. Tag the image with version and git SHA
  3. Push the image to ACR
  4. Deploy via Helm upgrade to the correct namespace
  5. Validate the deployment is healthy

Usage Examples

# Build and deploy ai-service
./scripts/tools/service-build-deploy.sh ai-service
 
# Build and deploy with a specific tag
./scripts/tools/service-build-deploy.sh ai-service --tag 1.0.0-hotfix1
 
# Build only (no deploy)
./scripts/tools/service-build-deploy.sh ai-service --build-only
 
# Deploy only (skip build, use existing image)
./scripts/tools/service-build-deploy.sh ai-service --deploy-only --tag 1.0.0-abc1234
 
# Deploy with dev values
./scripts/tools/service-build-deploy.sh ai-service --environment dev

Script Options

| Option | Description | Default |
| --- | --- | --- |
| --tag <tag> | Override image tag | <version>-<git-sha> |
| --build-only | Build and push, do not deploy | false |
| --deploy-only | Deploy existing image, do not build | false |
| --environment <env> | Target environment (dev/staging/prod) | dev |
| --namespace <ns> | Override target namespace | Auto-detected |
| --no-wait | Do not wait for rollout completion | false |
| --dry-run | Show commands without executing | false |
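
The default tag in the table combines a version with a git SHA. The following is a minimal sketch of how such a tag might be composed. The helper name default_image_tag is hypothetical, and the seven-character short SHA is an assumption inferred from the 1.0.0-abc1234 example; the real script may derive both parts differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compose the default image tag <version>-<git-sha>.
# Assumption: the SHA is shortened to 7 characters, as in 1.0.0-abc1234.
default_image_tag() {
  local version="$1"
  local sha="$2"
  local short_sha
  short_sha=$(printf '%s' "$sha" | cut -c1-7)
  printf '%s-%s\n' "$version" "$short_sha"
}

# Example (run inside a git checkout):
#   default_image_tag "1.0.0" "$(git rev-parse HEAD)"
```

Passing the version and SHA as arguments keeps the helper independent of where the version number is stored, which the source does not specify.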

Execution Flow

service-build-deploy.sh ai-service
  |
  +-- 1. Detect service type (java/python/node)
  |     Source: scripts/config/components.yaml
  |
  +-- 2. Determine namespace
  |     ai-service -> matih-data-plane
  |
  +-- 3. Build Docker image
  |     docker build -t matihlabsacr.azurecr.io/matih/ai-service:1.0.0-abc1234
  |       -f data-plane/ai-service/Dockerfile .
  |
  +-- 4. Push to ACR
  |     docker push matihlabsacr.azurecr.io/matih/ai-service:1.0.0-abc1234
  |
  +-- 5. Helm upgrade
  |     helm upgrade --install ai-service
  |       infrastructure/helm/ai-service
  |       -f infrastructure/helm/ai-service/values.yaml
  |       -f infrastructure/helm/ai-service/values-dev.yaml
  |       --set image.tag=1.0.0-abc1234
  |       --namespace matih-data-plane
  |       --timeout 5m --wait
  |
  +-- 6. Validate rollout
        kubectl rollout status deployment/ai-service
          -n matih-data-plane --timeout=300s
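
Each step in the flow above ends in one external command, which makes a --dry-run mode easy to picture. The sketch below is illustrative only and is not taken from service-build-deploy.sh: the helpers image_ref and run and the DRY_RUN variable are hypothetical names, while the registry prefix comes from the flow above.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; not the actual contents of service-build-deploy.sh.

# Registry prefix taken from the execution flow above.
REGISTRY="matihlabsacr.azurecr.io/matih"

# DRY_RUN mirrors the --dry-run option; "false" matches its documented default.
DRY_RUN="${DRY_RUN:-false}"

# Hypothetical helper: build the full image reference for a service and tag.
image_ref() {
  printf '%s/%s:%s\n' "$REGISTRY" "$1" "$2"
}

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "true" ]; then
    printf '[dry-run] %s\n' "$*"
  else
    "$@"
  fi
}

# Example:
#   DRY_RUN=true
#   run docker push "$(image_ref ai-service 1.0.0-abc1234)"
```

Routing every docker, helm, and kubectl call through a single wrapper means the dry-run output and the real execution cannot drift apart.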

Service-to-Namespace Mapping

The script automatically determines the correct namespace based on the service name:

| Service | Namespace | Chart Location |
| --- | --- | --- |
| iam-service | matih-control-plane | infrastructure/helm/iam-service |
| tenant-service | matih-control-plane | infrastructure/helm/tenant-service |
| config-service | matih-control-plane | infrastructure/helm/config-service |
| audit-service | matih-control-plane | infrastructure/helm/audit-service |
| notification-service | matih-control-plane | infrastructure/helm/notification-service |
| ai-service | matih-data-plane | infrastructure/helm/ai-service |
| bi-service | matih-data-plane | infrastructure/helm/bi-service |
| ml-service | matih-data-plane | infrastructure/helm/ml-service |
| query-engine | matih-data-plane | infrastructure/helm/query-engine |
| catalog-service | matih-data-plane | infrastructure/helm/catalog-service |
| pipeline-service | matih-data-plane | infrastructure/helm/pipeline-service |
| semantic-layer | matih-data-plane | infrastructure/helm/semantic-layer |
| render-service | matih-data-plane | infrastructure/helm/render-service |
| data-quality-service | matih-data-plane | infrastructure/helm/data-quality-service |
| bi-workbench | matih-frontend | infrastructure/helm/frontend |
| ml-workbench | matih-frontend | infrastructure/helm/frontend |
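
The mapping above can be expressed as a simple lookup. This is a sketch under assumptions: the function name namespace_for is hypothetical, and the real script may read the mapping from a configuration file rather than hard-coding it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: resolve the namespace for a service name,
# mirroring the Service-to-Namespace table above.
namespace_for() {
  case "$1" in
    iam-service|tenant-service|config-service|audit-service|notification-service)
      echo "matih-control-plane" ;;
    ai-service|bi-service|ml-service|query-engine|catalog-service|\
    pipeline-service|semantic-layer|render-service|data-quality-service)
      echo "matih-data-plane" ;;
    bi-workbench|ml-workbench)
      echo "matih-frontend" ;;
    *)
      # Unknown names fail loudly rather than falling back to a default.
      echo "unknown service: $1" >&2
      return 1 ;;
  esac
}

# Example:
#   namespace_for ai-service    # prints matih-data-plane
```

Failing on unknown names, instead of defaulting to a namespace, avoids deploying a mistyped service into the wrong plane.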

Full Service Rebuild

full-service-rebuild.sh

For situations where a complete rebuild from base images is needed:

./scripts/tools/full-service-rebuild.sh [options]

This script:

  1. Rebuilds base images (optional)
  2. Rebuilds all service images without Docker cache
  3. Pushes all images to ACR
  4. Deploys all services via umbrella charts

Options

| Option | Description |
| --- | --- |
| --include-base | Also rebuild base images before service images |
| --services <list> | Comma-separated list of services to rebuild |
| --environment <env> | Target environment |
| --parallel | Build services in parallel |
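
The --services option takes a comma-separated list. A minimal sketch of turning such a list into one service name per line is shown below; the helper name split_services is hypothetical, and tolerating spaces and empty entries is an assumption about how forgiving the parsing should be, not documented behavior.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split a comma-separated service list into lines.
# Spaces are removed and empty entries (e.g. from ",,") are dropped.
split_services() {
  printf '%s\n' "$1" | tr -d ' ' | tr ',' '\n' | sed '/^$/d'
}

# Example: iterate over the requested services.
#   split_services "ai-service,bi-service" | while IFS= read -r svc; do
#     echo "would rebuild $svc"
#   done
```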

When to Use Full Rebuild

| Scenario | Use |
| --- | --- |
| Base image security update | full-service-rebuild.sh --include-base |
| Dependency version bump | full-service-rebuild.sh |
| Build cache corruption | full-service-rebuild.sh (uses --no-cache) |
| New environment setup | full-service-rebuild.sh --environment staging |

Deployment Strategies

Rolling Update (Default)

All MATIH services use rolling updates by default:

# Deployment strategy in Helm template
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0

This means:

  • At most 1 extra pod is created during update
  • No pods are terminated until new pods are ready
  • Zero-downtime deployment
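
The effect of maxSurge and maxUnavailable can be checked with simple arithmetic. The sketch below uses a hypothetical helper, rollout_bounds, and handles only absolute integer values (Kubernetes also accepts percentages, which are not covered here). The replica count of 3 in the example is illustrative, not a value from the MATIH charts.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compute pod-count bounds during a rolling update.
# Arguments: desired replicas, maxSurge, maxUnavailable (absolute integers).
rollout_bounds() {
  local replicas="$1"
  local max_surge="$2"
  local max_unavailable="$3"
  # Upper bound: desired replicas plus the surge allowance.
  # Lower bound: desired replicas minus the allowed unavailable pods.
  echo "max_pods=$((replicas + max_surge)) min_available=$((replicas - max_unavailable))"
}

# Example with the strategy shown above (maxSurge=1, maxUnavailable=0):
#   rollout_bounds 3 1 0    # prints max_pods=4 min_available=3
```

With maxUnavailable set to 0, the lower bound equals the desired replica count, which is why capacity never drops during the update.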

Blue-Green Deployment (Manual)

For critical services requiring instant rollback:

# Deploy new version alongside existing
helm upgrade --install ai-service-blue \
  infrastructure/helm/ai-service \
  --set image.tag=2.0.0-new \
  --set service.port=8001 \
  --namespace matih-data-plane
 
# Test the blue deployment
curl http://ai-service-blue.matih-data-plane:8001/api/v1/health
 
# Switch traffic (update ingress/service)
# ...
 
# Remove old deployment (assumes the previous version runs as release ai-service-green)
helm uninstall ai-service-green --namespace matih-data-plane

Canary Deployment

For gradual rollout with traffic splitting:

# Deploy canary with reduced replicas
helm upgrade --install ai-service-canary \
  infrastructure/helm/ai-service \
  --set image.tag=2.0.0-canary \
  --set replicaCount=1 \
  --set autoscaling.enabled=false \
  --namespace matih-data-plane
 
# Monitor canary metrics
# If healthy, promote to full deployment
# If unhealthy, remove canary
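
The commands above do not configure explicit traffic weights. If the canary pods sit behind the same Service as the stable pods (an assumption about the chart, not something stated here), the canary's share of traffic is roughly proportional to its share of ready pods. The hypothetical helper below estimates that share.

```shell
#!/usr/bin/env bash
# Hypothetical helper: estimate the canary's traffic share in percent.
# Assumption: stable and canary pods are selected by the same Service
# and requests are spread roughly evenly across ready pods.
canary_share_percent() {
  local stable="$1"
  local canary="$2"
  local total=$((stable + canary))
  if [ "$total" -le 0 ]; then
    echo "no pods" >&2
    return 1
  fi
  # Integer division; the result is rounded down.
  echo $((100 * canary / total))
}

# Example: 3 stable replicas and 1 canary replica.
#   canary_share_percent 3 1    # prints 25
```

For finer control than replica ratios allow, weighted routing at the ingress or service-mesh layer would be needed; that is outside what the commands above set up.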

Rollback Procedures

Helm Rollback

# View release history
helm history ai-service --namespace matih-data-plane
 
# Output:
# REVISION  STATUS      CHART              APP VERSION  DESCRIPTION
# 1         superseded  ai-service-1.0.0   1.0.0        Install complete
# 2         superseded  ai-service-1.0.0   1.0.0        Upgrade complete
# 3         deployed    ai-service-1.0.0   1.0.0        Upgrade complete
 
# Rollback to revision 2
helm rollback ai-service 2 --namespace matih-data-plane --timeout 5m
 
# Verify rollback
kubectl rollout status deployment/ai-service -n matih-data-plane
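
When the target revision is simply "the one before the current release", helm rollback can be run without a revision number and Helm rolls back to the previous release. If the number is needed explicitly, the history output can be parsed. The sketch below is a hypothetical helper that prints the highest revision marked superseded; it matches on the status word, so it is fragile, and helm history -o json is a more robust input for real automation.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `helm history` table output on stdin and print
# the highest revision whose line contains the status "superseded".
# Exits non-zero if no such revision exists.
last_superseded_revision() {
  awk '/superseded/ { rev = $1 }
       END { if (rev != "") print rev; else exit 1 }'
}

# Example:
#   rev=$(helm history ai-service --namespace matih-data-plane \
#           | last_superseded_revision)
#   helm rollback ai-service "$rev" --namespace matih-data-plane --timeout 5m
```

Note that the highest superseded revision is not always a known-good one, for example after an earlier rollback, so the history should still be reviewed before rolling back.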

Image Tag Rollback

If you know the previous working image tag:

# Deploy with known-good image tag
./scripts/tools/service-build-deploy.sh ai-service \
  --deploy-only \
  --tag 1.0.0-previousgoodshahere

Post-Deployment Validation

After every deployment, the script runs the following checks in order:

Health Check Sequence

| Step | Check | Pass Criteria | Timeout |
| --- | --- | --- | --- |
| 1 | Rollout status | All replicas ready | 300s |
| 2 | Pod readiness | All readinessProbes pass | 60s |
| 3 | HTTP health endpoint | HTTP 200 response | 30s |
| 4 | Dependency connectivity | Database, Redis, Kafka connected | 30s |

# Automated post-deploy validation
echo "Checking deployment status..."
kubectl rollout status deployment/ai-service \
  -n matih-data-plane --timeout=300s
 
echo "Checking pod health..."
kubectl get pods -l app.kubernetes.io/name=ai-service \
  -n matih-data-plane -o wide
 
echo "Checking HTTP health endpoint..."
kubectl exec -n matih-data-plane deploy/ai-service -- \
  curl -sf http://localhost:8000/api/v1/health
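
The health check table lists a timeout per step, which suggests retrying a check until it passes or the time budget runs out. A minimal sketch of such a retry loop follows. The helper retry_until is hypothetical, and the elapsed time is approximated by counting sleep intervals rather than measuring wall-clock time.

```shell
#!/usr/bin/env bash
# Hypothetical helper: retry a command until it succeeds or a timeout passes.
# Arguments: timeout in seconds, interval in seconds (must be > 0), command...
retry_until() {
  local timeout="$1"
  local interval="$2"
  shift 2
  local elapsed=0
  while ! "$@"; do
    elapsed=$((elapsed + interval))
    if [ "$elapsed" -ge "$timeout" ]; then
      return 1   # time budget exhausted
    fi
    sleep "$interval"
  done
}

# Example: poll the HTTP health endpoint for up to 30s, every 5s.
#   retry_until 30 5 kubectl exec -n matih-data-plane deploy/ai-service -- \
#     curl -sf http://localhost:8000/api/v1/health
```

Retrying matters most for the HTTP and dependency checks, since a pod can report ready slightly before its downstream connections are established.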

Development Workflow

Typical Development Cycle

1. Make code changes locally
2. Run local tests: pytest tests/ -v
3. Build and deploy to dev cluster:
   ./scripts/tools/service-build-deploy.sh ai-service --environment dev
4. Verify in dev:
   ./scripts/tools/platform-status.sh
5. Open PR for review
6. After merge, CD pipeline deploys to staging
7. After staging validation, CD pipeline deploys to production

Fast Iteration Loop

For rapid development iteration:

# Build, push, and deploy (takes ~3-5 minutes for Python)
./scripts/tools/service-build-deploy.sh ai-service
 
# Check logs while deploying
kubectl logs -f deployment/ai-service -n matih-data-plane
 
# Quick health check
./scripts/tools/platform-status.sh

Troubleshooting

Common Deployment Issues

| Issue | Symptom | Resolution |
| --- | --- | --- |
| Build fails | Docker build error | Check Dockerfile; verify base image exists |
| Push fails | ACR authentication error | Re-authenticate: az acr login --name matihlabsacr |
| Helm upgrade fails | "release in failed state" | Run helm rollback then retry |
| Pod CrashLoopBackOff | Container exits immediately | Check pod logs for application errors |
| ImagePullBackOff | Image not found in registry | Verify image tag was pushed successfully |
| Init container fails | Migration error | Check database connectivity and migration SQL |
| Readiness probe fails | Service not starting | Increase startupProbe failureThreshold |

Next Steps