MATIH Platform is in active MVP development. Documentation reflects current implementation status.

Drift Detection

The DriftDetectionService continuously monitors tenant infrastructure for configuration drift. Drift occurs when the actual infrastructure state deviates from the desired state defined in the platform, whether due to manual changes, external tools, or infrastructure failures.


How Drift Detection Works

  1. Desired state: stored in the DesiredInfrastructureState table.
  2. Actual state: read from the cloud provider APIs and the Kubernetes cluster.
  3. Comparison: the service identifies every difference between the desired and actual state.
  4. Report: the service generates a drift report detailing each discrepancy.
  5. Remediation: depending on configuration, the service auto-remediates the drift or raises alerts for manual review.
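
The comparison step above can be sketched as a field-by-field diff. This is a minimal illustration under stated assumptions, not the DriftDetectionService implementation: the `DriftItem` class, the `detect_drift` function, the nested-dictionary shape of the state, and the sample image tag are all hypothetical. Only the field names follow the drift report format.

```python
from dataclasses import dataclass

MISSING = "<missing>"  # placeholder used when a field is absent from the actual state


@dataclass(frozen=True)
class DriftItem:
    resource_type: str
    resource_name: str
    field: str
    desired_value: str
    actual_value: str


def detect_drift(desired, actual):
    """Return one DriftItem per field whose actual value differs from the desired one.

    Both arguments map (resource_type, resource_name) -> {field: value}.
    """
    items = []
    for (resource_type, resource_name), desired_fields in desired.items():
        actual_fields = actual.get((resource_type, resource_name), {})
        for field, want in desired_fields.items():
            have = actual_fields.get(field, MISSING)
            if have != want:
                items.append(
                    DriftItem(resource_type, resource_name, field, str(want), str(have))
                )
    return items


# Hypothetical sample state: the replica count has drifted, the image has not.
desired = {("deployment", "ai-service"): {"replicas": 3, "image": "ai-service:1.4"}}
actual = {("deployment", "ai-service"): {"replicas": 2, "image": "ai-service:1.4"}}
drift = detect_drift(desired, actual)
```

Here `drift` holds a single item for the `replicas` field. A real scan would also have to decide how to treat resources that exist in the cluster but not in the desired state, which this sketch ignores.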

Drift Detection Endpoint

Endpoint: POST /api/v1/infrastructure/tenants/:tenantId/drift-check

Triggers an on-demand drift detection scan for a tenant.

curl -X POST http://localhost:8089/api/v1/infrastructure/tenants/550e8400/drift-check \
  -H "Authorization: Bearer ${TOKEN}"
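
The same call can be made from Python with the standard library. The sketch below mirrors the curl command. The function names and the `timeout` default are assumptions for illustration, and because the response body is not documented here, only the HTTP status is returned.

```python
import urllib.request

BASE_URL = "http://localhost:8089"  # same host and port as the curl example


def build_drift_check_request(tenant_id, token):
    """Build (but do not send) the POST request that triggers a drift scan."""
    url = f"{BASE_URL}/api/v1/infrastructure/tenants/{tenant_id}/drift-check"
    return urllib.request.Request(
        url,
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )


def trigger_drift_check(tenant_id, token, timeout=30):
    """Send the request and return the HTTP status code.

    urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    """
    request = build_drift_check_request(tenant_id, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Separating request construction from sending keeps the URL and header logic testable without a running service.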

Drift Report

Endpoint: GET /api/v1/infrastructure/tenants/:tenantId/drift-report

Returns the most recent drift detection results.

Report Structure

{
  "tenantId": "550e8400-e29b-41d4-a716-446655440000",
  "scanTimestamp": "2026-02-12T10:30:00Z",
  "hasDrift": true,
  "driftItems": [
    {
      "resourceType": "deployment",
      "resourceName": "ai-service",
      "field": "replicas",
      "desiredValue": "3",
      "actualValue": "2",
      "severity": "HIGH"
    },
    {
      "resourceType": "configmap",
      "resourceName": "ai-service-config",
      "field": "query.timeout",
      "desiredValue": "60s",
      "actualValue": "30s",
      "severity": "MEDIUM"
    }
  ],
  "summary": {
    "totalResources": 45,
    "driftedResources": 2,
    "healthyResources": 43
  }
}
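
A client consuming this report might list the drift items with the most severe first. The sketch below assumes only the fields shown in the sample. The `describe_drift` function and the severity ranking are hypothetical, and since only HIGH and MEDIUM appear in the sample, any other severity value is sorted last.

```python
import json

# Ranking used for sorting; unknown severities fall to the end.
SEVERITY_RANK = {"HIGH": 0, "MEDIUM": 1}


def describe_drift(report):
    """Return one human-readable line per drift item, most severe first."""
    items = sorted(
        report.get("driftItems", []),
        key=lambda item: SEVERITY_RANK.get(item["severity"], len(SEVERITY_RANK)),
    )
    return [
        f'[{item["severity"]}] {item["resourceType"]}/{item["resourceName"]}: '
        f'{item["field"]} is {item["actualValue"]}, expected {item["desiredValue"]}'
        for item in items
    ]


# Abbreviated copy of the sample report, with the items deliberately out of order.
report = json.loads("""
{
  "hasDrift": true,
  "driftItems": [
    {"resourceType": "configmap", "resourceName": "ai-service-config",
     "field": "query.timeout", "desiredValue": "60s", "actualValue": "30s",
     "severity": "MEDIUM"},
    {"resourceType": "deployment", "resourceName": "ai-service",
     "field": "replicas", "desiredValue": "3", "actualValue": "2",
     "severity": "HIGH"}
  ]
}
""")
lines = describe_drift(report)
```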

Monitored Resources

| Resource Type | What Is Checked |
|---|---|
| Kubernetes Deployments | Replicas, image versions, resource limits, environment variables |
| Kubernetes Services | Ports, selectors, type |
| ConfigMaps | Configuration values |
| Secrets | Existence (not values) |
| Ingress | Rules, TLS configuration, annotations |
| Database | Size, version, replication settings |
| Storage | Capacity, performance tier |
| Network Policies | Ingress/egress rules |

Scheduled Detection

Drift detection runs on a configurable schedule (default: every 30 minutes). The reconciler compares the DesiredInfrastructureState entries with live cluster state and publishes drift events.
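
To illustrate the scheduling behavior, the following sketch runs a check at a fixed interval until asked to stop. It is not the platform's reconciler: `run_on_schedule`, `fake_scan`, and the constant name are hypothetical, and the demonstration uses a very short interval in place of the 30-minute default.

```python
import threading

DEFAULT_INTERVAL_SECONDS = 30 * 60  # the documented default of 30 minutes


def run_on_schedule(check, interval_seconds, stop_event):
    """Call check() immediately, then every interval_seconds until stop_event is set."""
    while not stop_event.is_set():
        check()
        # wait() returns early if stop_event is set during the pause.
        stop_event.wait(interval_seconds)


# Demonstration with a short interval: stop after three scans.
scans = []
stop = threading.Event()


def fake_scan():
    scans.append(len(scans) + 1)
    if len(scans) == 3:
        stop.set()


run_on_schedule(fake_scan, 0.01, stop)
```

Using `Event.wait` rather than `time.sleep` lets a shutdown signal interrupt the pause between scans instead of waiting out the full interval.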


Auto-Remediation

When drift is detected, the system can:

  1. Alert only -- Send notifications to administrators (default)
  2. Auto-remediate -- Automatically apply the desired state to correct drift
  3. Queue for review -- Create a remediation ticket for manual approval

Auto-remediation is configured per-tenant and per-resource-type to balance automation with change control.
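
One way to picture per-tenant, per-resource-type configuration is a two-level lookup that falls back to alert-only. The `RemediationMode` enum, the `resolve_mode` function, and the dictionary layout below are assumptions for illustration. The platform's actual configuration format is not described here.

```python
from enum import Enum


class RemediationMode(Enum):
    ALERT_ONLY = "alert_only"
    AUTO_REMEDIATE = "auto_remediate"
    QUEUE_FOR_REVIEW = "queue_for_review"


def resolve_mode(policies, tenant_id, resource_type):
    """Look up the mode for a tenant and resource type, defaulting to alert-only."""
    tenant_policy = policies.get(tenant_id, {})
    return tenant_policy.get(resource_type, RemediationMode.ALERT_ONLY)


# Hypothetical policy: this tenant auto-fixes ConfigMaps but reviews Deployments.
TENANT_ID = "550e8400-e29b-41d4-a716-446655440000"
policies = {
    TENANT_ID: {
        "configmap": RemediationMode.AUTO_REMEDIATE,
        "deployment": RemediationMode.QUEUE_FOR_REVIEW,
    }
}
```

With this policy, a drifted Deployment is queued for approval while a drifted ConfigMap is corrected automatically. Any resource type or tenant without an entry only raises an alert.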

⚠️ Configure auto-remediation carefully. Some drift, such as manual scaling during an incident, may be intentional. Review drift reports before enabling auto-remediation for production tenants.