MATIH Platform is in active MVP development. Documentation reflects current implementation status.

Shadow Deployment

Shadow deployment validates a new model version by mirroring production traffic to it without affecting live predictions. The shadow model receives the same inputs as the primary model; its predictions are recorded for offline comparison but never returned to the client. Because clients only ever see the primary model's output, this is a low-risk way to validate model changes before a canary or full rollout.


Shadow Architecture

Client Request --> Primary Model --> Response to Client
                        |
                   (mirror)
                        |
                  Shadow Model --> Predictions Logged (not served)
                                        |
                                  Comparison Analytics

Create Shadow Deployment

POST /api/v1/inference/shadow
{
  "primary_model": "churn-xgb-v2",
  "shadow_model": "churn-xgb-v3",
  "sample_rate": 1.0,
  "comparison_metrics": ["accuracy", "f1_score", "latency_ms"],
  "duration_hours": 48,
  "auto_promote": false,
  "promotion_criteria": {
    "metric": "f1_score",
    "min_improvement": 0.005,
    "max_latency_increase_ms": 10
  }
}

Response

{
  "shadow_id": "shadow-abc123",
  "status": "active",
  "primary_model": "churn-xgb-v2",
  "shadow_model": "churn-xgb-v3",
  "started_at": "2025-03-15T10:00:00Z",
  "expires_at": "2025-03-17T10:00:00Z"
}
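
As a rough client-side sketch, the request body can be assembled and sanity-checked before it is sent. The helper names `build_shadow_request` and `expected_expiry` are hypothetical, and the validation rules (a sample rate in (0, 1], distinct model names) are assumptions rather than documented server behavior. The expiry calculation assumes `expires_at` is `started_at` plus `duration_hours`, which matches the example response above.

```python
from datetime import datetime, timedelta, timezone

ISO_Z = "%Y-%m-%dT%H:%M:%SZ"


def build_shadow_request(primary_model, shadow_model, sample_rate=1.0,
                         duration_hours=48, auto_promote=False):
    """Build the JSON body for POST /api/v1/inference/shadow."""
    # Assumed client-side checks; the service may enforce different rules.
    if not 0.0 < sample_rate <= 1.0:
        raise ValueError("sample_rate must be in (0, 1]")
    if primary_model == shadow_model:
        raise ValueError("shadow_model must differ from primary_model")
    return {
        "primary_model": primary_model,
        "shadow_model": shadow_model,
        "sample_rate": sample_rate,
        "duration_hours": duration_hours,
        "auto_promote": auto_promote,
    }


def expected_expiry(started_at: str, duration_hours: int) -> str:
    """Compute the expiry timestamp, assuming expiry = start + duration."""
    start = datetime.strptime(started_at, ISO_Z).replace(tzinfo=timezone.utc)
    return (start + timedelta(hours=duration_hours)).strftime(ISO_Z)
```

For example, `expected_expiry("2025-03-15T10:00:00Z", 48)` yields `"2025-03-17T10:00:00Z"`, the `expires_at` value shown in the response.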

Get Shadow Comparison

GET /api/v1/inference/shadow/:shadow_id/comparison

Response

{
  "shadow_id": "shadow-abc123",
  "total_predictions": 15000,
  "agreement_rate": 0.94,
  "comparison": {
    "primary": {
      "model": "churn-xgb-v2",
      "accuracy": 0.912,
      "f1_score": 0.895,
      "avg_latency_ms": 12.3
    },
    "shadow": {
      "model": "churn-xgb-v3",
      "accuracy": 0.925,
      "f1_score": 0.908,
      "avg_latency_ms": 13.1
    },
    "improvement": {
      "accuracy": 0.013,
      "f1_score": 0.013,
      "latency_ms": 0.8
    }
  },
  "promotion_eligible": true,
  "recommendation": "Shadow model shows consistent improvement; eligible for canary promotion"
}
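
The relationship between the `comparison` block and the `promotion_criteria` from the create request can be illustrated as follows. This is a sketch under assumptions: `evaluate_promotion` is a hypothetical helper, and the rule it applies (metric gain at least `min_improvement`, latency increase at most `max_latency_increase_ms`) is inferred from the field names, not confirmed platform logic.

```python
def evaluate_promotion(comparison: dict, criteria: dict) -> dict:
    """Derive improvement figures and an eligibility flag from a comparison."""
    primary, shadow = comparison["primary"], comparison["shadow"]
    metric = criteria["metric"]

    # Improvement is shadow minus primary, as in the "improvement" block.
    gain = shadow[metric] - primary[metric]
    latency_increase = shadow["avg_latency_ms"] - primary["avg_latency_ms"]

    # Assumed rule: enough gain on the chosen metric, bounded latency cost.
    eligible = (
        gain >= criteria["min_improvement"]
        and latency_increase <= criteria["max_latency_increase_ms"]
    )
    return {
        "metric_gain": round(gain, 6),
        "latency_increase_ms": round(latency_increase, 6),
        "eligible": eligible,
    }
```

With the example values above (F1 of 0.895 versus 0.908, latency of 12.3 ms versus 13.1 ms) and the criteria from the create request, the gain is 0.013 and the latency increase is 0.8 ms, so the sketch reports the shadow model as eligible.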

Shadow Modes

Mode         Description                             Sample Rate
Full mirror  Every request is shadowed               1.0 (100%)
Sampled      Random subset of requests               0.01 - 0.99
Conditional  Only shadow requests matching criteria  Rule-based
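
The three modes reduce to one per-request decision, sketched below. The function `should_shadow` and its parameters are hypothetical; in particular, treating a supplied rule as taking precedence over the sample rate is an assumption made for the example.

```python
import random
from typing import Any, Callable, Optional


def should_shadow(
    request: Any,
    sample_rate: float = 1.0,
    rule: Optional[Callable[[Any], bool]] = None,
    rng: Callable[[], float] = random.random,
) -> bool:
    """Decide whether a single request is mirrored to the shadow model."""
    if rule is not None:
        # Conditional mode: shadow only requests matching the rule.
        return rule(request)
    # Full mirror (1.0) and sampled mode share one check, because
    # random.random() returns values in [0.0, 1.0) and is always < 1.0.
    return rng() < sample_rate
```

Passing `rng` explicitly makes the sampling decision reproducible in tests, for example by supplying a seeded `random.Random(42).random`.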

Comparison Metrics

Metric                Comparison Type                     Alert If
Prediction agreement  Percentage of matching predictions  Below 90%
Accuracy              Absolute improvement                Degradation detected
F1 score              Absolute improvement                Degradation detected
Latency               Millisecond difference              Increase above threshold
Error rate            Percentage difference               Shadow errors higher
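
The alert conditions in the table can be expressed as a small check. This sketch assumes that deltas are computed as shadow minus primary and that the latency threshold is configurable; `agreement_rate`, `comparison_alerts`, and the 10 ms default are illustrative choices, not platform APIs.

```python
from typing import List, Sequence


def agreement_rate(primary_preds: Sequence, shadow_preds: Sequence) -> float:
    """Fraction of paired predictions where both models agree."""
    if not primary_preds or len(primary_preds) != len(shadow_preds):
        raise ValueError("need two non-empty sequences of equal length")
    matches = sum(p == s for p, s in zip(primary_preds, shadow_preds))
    return matches / len(primary_preds)


def comparison_alerts(
    agreement: float,
    accuracy_delta: float,
    f1_delta: float,
    latency_delta_ms: float,
    error_rate_delta: float,
    max_latency_increase_ms: float = 10.0,  # assumed threshold
) -> List[str]:
    """Return the names of metrics whose alert condition is met."""
    alerts = []
    if agreement < 0.90:
        alerts.append("prediction_agreement")
    if accuracy_delta < 0:
        alerts.append("accuracy")
    if f1_delta < 0:
        alerts.append("f1_score")
    if latency_delta_ms > max_latency_increase_ms:
        alerts.append("latency")
    if error_rate_delta > 0:
        alerts.append("error_rate")
    return alerts
```

Applied to the example comparison earlier on this page (agreement 0.94, both quality metrics up by 0.013, latency up by 0.8 ms), no alert would fire under these assumptions.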

Shadow Lifecycle

  1. Created: Shadow deployment configured but not yet active
  2. Active: Traffic is being mirrored to the shadow model
  3. Analyzing: Data collection complete, running comparison
  4. Completed: Analysis finished, promotion decision available
  5. Promoted: Shadow model promoted to primary (automatically, if auto_promote is enabled)
  6. Expired: Duration exceeded without promotion
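
The lifecycle above can be modeled as a small state machine. The states come from the list; the allowed transitions are an assumption made for illustration (for instance, that an active or completed deployment can expire), since the exact transition rules are not specified here.

```python
from enum import Enum


class ShadowState(Enum):
    CREATED = "created"
    ACTIVE = "active"
    ANALYZING = "analyzing"
    COMPLETED = "completed"
    PROMOTED = "promoted"
    EXPIRED = "expired"


# Assumed transition map; PROMOTED and EXPIRED are terminal.
ALLOWED_TRANSITIONS = {
    ShadowState.CREATED: {ShadowState.ACTIVE},
    ShadowState.ACTIVE: {ShadowState.ANALYZING, ShadowState.EXPIRED},
    ShadowState.ANALYZING: {ShadowState.COMPLETED},
    ShadowState.COMPLETED: {ShadowState.PROMOTED, ShadowState.EXPIRED},
    ShadowState.PROMOTED: set(),
    ShadowState.EXPIRED: set(),
}


def transition(current: ShadowState, target: ShadowState) -> ShadowState:
    """Move to the target state, rejecting transitions not in the map."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```

Encoding the transitions as data keeps the rules in one place, so adjusting them to the platform's real behavior means editing the map rather than the function.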

Configuration

Environment Variable           Default  Description
SHADOW_MAX_ACTIVE              3        Max concurrent shadow deployments
SHADOW_DEFAULT_DURATION_HOURS  48       Default shadow duration
SHADOW_MIN_SAMPLES             1000     Minimum samples before comparison
SHADOW_STORAGE_RETENTION_DAYS  30       Prediction log retention
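
A service could read these settings as sketched below, falling back to the documented defaults when a variable is unset. The `ShadowConfig` dataclass and `load_shadow_config` function are hypothetical names for this example; only the variable names and defaults come from the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ShadowConfig:
    max_active: int = 3
    default_duration_hours: int = 48
    min_samples: int = 1000
    storage_retention_days: int = 30


def load_shadow_config(env: Optional[Mapping[str, str]] = None) -> ShadowConfig:
    """Read shadow settings from the environment, using documented defaults."""
    env = os.environ if env is None else env
    return ShadowConfig(
        max_active=int(env.get("SHADOW_MAX_ACTIVE", 3)),
        default_duration_hours=int(env.get("SHADOW_DEFAULT_DURATION_HOURS", 48)),
        min_samples=int(env.get("SHADOW_MIN_SAMPLES", 1000)),
        storage_retention_days=int(env.get("SHADOW_STORAGE_RETENTION_DAYS", 30)),
    )
```

Accepting an optional mapping instead of always reading `os.environ` lets tests pass a plain dictionary, for example `load_shadow_config({"SHADOW_MAX_ACTIVE": "5"})`.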