MATIH Platform is in active MVP development. Documentation reflects current implementation status.
21. Industry Examples & Walkthroughs
Financial Services
Data Scientist Journey

Data Scientist Journey: Building a Credit Risk Scoring Model

Persona: Amir, Senior Data Scientist at Meridian Bank (Risk Analytics team, 6 years experience)
Objective: Build a probability-of-default (PD) model for consumer loan applications to replace the legacy logistic regression scorecard
Timeline: 4-week sprint from data exploration to champion-challenger deployment
Datasets: credit_applications (200K), accounts (500K), transactions (50M), bureau_scores (450K), historical_defaults (35K)


Stage 1: Ingestion

Amir begins by connecting the data sources needed for credit risk modeling. He needs both live application data and historical outcomes to train a supervised model.

Connecting Core Banking PostgreSQL

In the Data Workbench, Amir opens the Ingestion panel and selects the Airbyte PostgreSQL connector. He configures the connection to the core banking system:

{
  "source": {
    "type": "postgres",
    "name": "core-banking-prod",
    "connection": {
      "host": "core-banking.meridian.internal",
      "port": 5432,
      "database": "meridian_core",
      "schema": "lending",
      "replication_method": "CDC",
      "publication": "matih_publication"
    },
    "tables": [
      "accounts",
      "credit_applications",
      "payment_history",
      "loan_disbursements"
    ],
    "sync_schedule": {
      "frequency": "daily",
      "time": "02:00",
      "timezone": "America/New_York"
    }
  }
}

Importing Credit Bureau Scores

The bureau data arrives as a monthly CSV file from three bureaus. Amir uses the File Import component in the Data Workbench:

| Field | Value |
|---|---|
| File format | CSV (pipe-delimited) |
| File size | ~45 MB per month |
| Schema detection | Automatic with overrides |
| Dedup key | customer_id + score_date + bureau |
| Destination | risk_analytics.bureau_scores |
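The dedup key gives the import last-write-wins semantics: re-importing a monthly file never creates duplicate bureau rows. A minimal sketch of that behavior over a pipe-delimited file (the column names and sample values are illustrative, not the actual bureau layout):

```python
import csv
import io

# A tiny pipe-delimited sample in the shape of the monthly bureau files
sample = """customer_id|score_date|bureau|score
C001|2026-01-31|equifax|712
C001|2026-01-31|equifax|712
C002|2026-01-31|transunion|655
"""

def dedupe_bureau_rows(rows):
    # Last-write-wins per dedup key (customer_id + score_date + bureau),
    # so a re-imported row simply overwrites the earlier copy
    seen = {}
    for row in rows:
        seen[(row["customer_id"], row["score_date"], row["bureau"])] = row
    return list(seen.values())

rows = list(csv.DictReader(io.StringIO(sample), delimiter="|"))
unique = dedupe_bureau_rows(rows)
print(len(rows), len(unique))  # 3 2
```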

Connecting Snowflake for Historical Defaults

Historical default data spanning 10 years lives in the Snowflake analytics warehouse. Amir configures an incremental sync:

{
  "source": {
    "type": "snowflake",
    "name": "snowflake-analytics",
    "connection": {
      "account": "meridian.us-east-1",
      "warehouse": "ANALYTICS_WH",
      "database": "RISK_ANALYTICS",
      "schema": "HISTORICAL"
    },
    "tables": ["loan_defaults", "recovery_outcomes", "workout_history"],
    "sync_schedule": {
      "frequency": "daily",
      "time": "03:00",
      "timezone": "America/New_York"
    },
    "cursor_field": "updated_at"
  }
}

After configuring all three sources, Amir triggers an initial full sync. The sync status panel shows progress:

| Source | Tables | Rows Synced | Duration | Status |
|---|---|---|---|---|
| core-banking-prod | 4 | 751,200 | 8m 12s | Complete |
| bureau-scores-csv | 1 | 450,000 | 1m 45s | Complete |
| snowflake-analytics | 3 | 85,400 | 3m 22s | Complete |

Stage 2: Discovery

With data synced, Amir switches to the Data Catalog to explore what is available and assess data quality.

Catalog Exploration

Searching the catalog for "credit" returns 14 tables across 3 schemas. Amir focuses on these:

risk_analytics.credit_applications   200,432 rows   Last synced: 2026-02-28 02:15 UTC
risk_analytics.bureau_scores         450,118 rows   Last synced: 2026-02-28 03:00 UTC
lending.payment_history            3,842,901 rows   Last synced: 2026-02-28 02:18 UTC
historical.loan_defaults              35,219 rows   Last synced: 2026-02-28 03:22 UTC

Data Profiling

Amir runs the automated profiler on credit_applications. The Data Quality Service returns:

| Column | Type | Completeness | Distinct | Min | Max | Issues |
|---|---|---|---|---|---|---|
| application_id | VARCHAR | 100% | 200,432 | -- | -- | None |
| customer_id | VARCHAR | 100% | 178,201 | -- | -- | 12.4% repeat applicants |
| income_reported | DECIMAL | 92.1% | 14,872 | $12,000 | $2,400,000 | 7.9% missing |
| employment_length | INTEGER | 98.3% | 52 | 0 | 87 | Outliers > 50 years |
| requested_amount | DECIMAL | 100% | 3,441 | $1,000 | $500,000 | Right-skewed |
| debt_to_income | DECIMAL | 89.7% | 8,219 | 0.01 | 14.2 | 10.3% missing, extreme values |
| ssn | VARCHAR | 100% | 178,201 | -- | -- | PII - masked by policy |

Amir notes two issues to address in feature engineering: missing income_reported values (7.9%) and employment_length outliers exceeding 50 years.
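Both fixes can be expressed as a small cleaning step ahead of feature engineering. A sketch using illustrative record shapes: median imputation for missing income (with an indicator flag the model can use) plus a cap at 50 years of tenure:

```python
from statistics import median

def clean_application_features(apps):
    # Median-impute missing income_reported and cap employment_length
    # at 50 years -- the two issues flagged by the profiler.
    # (Record shapes here are illustrative.)
    incomes = [a["income_reported"] for a in apps
               if a["income_reported"] is not None]
    income_median = median(incomes)
    cleaned = []
    for a in apps:
        row = dict(a)
        row["income_imputed"] = row["income_reported"] is None
        if row["income_imputed"]:
            row["income_reported"] = income_median
        row["employment_length"] = min(row["employment_length"], 50)
        cleaned.append(row)
    return cleaned

apps = [
    {"income_reported": 52000, "employment_length": 4},
    {"income_reported": None, "employment_length": 87},  # missing + outlier
    {"income_reported": 78000, "employment_length": 12},
]
cleaned = clean_application_features(apps)
print(cleaned[1]["income_reported"], cleaned[1]["employment_length"])  # 65000.0 50
```

Keeping the `income_imputed` indicator lets the model learn whether missingness itself is predictive, rather than silently treating imputed incomes as real.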

Lineage Tracing

The catalog's lineage view shows the data flow from application to outcome:

credit_applications ──▶ loan_disbursements ──▶ payment_history ──▶ loan_defaults
      (apply)               (approve)            (repay)           (default)
     200,432                 142,301            3,842,901            35,219

This lineage confirms that Amir can join application features to default outcomes, which supply the target variable for the model.


Stage 3: Query

Amir uses the Query Engine to build the risk feature set. He needs to federate across PostgreSQL (live data) and Snowflake (historical outcomes).

Building the Risk Feature Set

-- Credit Risk Feature Set: Federated query across PostgreSQL + Snowflake
-- Governance note: SSN is auto-masked; only hashed customer_id used for joins
 
WITH payment_features AS (
    -- Payment behavior features from core banking (PostgreSQL)
    SELECT
        ca.customer_id,
        ca.application_id,
        ca.income_reported,
        ca.employment_length,
        ca.requested_amount,
        ca.debt_to_income,
        -- Payment delinquency features
        COUNT(CASE WHEN ph.days_past_due > 30 THEN 1 END)
            AS dpd_30_count_12m,
        COUNT(CASE WHEN ph.days_past_due > 60 THEN 1 END)
            AS dpd_60_count_12m,
        COUNT(CASE WHEN ph.days_past_due > 90 THEN 1 END)
            AS dpd_90_count_12m,
        MAX(ph.days_past_due) AS max_dpd_12m,
        -- Utilization features
        AVG(ph.balance / NULLIF(ph.credit_limit, 0))
            AS avg_utilization_12m,
        MAX(ph.balance / NULLIF(ph.credit_limit, 0))
            AS max_utilization_12m
    FROM risk_analytics.credit_applications ca
    LEFT JOIN lending.payment_history ph
        ON ca.customer_id = ph.customer_id
        AND ph.payment_date >= ca.application_date - INTERVAL '12 months'
        AND ph.payment_date < ca.application_date
    GROUP BY ca.customer_id, ca.application_id,
             ca.income_reported, ca.employment_length,
             ca.requested_amount, ca.debt_to_income
),
bureau_features AS (
    -- Latest bureau score strictly before each application date.
    -- Keying per application (rather than per customer) prevents
    -- look-ahead leakage for repeat applicants.
    SELECT DISTINCT ON (ca.application_id)
        ca.application_id,
        bs.score AS bureau_score,
        bs.num_inquiries AS inquiries_6m,
        bs.delinquencies AS bureau_delinquencies
    FROM risk_analytics.credit_applications ca
    JOIN risk_analytics.bureau_scores bs
        ON bs.customer_id = ca.customer_id
        AND bs.score_date < ca.application_date
    ORDER BY ca.application_id, bs.score_date DESC
),
default_labels AS (
    -- Default outcome from Snowflake (historical data), collapsed to
    -- one row per customer to avoid fan-out on the final join.
    -- loss_given_default is deliberately excluded: it is only known
    -- post-default (target leakage).
    SELECT
        ld.customer_id,
        1 AS defaulted
    FROM historical.loan_defaults ld
    GROUP BY ld.customer_id
)

SELECT
    pf.application_id,
    pf.customer_id,
    -- Application features
    pf.income_reported,
    LEAST(pf.employment_length, 50) AS employment_length_capped,
    pf.requested_amount,
    pf.requested_amount / NULLIF(pf.income_reported, 0)
        AS loan_to_income,
    pf.debt_to_income,
    -- Payment behavior
    pf.dpd_30_count_12m,
    pf.dpd_60_count_12m,
    pf.dpd_90_count_12m,
    pf.max_dpd_12m,
    pf.avg_utilization_12m,
    pf.max_utilization_12m,
    -- Bureau features
    bf.bureau_score,
    bf.inquiries_6m,
    bf.bureau_delinquencies,
    -- Target variable
    COALESCE(dl.defaulted, 0) AS defaulted
FROM payment_features pf
LEFT JOIN bureau_features bf ON pf.application_id = bf.application_id
LEFT JOIN default_labels dl ON pf.customer_id = dl.customer_id;

The Query Engine executes this across both sources, returning 200,432 rows with 14 feature columns and the target variable. Amir saves this as a materialized view risk_analytics.credit_risk_features.


Stage 4: Orchestration

Amir builds a daily pipeline to keep the risk feature set fresh. In the Pipeline Service, he defines a 4-stage DAG:

┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐
│ Extract  │───▶│ Validate │───▶│ Engineer │───▶│  Load    │
│          │    │          │    │ Features │    │ Feature  │
│ Sync all │    │ Quality  │    │          │    │  Store   │
│ 3 sources│    │ gates    │    │ Run SQL  │    │          │
└──────────┘    └──────────┘    └──────────┘    └──────────┘
   5 min           2 min           8 min           3 min

Pipeline Configuration

{
  "pipeline": {
    "name": "credit-risk-feature-pipeline",
    "schedule": "0 4 * * *",
    "owner": "amir@meridian.bank",
    "tags": ["risk", "credit", "ml-features"],
    "stages": [
      {
        "name": "extract",
        "type": "sync_trigger",
        "sources": ["core-banking-prod", "bureau-scores-csv", "snowflake-analytics"],
        "timeout_minutes": 30
      },
      {
        "name": "validate",
        "type": "data_quality",
        "checks": [
          {
            "table": "credit_applications",
            "expectation": "expect_column_values_to_not_be_null",
            "column": "customer_id",
            "severity": "critical"
          },
          {
            "table": "credit_applications",
            "expectation": "expect_column_values_to_be_between",
            "column": "income_reported",
            "min_value": 0,
            "max_value": 10000000,
            "mostly": 0.92,
            "severity": "warning"
          },
          {
            "table": "bureau_scores",
            "expectation": "expect_table_row_count_to_be_between",
            "min_value": 400000,
            "max_value": 600000,
            "severity": "critical"
          }
        ],
        "on_failure": "halt_pipeline"
      },
      {
        "name": "engineer_features",
        "type": "sql_transform",
        "query_ref": "risk_analytics.credit_risk_features",
        "materialization": "table",
        "depends_on": ["validate"]
      },
      {
        "name": "load_feature_store",
        "type": "feature_store_ingest",
        "source_table": "risk_analytics.credit_risk_features",
        "feature_group": "credit_risk_v2",
        "entity_key": "application_id",
        "timestamp_column": "application_date",
        "depends_on": ["engineer_features"]
      }
    ]
  }
}

The data quality gate in stage 2 is critical: if customer_id has nulls or bureau_scores row count drops unexpectedly, the pipeline halts and sends an alert rather than propagating bad data into the feature store.
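The halt-on-critical behavior can be sketched in a few lines. This is illustrative only: the check/result shapes below are not the Pipeline Service's actual API, just a model of the severity logic the configuration expresses:

```python
def run_quality_gate(checks, results):
    # Pair each configured check with its pass/fail result; any failed
    # critical check halts the pipeline, warning-level failures let it
    # continue with an alert.
    failures = [c for c, passed in zip(checks, results) if not passed]
    critical = [c for c in failures if c["severity"] == "critical"]
    if critical:
        return {"action": "halt_pipeline",
                "failed": [c["name"] for c in critical]}
    if failures:
        return {"action": "continue_with_warning",
                "failed": [c["name"] for c in failures]}
    return {"action": "continue", "failed": []}

checks = [
    {"name": "customer_id_not_null", "severity": "critical"},
    {"name": "income_in_range", "severity": "warning"},
    {"name": "bureau_row_count", "severity": "critical"},
]
print(run_quality_gate(checks, [True, False, True])["action"])  # continue_with_warning
print(run_quality_gate(checks, [True, True, False])["action"])  # halt_pipeline
```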


Stage 5: Analysis

Before model training, Amir validates the feature set for modeling fitness.

Class Imbalance Check

SELECT
    defaulted,
    COUNT(*) AS count,
    ROUND(100.0 * COUNT(*) / SUM(COUNT(*)) OVER(), 2) AS pct
FROM risk_analytics.credit_risk_features
GROUP BY defaulted;

| defaulted | count | pct |
|---|---|---|
| 0 | 194,419 | 97.0% |
| 1 | 6,013 | 3.0% |

The 3% default rate confirms class imbalance. Amir will use SMOTE oversampling and class-weighted loss during training.
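The class counts above also determine the scale_pos_weight that appears in the XGBoost configuration later in this walkthrough, as the ratio of non-defaults to defaults:

```python
def scale_pos_weight(n_negative, n_positive):
    # XGBoost upweights each positive (default) example by the ratio
    # of majority-class to minority-class counts
    return n_negative / n_positive

w = scale_pos_weight(194_419, 6_013)
print(round(w, 1))  # 32.3
```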

Data Leakage Validation

Amir verifies no features contain information from after the application decision:

| Feature | Temporal Check | Result |
|---|---|---|
| dpd_30_count_12m | Uses only pre-application payment history | Pass |
| bureau_score | Latest score before application date | Pass |
| avg_utilization_12m | 12-month window ending at application | Pass |
| loss_given_default | Post-default metric | Excluded (target leakage) |

Feature Correlation Analysis

The Data Quality Service generates a correlation matrix. Key findings:

| Feature Pair | Correlation | Action |
|---|---|---|
| bureau_score / dpd_30_count_12m | -0.72 | Expected -- keep both (different signal sources) |
| debt_to_income / loan_to_income | 0.84 | High collinearity -- consider dropping one |
| max_utilization_12m / avg_utilization_12m | 0.91 | Keep max only for interpretability |
| income_reported / requested_amount | 0.31 | Low -- independent features |

Amir decides to drop avg_utilization_12m (collinear with max) and keep both debt_to_income and loan_to_income (different denominators capture different risk signals).
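The drop decision can be reproduced with a plain Pearson correlation scan over candidate features. A dependency-free sketch with toy data (the values below are illustrative, not the actual feature distributions):

```python
from math import sqrt

def pearson(xs, ys):
    # Sample Pearson correlation coefficient
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def collinear_pairs(features, threshold=0.80):
    # Flag every feature pair whose |correlation| exceeds the threshold
    names = list(features)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(features[a], features[b])
            if abs(r) > threshold:
                flagged.append((a, b, round(r, 2)))
    return flagged

# Toy values: max utilization tracks avg closely; bureau_score does not
features = {
    "avg_utilization_12m": [0.2, 0.5, 0.8, 0.3, 0.9],
    "max_utilization_12m": [0.3, 0.6, 0.9, 0.4, 1.0],
    "bureau_score": [700, 680, 590, 560, 720],
}
print(collinear_pairs(features))  # [('avg_utilization_12m', 'max_utilization_12m', 1.0)]
```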


Stage 6: Productionization

Model Training and Comparison

In the ML Workbench, Amir sets up an experiment to compare models:

{
  "experiment": {
    "name": "credit-risk-pd-model-v2",
    "tracking_uri": "mlflow://meridian.matih.internal",
    "runs": [
      {
        "name": "logistic-regression-baseline",
        "model_type": "sklearn.linear_model.LogisticRegression",
        "params": {
          "class_weight": "balanced",
          "C": 0.1,
          "max_iter": 1000
        }
      },
      {
        "name": "xgboost-candidate",
        "model_type": "xgboost.XGBClassifier",
        "params": {
          "n_estimators": 500,
          "max_depth": 6,
          "learning_rate": 0.05,
          "scale_pos_weight": 32.3,
          "subsample": 0.8,
          "colsample_bytree": 0.8,
          "eval_metric": "auc"
        }
      }
    ],
    "evaluation": {
      "split": "temporal",
      "train_end": "2025-06-30",
      "test_start": "2025-07-01",
      "metrics": ["auc", "gini", "ks_statistic", "log_loss", "precision_at_10pct"]
    }
  }
}
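The temporal split in the evaluation block keeps the comparison honest: models are tested only on applications dated after the training cutoff, mimicking future scoring. A minimal sketch (the record shape is illustrative):

```python
def temporal_split(rows, train_end="2025-06-30"):
    # Out-of-time validation: train strictly on applications up to the
    # cutoff, test on everything after it.
    # ISO date strings compare correctly as plain strings.
    train = [r for r in rows if r["application_date"] <= train_end]
    test = [r for r in rows if r["application_date"] > train_end]
    return train, test

rows = [
    {"application_id": "A1", "application_date": "2025-03-15"},
    {"application_id": "A2", "application_date": "2025-06-30"},
    {"application_id": "A3", "application_date": "2025-07-01"},
]
train, test = temporal_split(rows)
print(len(train), len(test))  # 2 1
```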

Model Comparison Results

| Metric | Logistic Regression | XGBoost | Improvement |
|---|---|---|---|
| AUC | 0.781 | 0.843 | +7.9% |
| Gini Coefficient | 0.562 | 0.686 | +22.1% |
| KS Statistic | 0.389 | 0.512 | +31.6% |
| Precision @ 10% recall | 42.1% | 58.7% | +39.4% |
| Log Loss | 0.142 | 0.108 | -23.9% |
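As a sanity check, the Gini values in the table follow directly from the AUCs via the standard identity Gini = 2·AUC − 1:

```python
def gini_from_auc(auc):
    # Gini accuracy ratio as the standard rescaling of ROC AUC
    return 2 * auc - 1

print(round(gini_from_auc(0.781), 3))  # 0.562 (logistic regression baseline)
print(round(gini_from_auc(0.843), 3))  # 0.686 (XGBoost candidate)
```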

Model Registration with Fairness Metrics

Amir registers the XGBoost model in the Model Registry with a mandatory model card:

{
  "model_card": {
    "name": "credit-risk-pd-xgboost-v2",
    "version": "2.0.0",
    "owner": "amir@meridian.bank",
    "use_case": "Probability of default scoring for consumer loan applications",
    "training_data": {
      "source": "risk_analytics.credit_risk_features",
      "rows": 160346,
      "date_range": "2020-01-01 to 2025-06-30",
      "target_rate": "3.0%"
    },
    "performance": {
      "test_auc": 0.843,
      "test_gini": 0.686,
      "test_ks": 0.512
    },
    "fairness_metrics": {
      "disparate_impact_ratio": {
        "gender": 0.92,
        "age_group": 0.88,
        "ethnicity": 0.85
      },
      "equal_opportunity_difference": {
        "gender": 0.03,
        "age_group": 0.07,
        "ethnicity": 0.09
      },
      "threshold": "DIR > 0.80 (four-fifths rule)"
    },
    "governance": {
      "masked_columns": ["ssn", "date_of_birth"],
      "audit_logging": true,
      "regulatory_framework": "SR 11-7",
      "approval_required": true,
      "approved_by": null,
      "approval_date": null
    }
  }
}
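The DIR threshold in the model card encodes the four-fifths rule: the protected group's approval rate must be at least 80% of the reference group's. A minimal sketch of the check (the approval rates below are illustrative, not figures from the model card):

```python
def disparate_impact_ratio(rate_protected, rate_reference):
    # DIR = approval rate of the protected group divided by the
    # approval rate of the reference group
    return rate_protected / rate_reference

def passes_four_fifths(dir_value, threshold=0.80):
    # Four-fifths rule: DIR must be at least 0.80
    return dir_value >= threshold

dir_value = disparate_impact_ratio(0.58, 0.68)
print(round(dir_value, 2), passes_four_fifths(dir_value))  # 0.85 True
```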

Deployment to Ray Serve

Amir deploys the model for batch scoring of the loan portfolio:

{
  "deployment": {
    "model": "credit-risk-pd-xgboost-v2",
    "version": "2.0.0",
    "serving_engine": "ray_serve",
    "mode": "batch",
    "schedule": "0 6 * * *",
    "input": "risk_analytics.credit_risk_features",
    "output": "risk_analytics.pd_scores",
    "resources": {
      "num_replicas": 2,
      "num_cpus": 4,
      "memory_gb": 8
    },
    "governance": {
      "decision_audit_table": "audit.model_decisions",
      "log_inputs": true,
      "log_predictions": true,
      "log_feature_importance": true
    }
  }
}

Stage 7: Feedback

Amir sets up ongoing monitoring to detect model degradation.

Model Performance Monitoring

In the ML Workbench monitoring tab, Amir configures weekly performance tracking:

| Metric | Baseline | Week 1 | Week 2 | Week 3 | Week 4 | Alert Threshold |
|---|---|---|---|---|---|---|
| AUC | 0.843 | 0.841 | 0.839 | 0.835 | 0.832 | < 0.75 |
| Gini | 0.686 | 0.682 | 0.678 | 0.670 | 0.664 | < 0.55 |
| PSI (Population Stability Index) | 0.000 | 0.012 | 0.018 | 0.024 | 0.031 | > 0.10 |
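PSI itself is simple to compute by hand. A minimal sketch over pre-binned score distributions (the bin proportions below are illustrative):

```python
from math import log

def psi(expected_pct, actual_pct):
    # Population Stability Index over pre-binned distributions:
    # PSI = sum over bins of (actual - expected) * ln(actual / expected),
    # where both inputs are per-bin proportions summing to 1
    return sum((a - e) * log(a / e)
               for e, a in zip(expected_pct, actual_pct)
               if e > 0 and a > 0)

# Illustrative score-bin proportions: development window vs. week 4
expected = [0.10, 0.15, 0.20, 0.25, 0.30]
actual = [0.11, 0.14, 0.21, 0.24, 0.30]
print(round(psi(expected, actual), 4))  # 0.0025 -- well under the 0.10 threshold
```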

Feature Drift Detection

{
  "drift_monitor": {
    "name": "credit-risk-feature-drift",
    "model": "credit-risk-pd-xgboost-v2",
    "schedule": "weekly",
    "reference_window": "2025-01-01 to 2025-06-30",
    "features_monitored": [
      {
        "feature": "income_reported",
        "method": "ks_test",
        "threshold": 0.05
      },
      {
        "feature": "debt_to_income",
        "method": "psi",
        "threshold": 0.10
      },
      {
        "feature": "bureau_score",
        "method": "wasserstein",
        "threshold": 15.0
      }
    ],
    "alerts": {
      "channel": "slack",
      "target": "#risk-model-alerts",
      "severity_mapping": {
        "single_feature_drift": "warning",
        "multiple_feature_drift": "critical",
        "performance_degradation": "critical"
      }
    }
  }
}
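The ks_test method configured above compares the live feature distribution against the reference window. A dependency-free sketch of the underlying two-sample Kolmogorov-Smirnov statistic (the income samples are illustrative):

```python
import bisect

def ks_statistic(sample_a, sample_b):
    # Two-sample KS statistic: the largest vertical gap between the two
    # empirical CDFs, evaluated at every observed value
    a, b = sorted(sample_a), sorted(sample_b)
    points = sorted(set(a) | set(b))

    def ecdf(sorted_vals, x):
        # Fraction of values <= x
        return bisect.bisect_right(sorted_vals, x) / len(sorted_vals)

    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in points)

# Illustrative income_reported samples: reference window vs. current week
reference = [40_000, 52_000, 61_000, 75_000, 90_000]
current = [41_000, 55_000, 63_000, 78_000, 95_000]
print(round(ks_statistic(reference, current), 2))  # 0.2
```

In production the drift monitor would compare the statistic (or the associated p-value) against the configured 0.05 threshold rather than eyeballing it.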

Regulatory Model Validation Metrics

For SR 11-7 compliance, Amir generates quarterly validation reports:

| Validation Test | Q1 Result | Status |
|---|---|---|
| Discriminatory power (AUC) | 0.843 | Pass (> 0.70) |
| Calibration (Hosmer-Lemeshow) | p = 0.34 | Pass (p > 0.05) |
| Stability (PSI vs development) | 0.031 | Pass (< 0.10) |
| Concentration (Herfindahl) | 0.08 | Pass (< 0.15) |
| Back-testing (predicted vs actual PD) | 3.2% vs 3.0% | Pass (within 10%) |

Stage 8: Experimentation

Amir runs a champion-challenger test to validate the new model against the existing scorecard in production.

Champion-Challenger Design

┌─────────────────────────────────────────────────────────┐
│              Incoming Loan Applications                  │
│                    (100% traffic)                        │
└────────────────────────┬────────────────────────────────┘
                         │
              ┌──────────┴──────────┐
              │   Random Assignment  │
              │    (by application)  │
              └──────────┬──────────┘
                    ┌────┴────┐
              ┌─────▼─────┐ ┌─▼───────────┐
              │ Champion  │ │ Challenger  │
              │ (70%)     │ │ (30%)       │
              │           │ │             │
              │ Current   │ │ XGBoost v2  │
              │ Scorecard │ │ AUC: 0.843  │
              │ AUC: 0.78 │ │             │
              └─────┬─────┘ └──────┬──────┘
                    │              │
              ┌─────▼──────────────▼──────┐
              │  Both scores logged for   │
              │  every application        │
              │  Decision uses assigned   │
              │  model only               │
              └───────────────────────────┘

Test Configuration

{
  "experiment": {
    "name": "credit-risk-champion-challenger-q1-2026",
    "type": "champion_challenger",
    "champion": {
      "model": "legacy-scorecard-v1",
      "traffic_pct": 70
    },
    "challenger": {
      "model": "credit-risk-pd-xgboost-v2",
      "traffic_pct": 30
    },
    "assignment": {
      "method": "random",
      "unit": "application_id",
      "stratify_by": ["loan_product", "branch_region"]
    },
    "duration_weeks": 12,
    "primary_metric": "auc_on_realized_defaults",
    "guardrail_metrics": [
      {
        "name": "approval_rate",
        "max_deviation": 0.05
      },
      {
        "name": "disparate_impact_ratio",
        "min_value": 0.80
      }
    ],
    "governance": {
      "regulatory_review_required": true,
      "reviewer": "model-validation-team@meridian.bank",
      "sr_11_7_documentation": true
    }
  }
}
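Random assignment by application_id is typically implemented as a deterministic hash split, so re-scoring the same application never flips models. A sketch (the salt and bucket logic are illustrative, not the experiment framework's actual implementation):

```python
import hashlib

def assign_model(application_id, challenger_pct=30, salt="cc-q1-2026"):
    # Hash the application_id into one of 100 buckets; buckets below
    # challenger_pct route to the challenger. The salt decorrelates
    # this experiment's split from any other experiment's.
    digest = hashlib.sha256(f"{salt}:{application_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) % 100
    return "challenger" if bucket < challenger_pct else "champion"

# The same application always gets the same model, so both scores can be
# logged while the decision consistently uses the assigned model only
assert assign_model("APP-2026-00142891") == assign_model("APP-2026-00142891")
```

A hash-based split is also auditable, which matters under SR 11-7: the assignment of any historical application can be reproduced exactly from its ID.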

Interim Results (Week 6 of 12)

| Metric | Champion (Scorecard) | Challenger (XGBoost) | Difference |
|---|---|---|---|
| Applications scored | 14,210 | 6,090 | -- |
| AUC (realized defaults) | 0.779 | 0.838 | +7.6% |
| Approval rate | 68.2% | 71.4% | +3.2pp |
| Expected loss (annualized) | $4.2M | $3.1M | -$1.1M |
| False decline rate | 8.4% | 5.1% | -3.3pp |
| Disparate impact (gender) | 0.89 | 0.91 | +0.02 |
| Disparate impact (ethnicity) | 0.83 | 0.85 | +0.02 |

The challenger model shows higher approval rates with lower expected losses, meaning it better separates good borrowers from risky ones. Fairness metrics are within regulatory thresholds. Amir prepares the model validation documentation for the regulatory review committee before full production rollout.

Governance Audit Trail

Every model decision is logged to audit.model_decisions with full traceability:

SELECT
    decision_id,
    application_id,
    model_name,
    model_version,
    prediction_score,
    decision,
    feature_importance_top5,
    created_at
FROM audit.model_decisions
WHERE application_id = 'APP-2026-00142891'
ORDER BY created_at DESC
LIMIT 1;

| Field | Value |
|---|---|
| decision_id | DEC-9f3a2b... |
| application_id | APP-2026-00142891 |
| model_name | credit-risk-pd-xgboost-v2 |
| model_version | 2.0.0 |
| prediction_score | 0.042 (4.2% PD) |
| decision | APPROVE |
| feature_importance_top5 | bureau_score (0.23), dpd_30_count (0.18), debt_to_income (0.15), utilization (0.12), employment_length (0.09) |

Key Takeaways

| Stage | Key Action | Platform Component |
|---|---|---|
| Ingestion | Connected 3 sources with scheduled syncs | Data Workbench, Airbyte connectors |
| Discovery | Profiled data, found quality issues, traced lineage | Catalog Service, Data Quality Service |
| Query | Built federated feature set across PostgreSQL + Snowflake | Query Engine (SQL federation) |
| Orchestration | Daily pipeline with quality gates | Pipeline Service (Temporal) |
| Analysis | Validated class balance, leakage, correlations | Data Quality Service, SQL analytics |
| Productionization | Trained, registered, deployed model with fairness metrics | ML Workbench, MLflow, Ray Serve |
| Feedback | Weekly performance monitoring, drift detection | ML monitoring, alerting |
| Experimentation | Champion-challenger test with regulatory governance | Experiment framework, Governance Service |

Related Walkthroughs