MATIH Platform is in active MVP development. Documentation reflects current implementation status.

Experiment Tracking

The Experiment Tracking integration provides a conversational interface for managing ML experiments, comparing runs, and analyzing training metrics. It wraps MLflow Tracking functionality through the ML Service, enabling users to query experiment data and visualize results through natural language.


Tracking Architecture

Experiment tracking data flows through the ML Service to the MLflow backend:

AI Service (Tracking API) --> ML Service --> MLflow Tracking Server --> PostgreSQL (metrics store)
                                                                    --> MinIO/S3 (artifact store)

Core Concepts

| Concept | Description |
| --- | --- |
| Experiment | A named collection of related training runs |
| Run | A single training execution with parameters, metrics, and artifacts |
| Metric | A numeric value logged during training (loss, accuracy, F1) |
| Parameter | A hyperparameter value used in the run |
| Artifact | A file produced by the run (model, plots, data samples) |
| Tag | A key-value label for organizing and filtering runs |
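The concepts above can be modeled client-side with plain dataclasses. This is an illustrative sketch, not an official SDK; the class and field names are ours, chosen to mirror the table.

```python
from dataclasses import dataclass, field

@dataclass
class Run:
    run_id: str
    parameters: dict = field(default_factory=dict)  # hyperparameters used in the run
    metrics: dict = field(default_factory=dict)     # numeric values logged during training
    artifacts: list = field(default_factory=list)   # files produced by the run
    tags: dict = field(default_factory=dict)        # key-value labels for filtering

@dataclass
class Experiment:
    name: str
    runs: list = field(default_factory=list)
    tags: dict = field(default_factory=dict)

    def best_run(self, metric: str) -> Run:
        """Return the run with the highest value for the given metric."""
        return max(self.runs, key=lambda r: r.metrics.get(metric, float("-inf")))
```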

Create Experiment

POST /api/v1/ml/experiments
{
  "name": "churn-prediction-q1",
  "description": "Customer churn prediction experiments for Q1 2025",
  "tags": {
    "project": "customer-retention",
    "team": "data-science"
  }
}

List Experiments

GET /api/v1/ml/experiments?tenant_id=acme-corp

Response

{
  "experiments": [
    {
      "id": "exp-001",
      "name": "churn-prediction-q1",
      "run_count": 24,
      "best_metric": {"f1_score": 0.912},
      "status": "active",
      "created_at": "2025-01-15T08:00:00Z"
    }
  ]
}
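A caller might reduce this response to one summary line per experiment. The field names follow the sample JSON above; the helper itself is illustrative.

```python
def summarize_experiments(response: dict) -> list[str]:
    """One summary line per experiment: name, run count, best metric."""
    lines = []
    for exp in response.get("experiments", []):
        best = exp.get("best_metric", {})
        metric_str = ", ".join(f"{k}={v}" for k, v in best.items())
        lines.append(f"{exp['name']}: {exp['run_count']} runs ({metric_str})")
    return lines
```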

List Runs

GET /api/v1/ml/experiments/:experiment_id/runs?sort_by=metrics.f1_score&order=desc

Response

{
  "runs": [
    {
      "run_id": "run-017",
      "status": "completed",
      "parameters": {
        "algorithm": "xgboost",
        "n_estimators": 200,
        "learning_rate": 0.05
      },
      "metrics": {
        "f1_score": 0.912,
        "accuracy": 0.95,
        "training_loss": 0.082
      },
      "duration_seconds": 342,
      "created_at": "2025-03-15T10:00:00Z"
    }
  ]
}
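The `sort_by=metrics.f1_score&order=desc` behaviour can be reproduced client-side on a runs payload like the one above. The dotted-path convention matches the query parameter; the helper name is ours.

```python
def sort_runs(runs: list[dict], sort_by: str, descending: bool = True) -> list[dict]:
    """Sort run dicts by a dotted path such as 'metrics.f1_score'."""
    def key(run):
        value = run
        for part in sort_by.split("."):
            value = value.get(part, {}) if isinstance(value, dict) else {}
        # Runs missing the metric sort last when descending
        return value if isinstance(value, (int, float)) else float("-inf")
    return sorted(runs, key=key, reverse=descending)
```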

Compare Runs

Compares multiple runs side by side for metric and parameter analysis:

POST /api/v1/ml/experiments/:experiment_id/compare
{
  "run_ids": ["run-015", "run-016", "run-017"],
  "metrics": ["f1_score", "accuracy", "auc_roc"],
  "parameters": ["n_estimators", "learning_rate", "max_depth"]
}

Response

{
  "comparison": {
    "runs": [
      {"run_id": "run-015", "f1_score": 0.88, "accuracy": 0.93},
      {"run_id": "run-016", "f1_score": 0.90, "accuracy": 0.94},
      {"run_id": "run-017", "f1_score": 0.912, "accuracy": 0.95}
    ],
    "best_run": "run-017",
    "parameter_impact": {
      "learning_rate": {"correlation_with_f1": -0.72},
      "n_estimators": {"correlation_with_f1": 0.85}
    }
  }
}
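The `parameter_impact` correlations above are plain Pearson correlations between a parameter's values and a metric across the compared runs. A minimal sketch, assuming every compared run logged both the parameter and the metric:

```python
import math

def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```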

Metric History

Retrieves metric values logged over training steps for a specific run:

GET /api/v1/ml/experiments/:experiment_id/runs/:run_id/metrics/:metric_name

Response

{
  "metric": "training_loss",
  "steps": [
    {"step": 0, "value": 0.693, "timestamp": "2025-03-15T10:00:00Z"},
    {"step": 10, "value": 0.412, "timestamp": "2025-03-15T10:01:00Z"},
    {"step": 20, "value": 0.185, "timestamp": "2025-03-15T10:02:00Z"},
    {"step": 30, "value": 0.082, "timestamp": "2025-03-15T10:03:00Z"}
  ]
}
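A metric-history response like the one above is easy to turn into plot-ready points plus a headline number. This hypothetical helper reports the overall loss reduction; field names follow the sample JSON.

```python
def loss_summary(history: dict) -> dict:
    """Extract (step, value) pairs and the percent reduction from first to last step."""
    points = [(s["step"], s["value"]) for s in history["steps"]]
    first, last = points[0][1], points[-1][1]
    return {
        "metric": history["metric"],
        "points": points,
        "reduction_pct": round(100 * (first - last) / first, 1),
    }
```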

Conversational Queries

Users can query experiment data through natural language:

| User Query | Resolved Action |
| --- | --- |
| "Show me all experiments for churn prediction" | List experiments filtered by tag |
| "Which run had the highest F1 score?" | Sort runs by metric descending |
| "Compare the last 3 runs" | Side-by-side metric comparison |
| "Plot the training loss curve for run 17" | Metric history visualization |

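The query-to-action mapping above can be sketched as keyword routing. This is purely illustrative: the action names are ours, and the real resolver presumably uses the LLM rather than regular expressions.

```python
import re

# Order matters: more specific intents are checked first.
ROUTES = [
    (re.compile(r"\bcompare\b", re.I), "compare_runs"),
    (re.compile(r"\b(highest|best|top)\b", re.I), "sort_runs_by_metric"),
    (re.compile(r"\b(plot|curve|chart)\b", re.I), "metric_history"),
    (re.compile(r"\bexperiments?\b", re.I), "list_experiments"),
]

def resolve(query: str) -> str:
    """Map a natural-language query to a tracking action name."""
    for pattern, action in ROUTES:
        if pattern.search(query):
            return action
    return "unknown"
```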
Configuration

| Environment Variable | Default | Description |
| --- | --- | --- |
| MLFLOW_TRACKING_URI | http://mlflow:5000 | MLflow server URL |
| TRACKING_MAX_RUNS_PER_QUERY | 100 | Max runs returned per query |
| TRACKING_METRIC_HISTORY_LIMIT | 1000 | Max metric history steps |