API Reference

This section documents the ML Service REST API endpoints. The service listens on port 8000 and exposes its API under the /api/v1 prefix. Every endpoint requires JWT authentication and is tenant-scoped.

Authentication

Send the JWT as a bearer token in the Authorization header of every request:

Authorization: Bearer <jwt-token>
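
As a minimal sketch, the header can be attached from Python with the standard library alone. The host `http://localhost:8000` and the helper name `build_request` are assumptions for illustration; only the port and the /api/v1 prefix come from this reference. The request is built but not sent.

```python
import json
import urllib.request

# Assumed host; the port and the /api/v1 prefix are taken from this reference.
BASE_URL = "http://localhost:8000/api/v1"


def build_request(method, path, token, body=None):
    """Build (but do not send) an authenticated request to the ML Service."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(BASE_URL + path, data=data, method=method)
    req.add_header("Authorization", f"Bearer {token}")
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


req = build_request("GET", "/models", token="<jwt-token>")
print(req.get_method(), req.full_url)
print(req.get_header("Authorization"))
```

Passing the result to `urllib.request.urlopen` would send it; any HTTP client that can set headers works equally well.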

Models API

Register Model

POST /api/v1/models
Content-Type: application/json

Request:

{
    "name": "customer_churn_predictor",
    "description": "Predicts customer churn probability",
    "model_type": "classification",
    "framework": "sklearn",
    "tags": {"team": "customer-analytics", "use_case": "churn"}
}

Response (201):

{
    "model_id": "550e8400-e29b-41d4-a716-446655440000",
    "name": "customer_churn_predictor",
    "tenant_id": "acme-corp",
    "current_version": "1.0.0",
    "stage": "development",
    "created_at": "2026-02-12T10:00:00Z"
}

List Models

GET /api/v1/models?stage=production&framework=sklearn&limit=20
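
The filters in the example URL are ordinary query parameters. A small sketch of building that URL in Python follows; the helper name `list_models_url` and the host are hypothetical, and whether the service accepts filters other than `stage`, `framework`, and `limit` is not stated here.

```python
from urllib.parse import urlencode


def list_models_url(base_url, **filters):
    """Return the List Models URL, skipping filters whose value is None."""
    query = urlencode({k: v for k, v in filters.items() if v is not None})
    url = f"{base_url}/api/v1/models"
    return f"{url}?{query}" if query else url


print(list_models_url("http://localhost:8000",
                      stage="production", framework="sklearn", limit=20))
```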

Get Model

GET /api/v1/models/{model_id}

Transition Model Stage

PUT /api/v1/models/{model_id}/stage

Request:

{
    "version": "1.2.0",
    "target_stage": "production"
}

Predictions API

Single Prediction

POST /api/v1/predictions
Content-Type: application/json

Request:

{
    "model_id": "550e8400-e29b-41d4-a716-446655440000",
    "version": "1.2.0",
    "features": {
        "total_orders": 15,
        "days_since_last_order": 45,
        "avg_order_value": 125.50,
        "support_tickets": 3
    },
    "options": {
        "return_probabilities": true,
        "return_explanation": false
    }
}

Response (200):

{
    "request_id": "req-abc-123",
    "model_id": "550e8400-e29b-41d4-a716-446655440000",
    "version": "1.2.0",
    "prediction": [1],
    "probabilities": [[0.25, 0.75]],
    "confidence": 0.75,
    "latency_ms": 12.5
}
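
In the sample response, `confidence` equals the probability of the predicted class (0.75 for class 1). The sketch below reads a response on that basis. It assumes that `prediction` holds class indices into each `probabilities` row and that `probabilities` is omitted when `return_probabilities` is false; neither is confirmed by this reference.

```python
import json

# Trimmed copy of the sample response above.
SAMPLE_RESPONSE = """
{
    "request_id": "req-abc-123",
    "prediction": [1],
    "probabilities": [[0.25, 0.75]],
    "confidence": 0.75,
    "latency_ms": 12.5
}
"""


def summarize_prediction(response):
    """Return (predicted_class, probability_of_that_class) for the first row."""
    predicted = response["prediction"][0]
    probs = response.get("probabilities")
    # Assumption: the predicted class is an index into the probability row.
    prob = probs[0][predicted] if probs else None
    return predicted, prob


label, prob = summarize_prediction(json.loads(SAMPLE_RESPONSE))
print(label, prob)
```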

Batch Prediction

POST /api/v1/predictions/batch

Request:

{
    "model_id": "550e8400-e29b-41d4-a716-446655440000",
    "input_path": "s3://data/customers_to_score.parquet",
    "output_path": "s3://predictions/churn_scores/",
    "batch_size": 1000
}

Response (202):

{
    "job_id": "batch-job-456",
    "status": "submitted",
    "estimated_completion": "2026-02-12T11:00:00Z"
}

Training API

Submit Training Job

POST /api/v1/training
Content-Type: application/json

Request:

{
    "model_name": "customer_churn_v2",
    "framework": "sklearn",
    "data_config": {
        "train_path": "s3://data/train.parquet",
        "validation_path": "s3://data/val.parquet",
        "target_column": "churned",
        "batch_size": 64
    },
    "training_config": {
        "strategy": "data_parallel",
        "num_workers": 2,
        "use_gpu": false,
        "epochs": 50,
        "learning_rate": 0.001,
        "early_stopping": true,
        "patience": 5
    },
    "hyperparameters": {
        "n_estimators": 100,
        "max_depth": 10,
        "min_samples_split": 5
    }
}

Response (202):

{
    "job_id": "train-job-789",
    "status": "submitted",
    "model_name": "customer_churn_v2"
}

Get Training Status

GET /api/v1/training/{job_id}

Response (200):

{
    "job_id": "train-job-789",
    "status": "running",
    "progress": {
        "current_epoch": 25,
        "total_epochs": 50,
        "current_loss": 0.342,
        "best_loss": 0.298,
        "learning_rate": 0.0008
    },
    "metrics": {
        "train_accuracy": 0.92,
        "val_accuracy": 0.88,
        "train_loss": 0.342,
        "val_loss": 0.401
    },
    "resources": {
        "workers": 2,
        "gpus": 0,
        "duration_minutes": 45
    }
}
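
A client will typically poll this endpoint until the job finishes. The sketch below shows one way to do that with an injected `fetch_status` callable, so it runs without a live service. The terminal status names are assumptions: only `submitted` and `running` appear in this reference.

```python
import time

# Assumed terminal statuses; this reference only shows "submitted" and "running".
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_job(fetch_status, job_id, poll_seconds=30.0, max_polls=120,
                 sleep=time.sleep):
    """Poll fetch_status(job_id) until the job reaches a terminal status."""
    for _ in range(max_polls):
        status = fetch_status(job_id)
        progress = status.get("progress", {})
        total = progress.get("total_epochs")
        if total:
            done = progress.get("current_epoch", 0) / total
            print(f"{job_id}: {status['status']} ({done:.0%} of epochs)")
        if status["status"] in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"{job_id} did not finish after {max_polls} polls")


# Stand-in for GET /api/v1/training/{job_id}: two canned responses.
responses = iter([
    {"job_id": "train-job-789", "status": "running",
     "progress": {"current_epoch": 25, "total_epochs": 50}},
    {"job_id": "train-job-789", "status": "completed",
     "progress": {"current_epoch": 50, "total_epochs": 50}},
])
final = wait_for_job(lambda job_id: next(responses), "train-job-789",
                     sleep=lambda seconds: None)
print(final["status"])
```

In real use, `fetch_status` would issue the GET request and return the decoded JSON body.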

Hyperparameter Tuning API

Start Tuning

POST /api/v1/tuning

Request:

{
    "model_name": "customer_churn_v2",
    "search_space": {
        "n_estimators": {"type": "choice", "values": [50, 100, 200, 500]},
        "max_depth": {"type": "randint", "min": 3, "max": 20},
        "learning_rate": {"type": "loguniform", "min": 0.0001, "max": 0.1},
        "min_samples_split": {"type": "choice", "values": [2, 5, 10]}
    },
    "objective_metric": "val_accuracy",
    "mode": "max",
    "num_trials": 50,
    "algorithm": "bayesian",
    "max_concurrent_trials": 4
}
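
To make the three search-space types concrete, the following local sketch draws one trial configuration from a space shaped like the request above. It illustrates the presumed meaning of `choice`, `randint`, and `loguniform`; it is not the service's tuner, and treating `randint` bounds as inclusive is an assumption.

```python
import math
import random


def sample_param(spec, rng):
    """Draw one value for a single search-space entry."""
    kind = spec["type"]
    if kind == "choice":
        return rng.choice(spec["values"])
    if kind == "randint":
        # Assumption: both bounds are inclusive.
        return rng.randint(spec["min"], spec["max"])
    if kind == "loguniform":
        # Uniform in log space, so each decade is equally likely.
        value = math.exp(rng.uniform(math.log(spec["min"]), math.log(spec["max"])))
        # Clamp to guard against floating-point overshoot at the bounds.
        return min(max(value, spec["min"]), spec["max"])
    raise ValueError(f"unknown search space type: {kind}")


def sample_trial(search_space, seed=None):
    """Draw one full trial configuration."""
    rng = random.Random(seed)
    return {name: sample_param(spec, rng) for name, spec in search_space.items()}


SEARCH_SPACE = {
    "n_estimators": {"type": "choice", "values": [50, 100, 200, 500]},
    "max_depth": {"type": "randint", "min": 3, "max": 20},
    "learning_rate": {"type": "loguniform", "min": 0.0001, "max": 0.1},
    "min_samples_split": {"type": "choice", "values": [2, 5, 10]},
}
print(sample_trial(SEARCH_SPACE, seed=0))
```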

Ensemble API

Create Ensemble

POST /api/v1/ensembles

Request:

{
    "name": "churn_ensemble",
    "method": "voting_soft",
    "models": [
        {"model_id": "model-1", "weight": 0.4},
        {"model_id": "model-2", "weight": 0.35},
        {"model_id": "model-3", "weight": 0.25}
    ],
    "parallel_inference": true,
    "fail_on_partial": false,
    "min_models_required": 2
}
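
The `voting_soft` method presumably combines the member models' class probabilities as a weighted average. The sketch below shows that computation, including how `fail_on_partial: false` and `min_models_required` might interact when one model fails. Renormalizing the weights over the models that responded is an assumption, not documented service behavior.

```python
def soft_vote(model_probs, weights, min_models_required=1):
    """Weighted average of per-model class probabilities.

    model_probs maps model_id to a list of class probabilities,
    or to None when that model failed to respond.
    """
    available = {m: p for m, p in model_probs.items() if p is not None}
    if len(available) < min_models_required:
        raise RuntimeError(
            f"only {len(available)} model(s) responded, "
            f"{min_models_required} required"
        )
    # Assumption: weights are renormalized over the models that responded.
    total_weight = sum(weights[m] for m in available)
    n_classes = len(next(iter(available.values())))
    return [
        sum(weights[m] * p[c] for m, p in available.items()) / total_weight
        for c in range(n_classes)
    ]


WEIGHTS = {"model-1": 0.4, "model-2": 0.35, "model-3": 0.25}
probs = soft_vote(
    {"model-1": [0.2, 0.8], "model-2": [0.4, 0.6], "model-3": None},
    WEIGHTS,
    min_models_required=2,
)
print([round(p, 3) for p in probs])
```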

Ensemble Prediction

POST /api/v1/ensembles/{ensemble_id}/predict

Feature Store API

Get Online Features

POST /api/v1/features/online

Request:

{
    "feature_refs": [
        "customer_features:total_orders",
        "customer_features:avg_order_value",
        "customer_features:days_since_last_order"
    ],
    "entity_rows": [
        {"customer_id": "cust-123"},
        {"customer_id": "cust-456"}
    ]
}

Response (200):

{
    "results": [
        {
            "customer_id": "cust-123",
            "total_orders": 15,
            "avg_order_value": 125.50,
            "days_since_last_order": 12
        },
        {
            "customer_id": "cust-456",
            "total_orders": 42,
            "avg_order_value": 89.99,
            "days_since_last_order": 3
        }
    ]
}
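
Feature references in the request use the form `view:feature`, while each result row is keyed by the bare feature name alongside the entity key. As a sketch, a result row can be turned into the `features` object of a prediction request as follows; the helper names are hypothetical.

```python
def split_feature_ref(ref):
    """Split 'customer_features:total_orders' into (view, feature)."""
    view, _, feature = ref.partition(":")
    return view, feature


def to_prediction_features(row, feature_refs):
    """Pick the requested features out of one result row, dropping the entity key."""
    names = [split_feature_ref(ref)[1] for ref in feature_refs]
    return {name: row[name] for name in names}


FEATURE_REFS = [
    "customer_features:total_orders",
    "customer_features:avg_order_value",
    "customer_features:days_since_last_order",
]
row = {
    "customer_id": "cust-123",
    "total_orders": 15,
    "avg_order_value": 125.50,
    "days_since_last_order": 12,
}
print(to_prediction_features(row, FEATURE_REFS))
```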

Trigger Materialization

POST /api/v1/features/materialize

Request:

{
    "feature_views": ["customer_features", "order_features"],
    "incremental": true
}

Monitoring API

Get Drift Report

GET /api/v1/monitoring/{model_id}/drift?features=total_orders,avg_order_value

Response (200):

{
    "model_id": "550e8400-e29b-41d4-a716-446655440000",
    "overall_drift": true,
    "features_checked": 2,
    "features_drifted": 1,
    "results": {
        "total_orders": {
            "drifted": false,
            "statistic": 0.045,
            "p_value": 0.34,
            "method": "ks_2samp"
        },
        "avg_order_value": {
            "drifted": true,
            "statistic": 0.182,
            "p_value": 0.003,
            "method": "ks_2samp"
        }
    }
}
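
The `ks_2samp` method name matches SciPy's two-sample Kolmogorov-Smirnov test, where a small p-value indicates that the two samples likely come from different distributions. The sketch below condenses a report into the drifted feature names, ordered from strongest to weakest evidence. It relies on the per-feature `drifted` flag as returned; the p-value threshold the service applies is not stated in this reference.

```python
def summarize_drift(report):
    """Return drifted feature names (lowest p-value first) and the drifted fraction."""
    results = report["results"]
    drifted = sorted(
        (name for name, r in results.items() if r["drifted"]),
        key=lambda name: results[name]["p_value"],
    )
    return {
        "drifted": drifted,
        "fraction": len(drifted) / report["features_checked"],
    }


REPORT = {
    "overall_drift": True,
    "features_checked": 2,
    "features_drifted": 1,
    "results": {
        "total_orders": {"drifted": False, "statistic": 0.045,
                         "p_value": 0.34, "method": "ks_2samp"},
        "avg_order_value": {"drifted": True, "statistic": 0.182,
                            "p_value": 0.003, "method": "ks_2samp"},
    },
}
summary = summarize_drift(REPORT)
print(summary)
```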

Get Performance Report

GET /api/v1/monitoring/{model_id}/performance?period_hours=24

Deployment API

Deploy Model

POST /api/v1/deployments

Request:

{
    "model_id": "550e8400-e29b-41d4-a716-446655440000",
    "version": "1.2.0",
    "runtime": "ray_serve",
    "num_replicas": 2,
    "autoscaling": {
        "min_replicas": 1,
        "max_replicas": 10,
        "target_requests_per_replica": 5
    },
    "resources": {
        "cpu": 2,
        "memory_gb": 4,
        "gpu": 0
    }
}

Error Responses

Status	Code	Description
400	`INVALID_REQUEST`	Malformed request
401	`UNAUTHORIZED`	Invalid token
404	`MODEL_NOT_FOUND`	Model does not exist
404	`JOB_NOT_FOUND`	Training job does not exist
409	`STAGE_CONFLICT`	Invalid stage transition
422	`VALIDATION_ERROR`	Feature validation failed
429	`RATE_LIMITED`	Too many requests
500	`INTERNAL_ERROR`	Server error
503	`RAY_UNAVAILABLE`	Ray cluster unreachable
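
Clients usually treat some of these errors as transient and retry them with backoff. The sketch below shows one possible policy. Which codes are safe to retry is an assumption (here only `RATE_LIMITED` and `RAY_UNAVAILABLE`), and because the layout of the error response body is not documented in this section, the status and code are passed in directly.

```python
# Assumed transient errors; all other codes are treated as permanent.
RETRYABLE_CODES = {"RATE_LIMITED", "RAY_UNAVAILABLE"}


def should_retry(status, code):
    """Decide whether a failed request is worth retrying."""
    return status in (429, 503) and code in RETRYABLE_CODES


def backoff_delays(retries, base=0.5, cap=30.0):
    """Exponential backoff delays in seconds: base, 2*base, 4*base, ... up to cap."""
    return [min(cap, base * 2 ** attempt) for attempt in range(retries)]


print(should_retry(429, "RATE_LIMITED"))
print(should_retry(404, "MODEL_NOT_FOUND"))
print(backoff_delays(4))
```

`INTERNAL_ERROR` is left out of the retry set on purpose, since repeating a request that triggered a server fault may fail the same way again; a different policy may suit idempotent GET requests.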

A/B Testing Ensemble