Hyperparameter Tuning
The Hyperparameter Tuning integration provides automated search over model hyperparameter spaces using Ray Tune. Users can define parameter ranges, select search strategies, and configure early stopping criteria through the AI Service conversational interface or REST API.
Tuning Workflow
- Define space: Specify parameter ranges and distributions
- Select strategy: Choose a search algorithm (grid, random, Bayesian, etc.)
- Submit sweep: Launch the tuning job via the ML Service
- Monitor: Track trial progress and intermediate metrics
- Select best: Retrieve the best configuration and retrain
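The steps above can be sketched as a small client loop. The `post` and `get` callables here are hypothetical stand-ins for your HTTP transport (not a documented SDK); the endpoint paths are those listed under the Tuning API section.

```python
# Sketch of the sweep lifecycle: submit, poll until done, fetch best.
# `post` and `get` are injected transport callables (e.g. thin wrappers
# around your HTTP client) -- stand-ins, not part of the ML Service.
def run_sweep(post, get, sweep_spec):
    sweep = post("/api/v1/ml/tune", sweep_spec)       # submit the sweep
    status_url = f"/api/v1/ml/tune/{sweep['sweep_id']}"
    while get(status_url)["status"] not in ("completed", "failed"):
        pass  # in practice: sleep between polls, log trial metrics
    return get(status_url + "/best")                  # best configuration
```

Injecting the transport keeps the lifecycle logic testable without a live service.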
Search Strategies
| Strategy | Description | Best For |
|---|---|---|
| Grid Search | Exhaustive search over all combinations | Small parameter spaces |
| Random Search | Random sampling from distributions | Large parameter spaces |
| Bayesian (Optuna) | Sequential model-based optimization | Moderate spaces with expensive trials |
| HyperBand | Adaptive resource allocation with early stopping | Many trials with limited budget |
| ASHA | Asynchronous successive halving | Distributed tuning at scale |
| PBT | Population-based training | Neural network training |
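To make the halving-style schedulers (HyperBand, ASHA) concrete, here is a minimal sketch of one successive-halving bracket: each rung keeps only the top 1/eta of trials, so survivors earn a larger share of the budget. This is illustrative only; Ray Tune supplies the production schedulers.

```python
def successive_halving(scores, eta=3):
    """One bracket of successive halving. `scores` maps trial id ->
    metric value (higher is better); returns surviving ids per rung."""
    survivors = dict(scores)
    rungs = [sorted(survivors)]
    while len(survivors) > 1:
        keep = max(1, len(survivors) // eta)  # keep the top 1/eta each rung
        top = sorted(survivors, key=survivors.get, reverse=True)[:keep]
        survivors = {t: survivors[t] for t in top}
        rungs.append(sorted(survivors))
    return rungs
```

With 9 trials and eta=3, the rungs shrink 9 → 3 → 1; ASHA applies the same rule asynchronously so fast trials need not wait for slow ones.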
Parameter Space Definition
```json
{
  "model_name": "churn-predictor",
  "search_strategy": "bayesian",
  "max_trials": 50,
  "metric": "f1_score",
  "mode": "max",
  "parameter_space": {
    "n_estimators": {"type": "choice", "values": [50, 100, 200, 500]},
    "max_depth": {"type": "randint", "lower": 3, "upper": 12},
    "learning_rate": {"type": "loguniform", "lower": 0.001, "upper": 0.3},
    "subsample": {"type": "uniform", "lower": 0.6, "upper": 1.0},
    "colsample_bytree": {"type": "uniform", "lower": 0.6, "upper": 1.0}
  },
  "early_stopping": {
    "patience": 10,
    "min_delta": 0.001
  }
}
```
Tuning API
Start Tuning Sweep
POST /api/v1/ml/tune
Get Sweep Status
GET /api/v1/ml/tune/:sweep_id
Get Best Trial
GET /api/v1/ml/tune/:sweep_id/best
List All Trials
GET /api/v1/ml/tune/:sweep_id/trials
Tuning Results
```json
{
  "sweep_id": "sweep-abc123",
  "status": "completed",
  "total_trials": 50,
  "completed_trials": 48,
  "stopped_early": 2,
  "best_trial": {
    "trial_id": "trial-017",
    "parameters": {
      "n_estimators": 200,
      "max_depth": 8,
      "learning_rate": 0.05,
      "subsample": 0.85,
      "colsample_bytree": 0.78
    },
    "metrics": {
      "f1_score": 0.912,
      "accuracy": 0.95,
      "auc_roc": 0.97
    }
  },
  "duration_seconds": 3600
}
```
Scheduler Configuration
| Scheduler | Description |
|---|---|
| FIFO | Runs trials in order (no early stopping) |
| MedianStopping | Stops trials below median performance |
| HyperBand | Brackets of successive halving |
| ASHA | Asynchronous version of HyperBand |
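The MedianStopping rule reduces to a simple test: stop a trial whose result at a given step falls below the median of its peers' results at the same step. A minimal illustration (Ray Tune's production scheduler is more nuanced, e.g. it compares against running averages and enforces a grace period):

```python
import statistics

def should_stop(trial_result, peer_results):
    """Stop a trial if its result is below the median of peer trials'
    results at the same training step. Higher is assumed better."""
    if not peer_results:
        return False  # nothing to compare against yet
    return trial_result < statistics.median(peer_results)
```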
Resource Allocation
Tuning jobs run trials in parallel on Ray, with resources allocated per trial:
| Setting | Default | Description |
|---|---|---|
| max_concurrent_trials | 4 | Parallel trials per sweep |
| cpu_per_trial | 2 | CPU cores per trial |
| memory_per_trial | 4 GB | Memory per trial |
| gpu_per_trial | 0 | GPUs per trial |
| max_duration_hours | 2 | Maximum sweep duration |
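Given these settings, the number of trials that actually run at once is bounded by both max_concurrent_trials and the cluster's free resources. A quick way to sanity-check a sweep's footprint (an illustrative helper, not a service API):

```python
def effective_parallel_trials(cluster_cpus, cluster_memory_gb,
                              max_concurrent_trials=4,
                              cpu_per_trial=2, memory_per_trial_gb=4):
    """How many trials can run simultaneously: the sweep's concurrency
    cap, further limited by available CPU and memory."""
    by_cpu = cluster_cpus // cpu_per_trial
    by_mem = cluster_memory_gb // memory_per_trial_gb
    return min(max_concurrent_trials, by_cpu, by_mem)
```

For example, a 16-core, 32 GB worker supports the full default of 4 concurrent trials, while a 4-core worker caps the sweep at 2 regardless of the configured concurrency.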