Model Selection

Automated model selection evaluates multiple algorithm families to find the best-performing model for a given task and dataset.

Algorithm Types

| Type | Algorithms | Best For |
|------|------------|----------|
| `linear` | Logistic Regression, Linear SVM, Ridge | Linearly separable data, baseline models |
| `tree` | Decision Trees, Random Forest | Interpretable models, tabular data |
| `ensemble` | XGBoost, LightGBM, CatBoost | High performance on tabular data |
| `neural` | MLP, CNN, RNN | Complex patterns, large datasets |
| `svm` | SVM with RBF/Polynomial kernels | Small-medium datasets |
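As an illustration only, the type keys and their algorithms could be held in a small lookup structure like the one below. The names `AlgorithmType`, `ALGORITHMS_BY_TYPE`, and `algorithms_for` are hypothetical and are not taken from `model_selection.py`; only the keys and algorithm names come from the table.

```python
from __future__ import annotations

from enum import Enum


class AlgorithmType(str, Enum):
    """Hypothetical enum of the type keys listed in the table."""
    LINEAR = "linear"
    TREE = "tree"
    ENSEMBLE = "ensemble"
    NEURAL = "neural"
    SVM = "svm"


# Algorithm names per type, copied from the table (the mapping itself is an assumption).
ALGORITHMS_BY_TYPE: dict[AlgorithmType, list[str]] = {
    AlgorithmType.LINEAR: ["Logistic Regression", "Linear SVM", "Ridge"],
    AlgorithmType.TREE: ["Decision Trees", "Random Forest"],
    AlgorithmType.ENSEMBLE: ["XGBoost", "LightGBM", "CatBoost"],
    AlgorithmType.NEURAL: ["MLP", "CNN", "RNN"],
    AlgorithmType.SVM: ["SVM with RBF/Polynomial kernels"],
}


def algorithms_for(type_key: str) -> list[str]:
    """Return the algorithm names for a type key; unknown keys raise ValueError."""
    return ALGORITHMS_BY_TYPE[AlgorithmType(type_key)]


print(algorithms_for("ensemble"))
```

Validating the key through the enum means a misspelled type fails early instead of silently skipping an algorithm family.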

Selection Process

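The AutoML orchestrator trains each configured algorithm type in turn and keeps the run with the highest score. A run's score is its `accuracy` metric when present, otherwise its `r2`, otherwise 0: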

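# Assumed initialisation; the source excerpt does not show it.
best_run, best_score = None, float("-inf")
for algo_config in job.algorithm_configs:
    # Training one algorithm type can yield several runs.
    runs = await self._train_algorithm(job, algo_config)
    for run in runs:
        # Prefer accuracy (classification); fall back to r2 (regression), then 0.
        score = run.metrics.get("accuracy", run.metrics.get("r2", 0))
        if score > best_score:
            best_score = score
            best_run = run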
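The comparison step can be tried in isolation with a minimal, synchronous sketch. Here `Run`, `run_score`, and `select_best_run` are hypothetical stand-ins, training is replaced by hard-coded metrics, and the starting value of `-inf` is an assumption; only the metric fallback chain and the strict `>` comparison mirror the orchestrator excerpt.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Run:
    """Hypothetical stand-in for a training run; only the fields used here."""
    algorithm: str
    metrics: dict = field(default_factory=dict)


def run_score(run: Run) -> float:
    # Same fallback chain as the excerpt: accuracy, then r2, then 0.
    return run.metrics.get("accuracy", run.metrics.get("r2", 0))


def select_best_run(runs: list[Run]) -> Optional[Run]:
    """Return the highest-scoring run, or None when there are no runs."""
    best_run: Optional[Run] = None
    best_score = float("-inf")  # assumed starting value
    for run in runs:
        score = run_score(run)
        if score > best_score:  # strict ">" keeps the earliest run on ties
            best_score = score
            best_run = run
    return best_run


# Made-up metrics, for illustration only.
runs = [
    Run("linear", {"accuracy": 0.81}),
    Run("tree", {"accuracy": 0.86}),
    Run("ensemble", {"accuracy": 0.91}),
    Run("svm", {"accuracy": 0.91}),  # ties with "ensemble" but comes later
    Run("neural", {}),               # no metrics, so it scores 0
]
print(select_best_run(runs).algorithm)  # ensemble
```

Two properties of this pattern are worth keeping in mind: a run that reports neither metric scores 0 rather than being skipped, and `accuracy` and `r2` are compared on the same scale, so the comparison is only meaningful when all runs in a job report the same metric.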

Source Files

| File | Path |
|------|------|
| Model Selection | `data-plane/ml-service/src/automl/model_selection.py` |
| AutoML Orchestrator | `data-plane/ml-service/src/automl/automl_orchestrator.py` |
