# AutoML Pipeline Orchestration
The AutoMLOrchestrator manages the end-to-end AutoML pipeline from job submission through model selection and final evaluation.
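For orientation, here is a hypothetical sketch of the job object such an orchestrator might track. The `status` and `progress` fields are assumptions for illustration; `job_id`, `algorithm_configs`, and `best_model_id` mirror names that appear in the execution flow below, but the real field set lives in the source files listed at the end of this page.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class JobStatus(Enum):
    """Assumed lifecycle states for an AutoML job."""
    QUEUED = "queued"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


@dataclass
class AutoMLJob:
    """Illustrative job record; field names beyond the source's are assumptions."""
    job_id: str
    algorithm_configs: List[dict] = field(default_factory=list)
    status: JobStatus = JobStatus.QUEUED
    progress: float = 0.0  # 0-100, advanced at each pipeline stage
    best_model_id: Optional[str] = None
```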
## Orchestrator Configuration

```python
orchestrator = AutoMLOrchestrator(max_concurrent_jobs=3)
```

The orchestrator maintains a job queue and limits concurrent executions to manage resource usage.
## Job Execution Flow
```python
async def _execute_job(self, job_id):
    # Step 1: Data preparation (10% progress)
    await self._prepare_data(job)

    # Step 2: Train each algorithm (10-70% progress)
    for algo_config in job.algorithm_configs:
        runs = await self._train_algorithm(job, algo_config)

    # Step 3: Hyperparameter tuning on best algorithm (70-90%)
    tuned_run = await self._tune_hyperparameters(job, best_run)

    # Step 4: Final evaluation and model saving (90-100%)
    model = await self._save_model(job, best_run)
    job.best_model_id = model.id
```

## Event Callbacks
Register callbacks for job lifecycle events:
```python
orchestrator.on("job_started", lambda job: notify_user(job))
orchestrator.on("job_completed", lambda job: register_model(job))
orchestrator.on("job_failed", lambda job: alert_team(job))
```

## Source Files
| File | Path |
|---|---|
| AutoML Orchestrator | data-plane/ml-service/src/automl/automl_orchestrator.py |
| Enhanced AutoML Service | data-plane/ml-service/src/automl/enhanced_automl_service.py |
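The `on()` registration shown under Event Callbacks can be sketched as a minimal event emitter. The `emit()` method and the internal handler store below are assumptions for illustration; the orchestrator's real dispatch logic is in the source files above.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List


class EventEmitter:
    """Sketch of an on()/emit() registry for job lifecycle events."""

    def __init__(self) -> None:
        # Maps an event name to the list of handlers registered for it.
        self._handlers: Dict[str, List[Callable[..., Any]]] = defaultdict(list)

    def on(self, event: str, handler: Callable[..., Any]) -> None:
        # Multiple handlers may subscribe to the same event.
        self._handlers[event].append(handler)

    def emit(self, event: str, *args: Any) -> None:
        # Invoke every handler registered for this event, in order.
        for handler in list(self._handlers[event]):
            handler(*args)
```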