# ML Integration Overview
The ML Integration module bridges the AI Service with the ML Service, enabling conversational interfaces for machine learning workflows. Through this integration, users can train models, tune hyperparameters, engineer features, serve predictions, manage the model registry, track experiments, and perform exploratory data analysis, all through natural language or the ML API endpoints.
## Integration Architecture

The ML Integration operates as a feature-flagged module within the AI Service, controlled by `MODULE_ML_ENABLED`:
```
   BI Workbench / ML Workbench
                |
       AI Service (ML Module)
                |
 +----+----+----+----+----+----+
 |    |    |    |    |    |    |
Train Tune Feat Serve Reg Track EDA
 |    |    |    |    |    |    |
 +----+----+----+----+----+----+
                |
       ML Service (Ray AIR)
```

## Module Components
| Component | Description | Source |
|---|---|---|
| Model Training | Submit and monitor training jobs | `src/ml/training/` |
| Hyperparameter Tuning | Configure and launch HPO sweeps | `src/ml/tuning/` |
| Feature Engineering | Feature set creation and management | `src/ml/features/` |
| Model Serving | Deploy models for real-time inference | `src/ml/serving/` |
| Model Registry | Version, stage, and catalog models | `src/ml/registry/` |
| Experiment Tracking | Track runs, metrics, and artifacts | `src/ml/tracking/` |
| Exploratory Data Analysis | Statistical profiling and visualization | `src/ml/eda/` |
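Since the whole module is gated behind `MODULE_ML_ENABLED`, route registration can be guarded by a single check. A minimal sketch, assuming a list-based route registry; the function names and the `/api/v1/ml` prefix are illustrative, not the service's actual API:

```python
import os


def is_ml_module_enabled() -> bool:
    """Read the MODULE_ML_ENABLED flag (defaults to true, per the config table)."""
    return os.getenv("MODULE_ML_ENABLED", "true").lower() in ("1", "true", "yes")


def register_modules(app_routes: list) -> list:
    """Append ML routes only when the module is enabled."""
    if is_ml_module_enabled():
        app_routes.append("/api/v1/ml")  # illustrative route prefix
    return app_routes
```

With the flag unset or `true`, the ML routes are mounted; setting it to `false` leaves the rest of the AI Service untouched.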
## Communication Pattern

The AI Service communicates with the ML Service over HTTP REST:
```python
import httpx


class MLServiceClient:
    def __init__(self, base_url: str):
        self.base_url = base_url  # e.g. http://ml-service:8000

    async def _post(self, path: str, payload: dict) -> dict:
        # One possible transport implementation, using httpx
        async with httpx.AsyncClient(base_url=self.base_url) as client:
            response = await client.post(path, json=payload)
            response.raise_for_status()
            return response.json()

    async def submit_training_job(self, config: TrainingConfig) -> TrainingJob:
        response = await self._post("/api/v1/training/submit", config.dict())
        return TrainingJob(**response)

    async def get_prediction(self, model_id: str, features: dict) -> Prediction:
        response = await self._post(f"/api/v1/serving/{model_id}/predict", features)
        return Prediction(**response)
```

## Conversational ML
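The payload models referenced by the client (`TrainingConfig`, `TrainingJob`, `Prediction`) are not shown in this overview. A stdlib-dataclass sketch of the roles they play; all field names here are assumptions, and the real service likely uses Pydantic models:

```python
from dataclasses import asdict, dataclass


@dataclass
class TrainingConfig:
    # Field names are illustrative, not the service's actual schema
    dataset_id: str
    target_column: str
    algorithm: str = "xgboost"

    def dict(self) -> dict:
        # Mirrors the .dict() call used by MLServiceClient.submit_training_job
        return asdict(self)


@dataclass
class TrainingJob:
    job_id: str
    status: str


@dataclass
class Prediction:
    model_id: str
    output: dict


# A response body like the one /api/v1/training/submit might return
job = TrainingJob(**{"job_id": "job-123", "status": "queued"})
```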
Users can interact with ML workflows through natural language in the chat interface:
| User Query | ML Action | Agent Involved |
|---|---|---|
| "Train a model to predict customer churn" | Submit training job | ML Training Agent |
| "Tune the learning rate for my churn model" | Launch HPO sweep | ML Tuning Agent |
| "What features are most important for churn?" | Feature importance analysis | ML Analysis Agent |
| "Deploy the best churn model to production" | Model deployment | ML Serving Agent |
| "How accurate is my deployed model?" | Performance metrics query | ML Monitoring Agent |
| "Profile the customer dataset" | EDA execution | ML EDA Agent |
## Configuration

| Environment Variable | Default | Description |
|---|---|---|
| `MODULE_ML_ENABLED` | `true` | Enable ML Integration module |
| `ML_SERVICE_URL` | `http://ml-service:8000` | ML Service base URL |
| `ML_SERVICE_TIMEOUT` | `30` | Request timeout in seconds |
| `ML_MAX_TRAINING_JOBS` | `5` | Max concurrent training jobs per tenant |
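A minimal sketch of loading these variables into a typed settings object using only the stdlib. The defaults mirror the table; the class name and field names are assumptions:

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class MLModuleSettings:
    module_ml_enabled: bool
    ml_service_url: str
    ml_service_timeout: int
    ml_max_training_jobs: int

    @classmethod
    def from_env(cls) -> "MLModuleSettings":
        # Defaults mirror the configuration table above
        return cls(
            module_ml_enabled=os.getenv("MODULE_ML_ENABLED", "true").lower() == "true",
            ml_service_url=os.getenv("ML_SERVICE_URL", "http://ml-service:8000"),
            ml_service_timeout=int(os.getenv("ML_SERVICE_TIMEOUT", "30")),
            ml_max_training_jobs=int(os.getenv("ML_MAX_TRAINING_JOBS", "5")),
        )
```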
## Detailed Sections
| Section | Content |
|---|---|
| Model Training | Training job submission, monitoring, and results |
| Hyperparameter Tuning | Search strategies, parameter spaces, scheduling |
| Feature Engineering | Feature sets, transformations, and feature store |
| Model Serving | Deployment, inference, and scaling |
| Model Registry | Versioning, staging, and lifecycle |
| Experiment Tracking | Runs, metrics, comparisons, and artifacts |
| Exploratory Data Analysis | Statistical profiling, distributions, and correlations |