Agent Workflows
The Agent Workflows pillar of the MATIH Platform extends the conversational analytics capability beyond ad-hoc queries into structured, reusable AI agent workflows. Users can create custom agents that combine data queries, ML model predictions, external API calls, and human approval steps into automated workflows that run on a schedule or in response to events.
1.1 What Are Agent Workflows?
An agent workflow is a multi-step process orchestrated by MATIH's LangGraph-based AI engine. Unlike a simple query-response interaction, a workflow:
- Persists across sessions -- Workflows run independently of user sessions; they can execute on a schedule or in response to events
- Combines multiple capabilities -- A single workflow can query data, run ML predictions, generate visualizations, and send notifications
- Includes human-in-the-loop steps -- Sensitive operations (model deployments, data modifications) can require human approval before proceeding
- Maintains state -- Each workflow execution tracks its state through all steps, enabling pause, resume, and rollback
Workflow vs. Conversation
| Aspect | Conversational Query | Agent Workflow |
|---|---|---|
| Trigger | User types a question | Schedule, event, or manual trigger |
| Duration | Seconds to minutes | Minutes to hours |
| Persistence | Session-scoped (24h TTL) | Permanent until deleted |
| Complexity | Single query + analysis | Multi-step with branching and loops |
| Approval | Not required | Configurable approval gates |
| Notification | Streamed via WebSocket | Email, Slack, Teams, webhook |
| Reusability | One-off (can be saved as template) | Designed for repeated execution |
1.2 Agent Types
MATIH provides several pre-built agent types that can be composed into workflows:
| Agent Type | Description | Example Use Case |
|---|---|---|
| SQL Agent | Generates and executes SQL queries with full RAG context | "Query monthly revenue by product category" |
| Analysis Agent | Performs statistical analysis on query results | "Compute year-over-year growth rate and flag anomalies" |
| Visualization Agent | Generates chart specifications from data | "Create a stacked bar chart of results" |
| ML Prediction Agent | Invokes deployed ML models for inference | "Run churn prediction on this customer segment" |
| Data Quality Agent | Checks data quality scores and triggers alerts | "Verify that data freshness meets SLA before proceeding" |
| Notification Agent | Sends messages via email, Slack, Teams, or webhook | "Send daily summary to the sales team" |
| Approval Agent | Pauses workflow and requests human approval | "Get manager approval before deploying model to production" |
| HTTP Agent | Calls external APIs and processes responses | "Fetch exchange rates from external API and enrich data" |
| Ops Agent | Executes operational tasks (scaling, config changes) | "Scale up Trino workers before heavy batch processing" |
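1.3 Workflow Definition
Workflows are defined as directed acyclic graphs (DAGs) of agent steps. Each step specifies its agent type, its inputs, the task to perform, and a named output that later steps can consume, as in this example: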
Workflow: Daily Revenue Report
|
Step 1: SQL Agent
| Query: "Revenue by product category for yesterday"
| Output: query_results
|
Step 2: Analysis Agent
| Input: query_results
| Task: "Compare to same day last week; highlight anomalies > 2 std dev"
| Output: analysis_results
|
Step 3: Visualization Agent
| Input: query_results, analysis_results
| Task: "Bar chart with anomaly annotations"
| Output: chart_spec
|
Step 4: Conditional Branch
| If: any anomaly detected in analysis_results
| Then: Step 5 (Alert)
| Else: Step 6 (Report)
|
Step 5: Notification Agent
| Channel: Slack #revenue-alerts
| Message: "Revenue anomaly detected: {analysis_results.summary}"
| Attachment: chart_spec rendered as PNG
|
Step 6: Notification Agent
| Channel: Email (daily-report@company.com)
| Message: "Daily revenue report attached"
| Attachment: Full report with chart and analysis
|
Schedule: Daily at 08:00 UTC
Workflow Creation Methods
Workflows can be created through three methods:
| Method | Audience | Description |
|---|---|---|
| Conversational | All users | Describe the workflow in natural language; the AI generates the workflow definition |
| Visual builder | BI developers, data engineers | Drag-and-drop workflow builder in the Agentic Workbench |
| YAML definition | Developers | Define workflows as YAML files, version-controlled in git |
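Whichever creation method is used, the result is a graph of steps like the Daily Revenue Report shown earlier. The following is a minimal sketch of how such a definition could be held in memory and checked for cycles before it runs. The `Step` fields and the `execution_order` helper are hypothetical names chosen for illustration; they are not MATIH's actual schema or API. The conditional branch is modeled simply as two successor steps of the chart step.

```python
from dataclasses import dataclass, field
from graphlib import CycleError, TopologicalSorter


@dataclass
class Step:
    """One agent step. Field names are illustrative, not MATIH's schema."""
    id: str
    agent: str                       # e.g. "sql", "analysis", "notification"
    task: str = ""
    depends_on: list[str] = field(default_factory=list)


def execution_order(steps: list[Step]) -> list[str]:
    """Return step ids in a valid run order; raise ValueError for bad graphs."""
    ids = {s.id for s in steps}
    for s in steps:
        missing = set(s.depends_on) - ids
        if missing:
            raise ValueError(f"step {s.id!r} depends on unknown steps {sorted(missing)}")
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {s.id: set(s.depends_on) for s in steps}
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        raise ValueError(f"workflow is not acyclic: {exc.args[1]}") from exc


daily_revenue = [
    Step("query", "sql", "Revenue by product category for yesterday"),
    Step("analyze", "analysis", "Compare to same day last week", ["query"]),
    Step("chart", "visualization", "Bar chart with anomaly annotations", ["query", "analyze"]),
    Step("alert", "notification", "Slack #revenue-alerts", ["chart"]),
    Step("report", "notification", "Email daily report", ["chart"]),
]

print(execution_order(daily_revenue))
```

Validating the graph when the definition is saved, rather than at run time, means a malformed workflow is rejected before it is ever scheduled.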
1.4 Workflow Execution
Trigger Types
| Trigger | Description | Example |
|---|---|---|
| Schedule | Cron-based execution | "Run every Monday at 9:00 AM" |
| Event | Kafka event triggers execution | "Run when data quality alert fires for orders table" |
| Manual | User-initiated from Agentic Workbench | "Run this workflow now" |
| API | External system triggers via REST API | "Trigger from CI/CD pipeline after data load completes" |
| Webhook | Incoming HTTP request triggers execution | "Trigger when Slack slash command received" |
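To make the schedule trigger concrete, here is a small sketch of how a cron expression such as `0 9 * * 1` ("every Monday at 9:00 AM") can be matched against a timestamp. It is deliberately simplified and is an assumption about how one might check a schedule, not a description of MATIH's scheduler: it supports only `*`, single numbers, and comma-separated lists, and it requires all five fields to match.

```python
from datetime import datetime


def _field_matches(spec: str, value: int) -> bool:
    """Match one cron field; supports '*', single numbers, and comma lists."""
    if spec == "*":
        return True
    return value in {int(part) for part in spec.split(",")}


def cron_matches(expr: str, when: datetime) -> bool:
    """Return True if the five-field cron expression matches the given minute.

    Simplification: every field must match. Ranges ("1-5") and steps ("*/5")
    are not supported in this sketch.
    """
    minute, hour, day, month, weekday = expr.split()
    cron_weekday = when.isoweekday() % 7    # cron convention: 0 = Sunday ... 6 = Saturday
    return (
        _field_matches(minute, when.minute)
        and _field_matches(hour, when.hour)
        and _field_matches(day, when.day)
        and _field_matches(month, when.month)
        and _field_matches(weekday, cron_weekday)
    )


# 1 January 2024 was a Monday.
print(cron_matches("0 9 * * 1", datetime(2024, 1, 1, 9, 0)))   # True
print(cron_matches("0 9 * * 1", datetime(2024, 1, 2, 9, 0)))   # False (Tuesday)
```

A production scheduler would normally rely on an established cron library rather than a hand-written matcher like this one.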
Execution Engine
Workflows are executed by the ai-service using LangGraph with Temporal as the durable execution backend:
- Trigger received -- The ai-service receives the trigger event
- Temporal workflow started -- A durable workflow is initiated with full state tracking
- Steps execute sequentially -- Each agent step runs as a Temporal activity with a configurable timeout and retry policy
- State checkpointed -- After each step, state is persisted to Temporal's database
- Approval gates -- If a step requires approval, the workflow pauses and a notification is sent
- Completion -- The final step executes; results are stored and a notification is sent
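The lifecycle above can be illustrated with a plain-Python simulation. This sketch does not use the Temporal or LangGraph APIs; the `Execution` class and its fields are hypothetical and exist only to show the idea of running steps in order, checkpointing state after each one, and pausing at an approval gate until someone approves.

```python
from dataclasses import dataclass, field
from typing import Callable

State = dict  # shared workflow state passed from step to step


@dataclass
class Execution:
    # Each step is (name, function, needs_approval).
    steps: list[tuple[str, Callable[[State], State], bool]]
    state: State = field(default_factory=dict)
    next_index: int = 0
    status: str = "pending"
    checkpoints: list[tuple[str, State]] = field(default_factory=list)

    def run(self, approved: bool = False) -> str:
        """Run steps in order until done or until an unapproved gate is reached."""
        while self.next_index < len(self.steps):
            name, fn, needs_approval = self.steps[self.next_index]
            if needs_approval and not approved:
                self.status = "waiting_for_approval"  # a real engine would notify here
                return self.status
            approved = False                          # one approval covers one gate
            self.state = fn(dict(self.state))
            self.checkpoints.append((name, dict(self.state)))  # persist after each step
            self.next_index += 1
        self.status = "completed"
        return self.status


run = Execution(steps=[
    ("query", lambda s: {**s, "rows": 3}, False),
    ("deploy", lambda s: {**s, "deployed": True}, True),
])
print(run.run())                # waiting_for_approval
print(run.run(approved=True))   # completed
```

Because `next_index` and `state` survive between calls, resuming after approval continues from the paused step instead of re-running earlier ones, which is the property a durable backend provides across process restarts.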
Fault Tolerance
| Failure Scenario | Recovery Mechanism |
|---|---|
| Agent step fails | Automatic retry with exponential backoff (configurable to 1, 3, or 5 retries) |
| Service unavailable | Temporal workflow pauses and resumes when service recovers |
| Approval timeout | Configurable timeout with default action (approve, reject, or escalate) |
| Data quality check fails | Workflow branches to error handling path |
| Infrastructure failure | Temporal replays workflow from last checkpoint |
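As an illustration of the first row, the sketch below retries a failing step with exponential backoff. It is a generic pattern, not the platform's retry implementation: the function name, the doubling factor, and the one-second base delay are assumptions. The `sleep` parameter is injectable so the behavior can be exercised without actually waiting.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retry(
    step: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call step(); on failure wait base_delay * 2**attempt, then try again."""
    if max_retries < 0:
        raise ValueError("max_retries must be non-negative")
    for attempt in range(max_retries + 1):
        try:
            return step()
        except Exception:
            if attempt == max_retries:
                raise                         # retries exhausted: surface the error
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ... with the defaults
    raise AssertionError("unreachable")


# Example: a step that fails twice before succeeding.
attempts = {"count": 0}


def flaky_step() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("transient failure")
    return "ok"


delays: list[float] = []
print(run_with_retry(flaky_step, max_retries=3, sleep=delays.append))  # ok
print(delays)                                                          # [1.0, 2.0]
```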
1.5 Agent Marketplace
The Agent Marketplace will enable organizations to share and discover pre-built agent workflows:
| Marketplace Feature | Description |
|---|---|
| Template library | Curated collection of workflow templates for common business processes |
| Custom agents | Organizations can publish custom agents for internal use |
| Version management | Agents versioned with semantic versioning; consumers can pin to specific versions |
| Rating and reviews | Users rate and review agents based on effectiveness and reliability |
| Usage analytics | Track which agents are most used, their success rates, and performance metrics |
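Version pinning can be sketched as follows. The pin syntax used here (a full version such as `1.2.0`, or a prefix such as `1.2` or `1`) is an assumption made for illustration, since the marketplace's actual pin format is not specified above. The sketch handles only plain `MAJOR.MINOR.PATCH` versions, not pre-release or build metadata.

```python
def parse_version(text: str) -> tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH' into a tuple that compares numerically."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def resolve(pin: str, published: list[str]) -> str:
    """Pick the newest published version that matches a pin like '1.2.0', '1.2', or '1'."""
    prefix = tuple(int(part) for part in pin.split("."))
    candidates = [v for v in published if parse_version(v)[: len(prefix)] == prefix]
    if not candidates:
        raise LookupError(f"no published version satisfies pin {pin!r}")
    return max(candidates, key=parse_version)


published = ["1.0.0", "1.2.0", "1.10.1", "2.0.0"]
print(resolve("1", published))      # 1.10.1
print(resolve("1.2", published))    # 1.2.0
```

Comparing parsed tuples rather than raw strings matters: as strings, `"1.10.1"` would sort before `"1.2.0"`.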
Pre-Built Workflow Templates
| Template | Description | Agents Used |
|---|---|---|
| Daily KPI Report | Automated daily report with anomaly detection and distribution | SQL, Analysis, Visualization, Notification |
| Data Quality Monitor | Continuous quality monitoring with alerting and escalation | Data Quality, Notification, Approval |
| Model Performance Review | Weekly model performance report with drift analysis | ML Prediction, Analysis, Visualization, Notification |
| Customer Churn Alert | Real-time churn risk scoring with proactive outreach triggers | SQL, ML Prediction, Notification |
| Cost Anomaly Detection | Cloud cost monitoring with automated scaling recommendations | SQL, Analysis, Ops, Approval, Notification |
| Compliance Audit | Automated compliance check with report generation | SQL, Data Quality, Analysis, Notification |
1.6 The Ops Agent Service
The ops-agent-service (Python, port 8080) provides operational automation capabilities:
| Capability | Description |
|---|---|
| Infrastructure scaling | Automated scaling of platform components based on workload patterns |
| Incident response | Automated initial response to common incidents (restart services, clear caches, adjust limits) |
| Capacity planning | Usage trend analysis with scaling recommendations |
| Cost optimization | Identify underutilized resources and recommend right-sizing |
| Runbook automation | Convert operational runbooks into executable agent workflows |
The Ops Agent is designed to work alongside human operators, not replace them. It handles routine operational tasks automatically and escalates complex situations to human operators with context and recommendations.
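The split between automatic handling and escalation can be pictured as a simple triage function. Everything in this sketch is hypothetical: the incident kinds, the action names, and the `Decision` type are invented for illustration and do not reflect the ops-agent-service's real interface.

```python
from dataclasses import dataclass

# Hypothetical mapping from routine incident kinds to automated actions.
ROUTINE_ACTIONS = {
    "service_unresponsive": "restart_service",
    "cache_stale": "clear_cache",
    "rate_limit_hit": "adjust_limits",
}


@dataclass
class Decision:
    automated: bool   # True if the agent acts on its own
    action: str       # what to do, or "escalate_to_operator"
    context: str      # information handed to the operator or written to the log


def triage(kind: str, details: str) -> Decision:
    """Handle known routine incidents automatically; escalate everything else."""
    action = ROUTINE_ACTIONS.get(kind)
    if action is not None:
        return Decision(True, action, details)
    return Decision(
        False,
        "escalate_to_operator",
        f"Unrecognized incident {kind!r}: {details}",
    )


print(triage("cache_stale", "query cache older than 1h"))
print(triage("disk_corruption", "checksum mismatch on volume"))
```

Defaulting to escalation for anything not explicitly listed keeps the agent conservative: new or ambiguous situations reach a human instead of triggering an automated action.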
Deep Dive References
- AI Service Agents -- Agent architecture, state management, and custom agent development
- AI Service Integrations -- Kafka integration, webhook handling, and external API connectivity
- Ops Agent -- Operational automation agent architecture and runbook format