Agent Workflows

Beta - ai-service, ops-agent-service -- Custom agent creation, workflow automation

The Agent Workflows pillar of the MATIH Platform extends conversational analytics beyond ad-hoc queries into structured, reusable AI agent workflows. Users can create custom agents that combine data queries, ML model predictions, external API calls, and human approval steps into automated workflows that run on a schedule or in response to events.

1.1 What Are Agent Workflows?

An agent workflow is a multi-step process orchestrated by MATIH's LangGraph-based AI engine. Unlike a simple query-response interaction, a workflow:

Persists across sessions -- Workflows run independently of user sessions, either on a schedule or in response to events
Combines multiple capabilities -- A single workflow can query data, run ML predictions, generate visualizations, and send notifications
Includes human-in-the-loop steps -- Sensitive operations (model deployments, data modifications) can require human approval before proceeding
Maintains state -- Each workflow execution tracks its state through all steps, enabling pause, resume, and rollback

Workflow vs. Conversation

Aspect	Conversational Query	Agent Workflow
Trigger	User types a question	Schedule, event, or manual trigger
Duration	Seconds to minutes	Minutes to hours
Persistence	Session-scoped (24h TTL)	Permanent until deleted
Complexity	Single query + analysis	Multi-step with branching and loops
Approval	Not required	Configurable approval gates
Notification	Streamed via WebSocket	Email, Slack, Teams, webhook
Reusability	One-off (can be saved as template)	Designed for repeated execution

1.2 Agent Types

MATIH provides several pre-built agent types that can be composed into workflows:

Agent Type	Description	Example Use Case
SQL Agent	Generates and executes SQL queries with full RAG context	"Query monthly revenue by product category"
Analysis Agent	Performs statistical analysis on query results	"Compute year-over-year growth rate and flag anomalies"
Visualization Agent	Generates chart specifications from data	"Create a stacked bar chart of results"
ML Prediction Agent	Invokes deployed ML models for inference	"Run churn prediction on this customer segment"
Data Quality Agent	Checks data quality scores and triggers alerts	"Verify that data freshness meets SLA before proceeding"
Notification Agent	Sends messages via email, Slack, Teams, or webhook	"Send daily summary to the sales team"
Approval Agent	Pauses workflow and requests human approval	"Get manager approval before deploying model to production"
HTTP Agent	Calls external APIs and processes responses	"Fetch exchange rates from external API and enrich data"
Ops Agent	Executes operational tasks (scaling, config changes)	"Scale up Trino workers before heavy batch processing"

1.3 Workflow Definition

Workflows are defined as directed acyclic graphs (DAGs) of agent steps. Each step names the agent to run, its inputs, its task, and the output it produces, as in this example:

Workflow: Daily Revenue Report
  |
  Step 1: SQL Agent
  |  Query: "Revenue by product category for yesterday"
  |  Output: query_results
  |
  Step 2: Analysis Agent
  |  Input: query_results
  |  Task: "Compare to same day last week; highlight anomalies > 2 std dev"
  |  Output: analysis_results
  |
  Step 3: Visualization Agent
  |  Input: query_results, analysis_results
  |  Task: "Bar chart with anomaly annotations"
  |  Output: chart_spec
  |
  Step 4: Conditional Branch
  |  If: any anomaly detected in analysis_results
  |  Then: Step 5 (Alert)
  |  Else: Step 6 (Report)
  |
  Step 5: Notification Agent
  |  Channel: Slack #revenue-alerts
  |  Message: "Revenue anomaly detected: {analysis_results.summary}"
  |  Attachment: chart_spec rendered as PNG
  |
  Step 6: Notification Agent
  |  Channel: Email (daily-report@company.com)
  |  Message: "Daily revenue report attached"
  |  Attachment: Full report with chart and analysis
  |
  Schedule: Daily at 08:00 UTC
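
The definition above can be modeled as a small dependency graph. The following is a minimal sketch, not the MATIH schema: the `Step` class, the `execution_order` function, and the step and agent identifiers are hypothetical names chosen for illustration. It encodes the Daily Revenue Report and uses Python's standard `graphlib` module to confirm that the graph is acyclic and to derive a valid execution order.

```python
from dataclasses import dataclass, field
from graphlib import CycleError, TopologicalSorter


@dataclass
class Step:
    """One node in the workflow graph (hypothetical shape, not the MATIH schema)."""
    name: str
    agent: str
    depends_on: list[str] = field(default_factory=list)


def execution_order(steps: list[Step]) -> list[str]:
    """Return a valid execution order.

    Raises ValueError if a step depends on an unknown step or the graph has a cycle.
    """
    known = {s.name for s in steps}
    for s in steps:
        missing = set(s.depends_on) - known
        if missing:
            raise ValueError(f"{s.name} depends on unknown steps: {sorted(missing)}")
    # TopologicalSorter expects a mapping of node -> predecessors.
    graph = {s.name: set(s.depends_on) for s in steps}
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        raise ValueError(f"workflow is not a DAG: {exc.args[1]}") from exc


# The Daily Revenue Report from the example, with illustrative identifiers.
daily_revenue_report = [
    Step("query", "sql"),
    Step("analyze", "analysis", depends_on=["query"]),
    Step("chart", "visualization", depends_on=["query", "analyze"]),
    Step("branch", "conditional", depends_on=["chart"]),
    Step("alert", "notification", depends_on=["branch"]),
    Step("report", "notification", depends_on=["branch"]),
]

order = execution_order(daily_revenue_report)
```

Validating the graph at definition time, rather than at run time, means a malformed workflow is rejected before it is ever scheduled. The two notification steps both depend on the branch; which of them actually runs is decided by the branch condition, which this sketch does not model.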

Workflow Creation Methods

Workflows can be created through three methods:

Method	Audience	Description
Conversational	All users	Describe the workflow in natural language; the AI generates the workflow definition
Visual builder	BI developers, data engineers	Drag-and-drop workflow builder in the Agentic Workbench
YAML definition	Developers	Define workflows as YAML files, version-controlled in git

1.4 Workflow Execution

Trigger Types

Trigger	Description	Example
Schedule	Cron-based execution	"Run every Monday at 9:00 AM"
Event	Kafka event triggers execution	"Run when data quality alert fires for orders table"
Manual	User-initiated from Agentic Workbench	"Run this workflow now"
API	External system triggers via REST API	"Trigger from CI/CD pipeline after data load completes"
Webhook	Incoming HTTP request triggers execution	"Trigger when Slack slash command received"
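
One way to picture the trigger types is as a registry that maps each registered trigger to the workflow it starts. The sketch below is an assumption about shape only: `TriggerType`, `Trigger`, `Registration`, `workflows_for`, the workflow names, and the Kafka topic name are all hypothetical. The cron expression `0 9 * * 1` is the standard five-field form of "every Monday at 9:00 AM".

```python
from dataclasses import dataclass
from enum import Enum


class TriggerType(Enum):
    SCHEDULE = "schedule"
    EVENT = "event"
    MANUAL = "manual"
    API = "api"
    WEBHOOK = "webhook"


@dataclass(frozen=True)
class Trigger:
    type: TriggerType
    # Meaning depends on the type: a cron expression, a Kafka topic,
    # a webhook path, and so on. Empty for manual and API triggers here.
    source: str = ""


@dataclass(frozen=True)
class Registration:
    workflow: str
    trigger: Trigger


def workflows_for(incoming: Trigger, registry: list[Registration]) -> list[str]:
    """Return the workflows whose registered trigger matches the incoming one."""
    return [r.workflow for r in registry if r.trigger == incoming]


# Hypothetical registrations; names are illustrative only.
registry = [
    Registration("weekly-kpi-report", Trigger(TriggerType.SCHEDULE, "0 9 * * 1")),
    Registration("orders-quality-check", Trigger(TriggerType.EVENT, "data-quality.alerts")),
    Registration("orders-quality-check", Trigger(TriggerType.MANUAL)),
]

matched = workflows_for(Trigger(TriggerType.EVENT, "data-quality.alerts"), registry)
```

A workflow can be registered under more than one trigger, as `orders-quality-check` is here, so the same definition can run both on an event and on demand.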

Execution Engine

Workflows are executed by the ai-service using LangGraph with Temporal as the durable execution backend:

Workflow received -- Trigger event received by ai-service
Temporal workflow started -- Durable workflow initiated with full state tracking
Steps execute sequentially -- Each agent step runs as a Temporal activity with configurable timeout and retry
State checkpointed -- After each step, state is persisted to Temporal's database
Approval gates -- If a step requires approval, workflow pauses and notification is sent
Completion -- Final step executes; results stored and notification sent
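
The checkpoint-and-pause behavior described above can be illustrated without any orchestration library. The sketch below is a plain-Python stand-in, not the ai-service implementation and not the Temporal or LangGraph API: `Execution`, `run`, and the step names are hypothetical, and state is held in memory where a real engine would persist it durably.

```python
from dataclasses import dataclass, field
from typing import Callable

# A step receives the outputs of earlier steps and returns its own output.
StepFn = Callable[[dict], dict]


@dataclass
class Execution:
    """In-memory stand-in for durably persisted workflow state."""
    next_step: int = 0
    outputs: dict = field(default_factory=dict)
    status: str = "running"  # running | awaiting_approval | completed


def run(steps: list[tuple[str, StepFn, bool]], ex: Execution,
        approved: bool = False) -> Execution:
    """Run steps from the last checkpoint; pause before a step that needs approval."""
    ex.status = "running"
    while ex.next_step < len(steps):
        name, fn, needs_approval = steps[ex.next_step]
        if needs_approval and not approved:
            # A real engine would also notify the approver at this point.
            ex.status = "awaiting_approval"
            return ex
        ex.outputs[name] = fn(ex.outputs)  # execute the step
        ex.next_step += 1                  # checkpoint: progress is recorded
        approved = False                   # one approval covers one gate only
    ex.status = "completed"
    return ex


steps = [
    ("query", lambda out: {"rows": 3}, False),
    ("deploy", lambda out: {"deployed": out["query"]["rows"] > 0}, True),
]

ex = run(steps, Execution())        # stops at the approval gate
status_at_gate = ex.status
ex = run(steps, ex, approved=True)  # resumes at "deploy", not at step 0
```

Because `next_step` advances only after a step succeeds, resuming from the stored state never re-runs completed work. That is the property the checkpointing step is meant to provide.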

Fault Tolerance

Failure Scenario	Recovery Mechanism
Agent step fails	Automatic retry with exponential backoff (configurable: 1, 3, or 5 retries)
Service unavailable	Temporal workflow pauses and resumes when service recovers
Approval timeout	Configurable timeout with default action (approve, reject, or escalate)
Data quality check fails	Workflow branches to error handling path
Infrastructure failure	Temporal replays workflow from last checkpoint
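
Retry with exponential backoff, the first recovery mechanism in the table, can be sketched in a few lines. The function names, the 1-second base delay, the doubling factor, and the 60-second cap below are illustrative assumptions, not documented MATIH defaults.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def backoff_delays(max_retries: int, base: float = 1.0,
                   factor: float = 2.0, cap: float = 60.0) -> list[float]:
    """Delay before each retry: base, base*factor, base*factor^2, ... capped at `cap`."""
    return [min(cap, base * factor ** i) for i in range(max_retries)]


def run_with_retry(step: Callable[[], T], max_retries: int = 3,
                   sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `step`, retrying up to `max_retries` times with growing delays.

    The last failure is re-raised once the retries are exhausted.
    """
    delays = backoff_delays(max_retries)
    for attempt in range(max_retries + 1):
        try:
            return step()
        except Exception:
            if attempt == max_retries:
                raise
            sleep(delays[attempt])
    raise AssertionError("unreachable")


# Delays for the three retry settings mentioned in the table.
delays_by_setting = {n: backoff_delays(n) for n in (1, 3, 5)}
```

The `sleep` parameter is injectable so the logic can be exercised without waiting. In a Temporal-backed deployment these values would normally be expressed in the activity's retry policy rather than hand-rolled in application code.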

1.5 Agent Marketplace

Planned - Designed for future release -- agent sharing and discovery

The Agent Marketplace will enable organizations to share and discover pre-built agent workflows:

Marketplace Feature	Description
Template library	Curated collection of workflow templates for common business processes
Custom agents	Organizations can publish custom agents for internal use
Version management	Agents versioned with semantic versioning; consumers can pin to specific versions
Rating and reviews	Users rate and review agents based on effectiveness and reliability
Usage analytics	Track which agents are most used, their success rates, and performance metrics

Pre-Built Workflow Templates

Template	Description	Agents Used
Daily KPI Report	Automated daily report with anomaly detection and distribution	SQL, Analysis, Visualization, Notification
Data Quality Monitor	Continuous quality monitoring with alerting and escalation	Data Quality, Notification, Approval
Model Performance Review	Weekly model performance report with drift analysis	ML Prediction, Analysis, Visualization, Notification
Customer Churn Alert	Real-time churn risk scoring with proactive outreach triggers	SQL, ML Prediction, Notification
Cost Anomaly Detection	Cloud cost monitoring with automated scaling recommendations	SQL, Analysis, Ops, Approval, Notification
Compliance Audit	Automated compliance check with report generation	SQL, Data Quality, Analysis, Notification

1.6 The Ops Agent Service

The ops-agent-service (Python, port 8080) provides operational automation capabilities:

Capability	Description
Infrastructure scaling	Automated scaling of platform components based on workload patterns
Incident response	Automated initial response to common incidents (restart services, clear caches, adjust limits)
Capacity planning	Usage trend analysis with scaling recommendations
Cost optimization	Identify underutilized resources and recommend right-sizing
Runbook automation	Convert operational runbooks into executable agent workflows

The Ops Agent is designed to work alongside human operators, not replace them. It handles routine operational tasks automatically and escalates complex situations to human operators with context and recommendations.
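
The split between automatic handling and escalation can be sketched as a simple triage rule. This is an assumption about how such a rule might look, not the ops-agent-service logic: `ROUTINE_ACTIONS`, `Decision`, and `triage` are hypothetical names, and the routine set simply mirrors the incident-response examples in the table above.

```python
from dataclasses import dataclass

# Mirrors the incident-response examples above; any other action is escalated.
ROUTINE_ACTIONS = {"restart_service", "clear_cache", "adjust_limits"}


@dataclass
class Decision:
    action: str
    automated: bool
    note: str


def triage(action: str, context: str) -> Decision:
    """Handle routine actions automatically; hand everything else to a human."""
    if action in ROUTINE_ACTIONS:
        return Decision(action, True, f"executed automatically: {context}")
    return Decision(action, False, f"escalated to operator with context: {context}")


routine = triage("clear_cache", "query cache hit rate dropped")
complex_case = triage("failover_database", "primary unreachable for 5 minutes")
```

Defaulting to escalation for any action outside the routine set keeps the agent conservative: new or unrecognized situations reach a human operator unless someone explicitly classifies them as safe to automate.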

Deep Dive References

AI Service Agents -- Agent architecture, state management, and custom agent development
AI Service Integrations -- Kafka integration, webhook handling, and external API connectivity
Ops Agent -- Operational automation agent architecture and runbook format
