MATIH Platform is in active MVP development. Documentation reflects current implementation status.

Chat Interface

The Chat Interface page provides an AI-assisted conversational tool for platform operations. Operators can ask questions about system health, investigate issues, query metrics using natural language, and receive guided troubleshooting assistance powered by the Ops Agent Service.


Capabilities

| Capability | Description | Example Query |
|---|---|---|
| Health queries | Check service status and health | "Is the AI service healthy?" |
| Metric queries | Query Prometheus metrics via natural language | "What is the p95 latency for the query engine?" |
| Log search | Search logs across services | "Show me errors from the AI service in the last hour" |
| Incident assistance | Get troubleshooting guidance | "The AI service is returning 503 errors" |
| Runbook execution | Execute operational runbooks | "Run the connection pool diagnostic" |
| Resource queries | Check resource utilization | "Which pods are using the most memory?" |

Chat Architecture

Operator --> Chat Interface --> Ops Agent Service --> Observability API
                                     |                       |
                                     |              +--------+--------+
                                     |              |        |        |
                                     |          Prometheus  Loki    Tempo
                                     |
                                     +--> Kubernetes API
                                     |
                                     +--> AI Service (LLM)
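
The fan-out in the diagram can be pictured as a small dispatch table. The following is only a sketch: the `QueryKind` categories, the `routeQuery` function, and the mapping itself are assumptions made for illustration, not the Ops Agent Service's actual routing logic. It relies only on the general roles of the backends (Prometheus for metrics, Loki for logs, Tempo for traces).

```typescript
// Backends named in the architecture diagram.
type Backend = "prometheus" | "loki" | "tempo" | "kubernetes" | "llm";

// Hypothetical query categories; the real agent may classify requests differently.
type QueryKind = "metric" | "log" | "trace" | "resource" | "explanation";

// Assumed mapping from query category to the backend that can answer it.
const ROUTES: Record<QueryKind, Backend> = {
  metric: "prometheus",
  log: "loki",
  trace: "tempo",
  resource: "kubernetes",
  explanation: "llm",
};

// In the diagram, these three backends sit behind the Observability API.
const VIA_OBSERVABILITY_API: ReadonlySet<Backend> = new Set<Backend>([
  "prometheus",
  "loki",
  "tempo",
]);

interface Route {
  backend: Backend;
  viaObservabilityApi: boolean;
}

// Resolve which backend handles a query and whether the call goes
// through the Observability API or directly to the backend.
function routeQuery(kind: QueryKind): Route {
  const backend = ROUTES[kind];
  return { backend, viaObservabilityApi: VIA_OBSERVABILITY_API.has(backend) };
}
```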

Message Types

User Messages

Plain text natural language queries about platform operations.

Assistant Responses

Structured responses with embedded data:

| Response Type | Content |
|---|---|
| Text | Natural language explanation |
| Metric chart | Inline time-series visualization |
| Log excerpt | Formatted log lines with highlighting |
| Table | Tabular data (pod list, service status) |
| Action suggestion | Clickable remediation actions |
| Runbook steps | Step-by-step troubleshooting guide |
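
One way to model these response types on the client is a discriminated union, so a renderer can switch on a `type` tag. This is a minimal sketch: the `AssistantBlock` name and every field name below are assumptions for illustration, since the actual payload schema is not specified here.

```typescript
// Hypothetical payload shapes, one per response type listed above.
type AssistantBlock =
  | { type: "text"; text: string }
  | { type: "metric_chart"; metric: string; points: Array<{ t: string; value: number }> }
  | { type: "log_excerpt"; lines: string[]; highlight?: string }
  | { type: "table"; columns: string[]; rows: string[][] }
  | { type: "action_suggestion"; label: string; action: string }
  | { type: "runbook_steps"; steps: string[] };

// Produce a one-line plain-text summary of a block.
// The switch is exhaustive, so adding a new variant without handling it
// becomes a compile-time error under strict settings.
function summarize(block: AssistantBlock): string {
  switch (block.type) {
    case "text":
      return block.text;
    case "metric_chart":
      return `chart of ${block.metric} (${block.points.length} points)`;
    case "log_excerpt":
      return `${block.lines.length} log line(s)`;
    case "table":
      return `table with ${block.rows.length} row(s)`;
    case "action_suggestion":
      return `suggested action: ${block.label}`;
    case "runbook_steps":
      return `runbook with ${block.steps.length} step(s)`;
  }
}
```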

Conversational Context

The chat maintains session context for multi-turn investigations:

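interface OpsSessionContext {
  session_id: string;                        // Identifies the chat session
  services_discussed: string[];              // Services referenced so far in the conversation
  metrics_queried: string[];                 // Metrics already queried in this session
  time_range: { from: string; to: string };  // Time window under investigation
  active_incident?: string;                  // Present only when the session is tied to an incident
}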

This allows follow-up questions without repeating context:

| Turn | Message |
|---|---|
| 1 | "What is the AI service error rate?" |
| 2 | "Show me the logs for those errors" |
| 3 | "Is the database connection pool full?" |
| 4 | "Restart the affected pods" |
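
The way session context can carry a service from one turn to the next is sketched below. The `recordTurn` and `resolveService` helpers are hypothetical, and the "most recently discussed service wins" rule is an assumption rather than documented behavior.

```typescript
interface OpsSessionContext {
  session_id: string;
  services_discussed: string[];
  metrics_queried: string[];
  time_range: { from: string; to: string };
  active_incident?: string;
}

// Append an item only if it is defined and not already present.
const addUnique = (list: string[], item?: string): string[] =>
  item !== undefined && !list.includes(item) ? [...list, item] : list;

// Hypothetical helper: return a new context that records what a turn touched.
function recordTurn(
  ctx: OpsSessionContext,
  turn: { service?: string; metric?: string },
): OpsSessionContext {
  return {
    ...ctx,
    services_discussed: addUnique(ctx.services_discussed, turn.service),
    metrics_queried: addUnique(ctx.metrics_queried, turn.metric),
  };
}

// Hypothetical helper: a follow-up that names no service falls back to the
// most recently discussed one (an assumed resolution rule).
function resolveService(ctx: OpsSessionContext, explicit?: string): string | undefined {
  return explicit ?? ctx.services_discussed[ctx.services_discussed.length - 1];
}
```

Under these assumptions, turn 1 would record the AI service, and turn 2 ("Show me the logs for those errors") would resolve to it without the operator naming it again.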

Tool Execution

The Ops Agent has access to operational tools:

| Tool | Action | Confirmation Required |
|---|---|---|
| query_prometheus | Execute a PromQL query | No |
| search_logs | Search Loki logs | No |
| get_pod_status | List pod status | No |
| get_service_health | Check service health | No |
| describe_pod | Get pod details | No |
| restart_pod | Restart a specific pod | Yes |
| scale_deployment | Change replica count | Yes |
| run_runbook | Execute a runbook | Yes |
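
The tool list can be expressed as a registry that the agent consults before executing anything. This is a sketch only: the `OpsTool` shape and the `requiresConfirmation` helper are assumptions, and treating unknown tools as requiring confirmation is a fail-closed choice made for this example, not documented behavior.

```typescript
interface OpsTool {
  name: string;
  action: string;
  requiresConfirmation: boolean;
}

// Mirrors the tool table above; read-only tools need no confirmation.
const OPS_TOOLS: readonly OpsTool[] = [
  { name: "query_prometheus", action: "Execute a PromQL query", requiresConfirmation: false },
  { name: "search_logs", action: "Search Loki logs", requiresConfirmation: false },
  { name: "get_pod_status", action: "List pod status", requiresConfirmation: false },
  { name: "get_service_health", action: "Check service health", requiresConfirmation: false },
  { name: "describe_pod", action: "Get pod details", requiresConfirmation: false },
  { name: "restart_pod", action: "Restart a specific pod", requiresConfirmation: true },
  { name: "scale_deployment", action: "Change replica count", requiresConfirmation: true },
  { name: "run_runbook", action: "Execute a runbook", requiresConfirmation: true },
];

// Look up whether a tool call must be confirmed by the operator.
// Assumption: unknown tool names are treated as mutating (fail closed).
function requiresConfirmation(toolName: string): boolean {
  const tool = OPS_TOOLS.find((t) => t.name === toolName);
  return tool ? tool.requiresConfirmation : true;
}
```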

Action Confirmation

Destructive or mutating actions require explicit operator confirmation:

interface ActionConfirmation {
  action: string;
  target: string;
  description: string;
  risk_level: 'low' | 'medium' | 'high';
  requires_approval: boolean;
}
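
A confirmation gate built on this interface might look like the sketch below. The `executeIfApproved` function and its result shape are hypothetical, and it assumes `requires_approval` means the action must not run until the operator approves it.

```typescript
interface ActionConfirmation {
  action: string;
  target: string;
  description: string;
  risk_level: 'low' | 'medium' | 'high';
  requires_approval: boolean;
}

type GateResult<T> =
  | { executed: true; result: T }
  | { executed: false; reason: string };

// Hypothetical gate: run the action only when approval is not required
// or the operator has explicitly approved it.
function executeIfApproved<T>(
  confirmation: ActionConfirmation,
  approved: boolean,
  run: () => T,
): GateResult<T> {
  if (confirmation.requires_approval && !approved) {
    return {
      executed: false,
      reason: `${confirmation.action} on ${confirmation.target} was not approved`,
    };
  }
  return { executed: true, result: run() };
}

// Illustrative values only; the pod name and risk level are made up.
const restart: ActionConfirmation = {
  action: 'restart_pod',
  target: 'example-pod',
  description: 'Restart a specific pod',
  risk_level: 'medium',
  requires_approval: true,
};
```

Passing the action as a callback keeps the gate independent of any particular tool, so the same check can wrap `restart_pod`, `scale_deployment`, and `run_runbook`.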

Suggested Queries

The chat interface offers contextual suggestions based on current platform state:

| Context | Suggestions |
|---|---|
| Alert firing | "Investigate the current alert for [service]" |
| High latency | "What is causing latency in [service]?" |
| Pod restarts | "Why is [pod] restarting?" |
| General | "Show platform health summary" |
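
Filling the `[service]` and `[pod]` placeholders from platform state could be done as sketched here. The `PlatformContext` type and `suggestQuery` function are hypothetical; how the interface actually detects each context is not described in this page.

```typescript
// Hypothetical description of the platform state that triggers a suggestion.
type PlatformContext =
  | { kind: "alert_firing"; service: string }
  | { kind: "high_latency"; service: string }
  | { kind: "pod_restarts"; pod: string }
  | { kind: "general" };

// Fill the suggestion template for the given context.
function suggestQuery(ctx: PlatformContext): string {
  switch (ctx.kind) {
    case "alert_firing":
      return `Investigate the current alert for ${ctx.service}`;
    case "high_latency":
      return `What is causing latency in ${ctx.service}?`;
    case "pod_restarts":
      return `Why is ${ctx.pod} restarting?`;
    case "general":
      return "Show platform health summary";
  }
}
```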

WebSocket Connection

The chat uses WebSocket for real-time streaming of assistant responses:

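// Opens a session-scoped WebSocket and exposes a helper for sending messages.
// useWebSocket is assumed to be a project-level hook; its import is not shown in the source.
const useOpsChat = (sessionId: string) => {
  const ws = useWebSocket(`/ws/ops/chat/${sessionId}`);

  // Serialize the operator's text as a 'message' frame.
  const sendMessage = (message: string) => {
    ws.send(JSON.stringify({ type: 'message', content: message }));
  };

  // ws.messages holds the messages received on this connection.
  return { sendMessage, messages: ws.messages };
};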
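
On the receiving side, streamed responses have to be assembled from partial frames. The server-to-client wire format is not documented here, so the sketch below assumes a hypothetical one: `chunk` frames carrying a `message_id` and a text fragment, followed by a `done` frame. The `applyFrame` reducer is likewise illustrative.

```typescript
// Assumed server-to-client frames; the real wire format may differ.
type ServerFrame =
  | { type: "chunk"; message_id: string; content: string }
  | { type: "done"; message_id: string };

interface StreamedMessage {
  id: string;
  content: string;
  complete: boolean;
}

// Pure reducer: fold one raw frame into the message list.
// Note: the JSON is trusted as-is; a production client would validate it.
function applyFrame(messages: StreamedMessage[], raw: string): StreamedMessage[] {
  const frame = JSON.parse(raw) as ServerFrame;
  if (frame.type === "chunk") {
    const exists = messages.some((m) => m.id === frame.message_id);
    if (!exists) {
      // First chunk of a new assistant message.
      return [...messages, { id: frame.message_id, content: frame.content, complete: false }];
    }
    // Append the fragment to the message being streamed.
    return messages.map((m) =>
      m.id === frame.message_id ? { ...m, content: m.content + frame.content } : m,
    );
  }
  // "done": mark the message as fully received.
  return messages.map((m) => (m.id === frame.message_id ? { ...m, complete: true } : m));
}
```

Keeping the reducer pure makes it straightforward to call from a WebSocket `onmessage` handler and to test without a live connection.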

Configuration

| Setting | Default | Description |
|---|---|---|
| Session timeout | 30 minutes | Idle session expiration |
| Max message length | 2000 characters | Maximum user message length |
| Response streaming | Enabled | Stream responses in real-time |
| Action confirmation | Enabled | Require confirmation for mutating actions |
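
These defaults could be captured in a typed configuration object with small helpers that enforce them, as in the sketch below. The `OpsChatConfig` field names and both helper functions are assumptions for illustration; only the default values come from the table above.

```typescript
interface OpsChatConfig {
  sessionTimeoutMs: number;
  maxMessageLength: number;
  responseStreaming: boolean;
  actionConfirmation: boolean;
}

// Defaults taken from the configuration table.
const DEFAULT_OPS_CHAT_CONFIG: OpsChatConfig = {
  sessionTimeoutMs: 30 * 60 * 1000, // 30 minutes
  maxMessageLength: 2000,
  responseStreaming: true,
  actionConfirmation: true,
};

type Validation = { ok: true } | { ok: false; error: string };

// Reject empty or over-long messages before they are sent.
// Note: String.length counts UTF-16 code units, which approximates "characters".
function validateMessage(
  message: string,
  config: OpsChatConfig = DEFAULT_OPS_CHAT_CONFIG,
): Validation {
  if (message.trim().length === 0) {
    return { ok: false, error: "Message is empty" };
  }
  if (message.length > config.maxMessageLength) {
    return { ok: false, error: `Message exceeds ${config.maxMessageLength} characters` };
  }
  return { ok: true };
}

// A session expires once it has been idle for the full timeout.
function isSessionExpired(
  lastActivityMs: number,
  nowMs: number,
  config: OpsChatConfig = DEFAULT_OPS_CHAT_CONFIG,
): boolean {
  return nowMs - lastActivityMs >= config.sessionTimeoutMs;
}
```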