MATIH Platform is in active MVP development. Documentation reflects current implementation status.

Ops Workbench Overview

The Ops Workbench is a Next.js/React application designed for platform operators and SRE teams to monitor, troubleshoot, and manage the MATIH platform. It provides an operations dashboard, observability tools, incident management, and a conversational chat interface for AI-assisted operations.


Application Structure

The Ops Workbench is located at frontend/ops-workbench/ and uses Next.js for server-side rendering. Its source tree is organized as follows:

| Directory | Purpose |
|---|---|
| src/pages/ | Page components for each operational area |
| src/components/ | Reusable operations UI components |
| src/hooks/ | Custom hooks for data fetching and state |
| src/services/ | API client services |
| src/stores/ | Zustand state stores |
| src/types/ | TypeScript type definitions |
| src/utils/ | Utility functions |

Pages

| Page | Component | Route | Description |
|---|---|---|---|
| Dashboard | DashboardPage | /ops/dashboard | Operations overview with key metrics |
| Observability | ObservabilityPage | /ops/observability | Health monitoring, logs, traces |
| Incidents | IncidentsPage | /ops/incidents | Incident tracking and management |
| Chat | ChatPage | /ops/chat | AI-assisted operations chat |
| Alerts | AlertsPage | /ops/alerts | Alert management and configuration |
| Deployments | DeploymentsPage | /ops/deployments | Deployment history and rollbacks |
| Infrastructure | InfrastructurePage | /ops/infrastructure | Cluster resource monitoring |
| Reliability | ReliabilityPage | /ops/reliability | SLO tracking and error budgets |
| Cost | CostPage | /ops/cost | Infrastructure cost analysis |
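
The page-to-route mapping above could be captured in a typed registry, for example to drive navigation menus or breadcrumbs. The following is a minimal sketch, not code from the repository: the `OpsPage` interface, the `OPS_PAGES` constant, and the `findPageByRoute` helper are hypothetical names introduced for illustration, and only the page names, component names, and routes come from the table.

```typescript
// Hypothetical registry mirroring the Pages table; not taken from the codebase.
interface OpsPage {
  name: string;
  component: string;
  route: string;
}

const OPS_PAGES: readonly OpsPage[] = [
  { name: "Dashboard", component: "DashboardPage", route: "/ops/dashboard" },
  { name: "Observability", component: "ObservabilityPage", route: "/ops/observability" },
  { name: "Incidents", component: "IncidentsPage", route: "/ops/incidents" },
  { name: "Chat", component: "ChatPage", route: "/ops/chat" },
  { name: "Alerts", component: "AlertsPage", route: "/ops/alerts" },
  { name: "Deployments", component: "DeploymentsPage", route: "/ops/deployments" },
  { name: "Infrastructure", component: "InfrastructurePage", route: "/ops/infrastructure" },
  { name: "Reliability", component: "ReliabilityPage", route: "/ops/reliability" },
  { name: "Cost", component: "CostPage", route: "/ops/cost" },
];

// Resolve a URL path to its page entry, ignoring any query string,
// hash fragment, and trailing slashes.
function findPageByRoute(path: string): OpsPage | undefined {
  const [pathname = ""] = path.split(/[?#]/);
  const clean = pathname.replace(/\/+$/, "");
  return OPS_PAGES.find((page) => page.route === clean);
}
```

Under these assumptions, `findPageByRoute("/ops/incidents/?tab=open")` would resolve to the Incidents entry, while an unknown path returns `undefined` so the caller can decide how to handle it.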

Technology Stack

| Technology | Version | Purpose |
|---|---|---|
| Next.js | 14.x | React framework with SSR |
| React | 18.x | UI framework |
| TypeScript | 5.x | Type-safe development |
| Tailwind CSS | 3.x | Utility-first styling |
| Zustand | 4.x | State management |
| TanStack Query | 5.x | Server state management |
| Recharts | 2.x | Chart visualizations |

Data Sources

The Ops Workbench aggregates data from multiple observability backends:

| Source | Data | Protocol |
|---|---|---|
| Prometheus | Metrics | PromQL via Observability API |
| Grafana Loki | Logs | LogQL via Observability API |
| Grafana Tempo | Traces | TraceQL via Observability API |
| Kubernetes API | Pod/node status | REST via Infrastructure Service |
| Ops Agent Service | AI operations | REST + WebSocket |
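
Because metrics, logs, and traces are all reached through the Observability API, a client service might route each query by its query language. The sketch below illustrates that idea only. The endpoint paths, parameter names (`query`, `start`, `end`), and the `buildQueryUrl` helper are assumptions made for illustration; the actual Observability API contract is not described in this section.

```typescript
// Query languages listed in the Data Sources table.
type QueryLanguage = "promql" | "logql" | "traceql";

interface ObservabilityQuery {
  language: QueryLanguage;
  query: string;
  startMs: number; // range start, epoch milliseconds
  endMs: number;   // range end, epoch milliseconds
}

// ASSUMPTION: hypothetical endpoint paths, one per backend.
// The real Observability API routes may differ.
const ENDPOINTS: Record<QueryLanguage, string> = {
  promql: "/api/observability/metrics",
  logql: "/api/observability/logs",
  traceql: "/api/observability/traces",
};

// Build a request URL for a query, rejecting empty or inverted time ranges.
function buildQueryUrl(baseUrl: string, q: ObservabilityQuery): string {
  if (q.endMs <= q.startMs) {
    throw new RangeError("endMs must be greater than startMs");
  }
  const base = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  const params = [
    `query=${encodeURIComponent(q.query)}`,
    `start=${q.startMs}`,
    `end=${q.endMs}`,
  ].join("&");
  return `${base}${ENDPOINTS[q.language]}?${params}`;
}
```

Keeping the language-to-endpoint mapping in a single `Record` means TypeScript flags a missing entry if another query language is added to the union later.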

Development

cd frontend/ops-workbench
npm install
npm run dev  # Starts on development port

Detailed Sections

| Section | Content |
|---|---|
| Operations Dashboard | Key metrics, service health, alerts summary |
| Observability and Health | Logs, metrics, traces, health checks |
| Incident Management | Incident lifecycle, postmortems, runbooks |
| Chat Interface | AI-assisted operations conversation |