MATIH Platform is in active MVP development. Documentation reflects current implementation status.

Platform Capabilities

Production - Six capability pillars implemented across 34+ microservices

The MATIH Platform organizes its capabilities into six pillars. Each pillar is a cohesive set of features that addresses a specific domain of the data lifecycle. The pillars can be understood independently, but they are tied together by the Context Graph, the event-driven architecture, and the unified identity and configuration systems.

This section provides an overview. Each pillar has its own dedicated page with detailed feature descriptions, architecture diagrams, and service mappings.


1.1 Capability Pillars

Conversational Analytics
Natural Language Queries, Text-to-SQL, Multi-Agent Orchestrator, WebSocket Streaming, RAG, Follow-up Context
Business Intelligence
Dashboard Design, Widget System, Semantic Layer, Scheduled Refresh, PDF Export, Embedding
Machine Learning
Experiment Tracking, Model Registry, Distributed Training, Model Serving, A/B Testing, Drift Detection
Data Engineering
Pipeline Orchestration, Federated Queries, Stream Processing, Data Quality, CDC, Batch Processing
Data Governance
Data Catalog, Lineage Tracking, Classification, Access Control, Compliance, Quality Scores
Platform Operations
Tenant Management, Observability, Configuration, Notifications, Billing, Infrastructure

1.2 Pillar Summary

| Pillar | Description | Key Services | Primary Persona |
|---|---|---|---|
| Conversational Analytics | Natural language to insights via multi-agent AI orchestration | ai-service, query-engine | Business Analyst, all users |
| BI and Dashboards | Dashboard design, visualization, semantic metrics, scheduled reports | bi-service, render-service, semantic-layer | BI Developer |
| ML Platform | Experiment tracking, model training, serving, and monitoring | ml-service, ai-service | ML Engineer |
| Data Engineering | Pipeline orchestration, federated queries, stream processing | pipeline-service, query-engine, catalog-service | Data Engineer |
| Agent Workflows | AI agent creation, workflow automation, agent marketplace | ai-service, ops-agent-service | Agentic User |
| Data Governance | Catalog, lineage, quality monitoring, compliance | catalog-service, governance-service, data-quality-service | Data Engineer, Platform Admin |
| Multi-Tenancy | Tenant isolation, provisioning, customization, resource management | tenant-service, infrastructure-service, iam-service | Platform Admin |
| Observability | Metrics, logging, tracing, alerting, and incident management | observability-api, ops-agent-service | Operations Engineer |

1.3 Cross-Pillar Integration

The pillars do not operate in isolation. The following integration points connect them into a cohesive platform experience:

Context Graph as the Connective Tissue

The Context Graph (powered by Neo4j) maintains a unified knowledge graph of all platform entities and their relationships. It connects:

User --[EXECUTES]--> Query --[READS_FROM]--> Table
Table --[SOURCED_BY]--> Pipeline --[PRODUCES]--> Table
Dashboard --[DISPLAYS]--> Query --[USES_METRIC]--> SemanticMetric
Model --[TRAINED_ON]--> Table --[HAS_QUALITY_SCORE]--> QualityMetric
User --[BELONGS_TO]--> Tenant --[HAS_QUOTA]--> ResourceLimit
Agent --[INVOKES]--> Tool --[QUERIES]--> DataSource
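
As a rough illustration of how these relationships support traversal queries, the following sketch stores a few edges in memory and walks them in reverse to find everything that depends on a table. It is not the platform's Neo4j implementation: the node identifiers (`table:orders`, `query:q1`, and so on), the `DEPENDS_ON` set, and the `impacted_by` helper are hypothetical. Only the relationship types are taken from the listing above.

```python
from collections import defaultdict, deque

# Hypothetical sample edges as (source, relationship, target).
# Relationship types come from the Context Graph listing; node names are made up.
EDGES = [
    ("query:q1", "READS_FROM", "table:orders"),
    ("dashboard:sales", "DISPLAYS", "query:q1"),
    ("model:churn", "TRAINED_ON", "table:orders"),
    ("pipeline:daily_load", "PRODUCES", "table:orders"),
    ("query:q2", "READS_FROM", "table:customers"),
]

# Assumption: for these relationship types the source depends on the target.
DEPENDS_ON = {"READS_FROM", "DISPLAYS", "TRAINED_ON"}


def impacted_by(node, edges=EDGES):
    """Return every node that directly or transitively depends on `node`."""
    # Build a reverse index: target -> sources that depend on it.
    dependents = defaultdict(list)
    for source, rel, target in edges:
        if rel in DEPENDS_ON:
            dependents[target].append(source)

    # Breadth-first walk over the reverse index.
    seen = set()
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for dependent in dependents[current]:
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


if __name__ == "__main__":
    # Queries, dashboards, and models affected by a change to table:orders.
    print(sorted(impacted_by("table:orders")))
```

Here the pipeline that produces `table:orders` is not reported, because a `PRODUCES` edge points downstream rather than expressing a dependency on the table. A production graph database would express the same walk as a variable-length path query instead of an explicit loop.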

This graph enables cross-pillar capabilities:

| Cross-Pillar Capability | Pillars Connected | Mechanism |
|---|---|---|
| "What breaks if I drop this column?" | Data Engineering + BI + ML | Graph traversal from Table through Query, Dashboard, and Model nodes |
| "Show me data quality for my model inputs" | ML + Data Governance | Graph traversal from Model through TRAINED_ON edges to Table quality scores |
| "Auto-suggest follow-up questions" | Conversational Analytics + Data Engineering | Graph traversal from current Query to related Tables and common query patterns |
| "Which dashboards use stale data?" | BI + Data Engineering | Graph traversal from Pipeline failures through Table to Dashboard nodes |
| "Alert me when this data changes" | Data Governance + Platform Operations | CDC events trigger quality checks, which trigger notification-service alerts |

Event Bus as the Integration Backbone

Apache Kafka carries domain events between pillars:

| Event | Producer | Consumers | Cross-Pillar Effect |
|---|---|---|---|
| query.completed | query-engine | ai-service, audit-service, observability-api | Analysis generation, audit logging, metric tracking |
| pipeline.failed | pipeline-service | notification-service, data-quality-service | Alert delivery, quality score update |
| model.deployed | ml-service | ai-service, audit-service, bi-service | Agent awareness, compliance logging, dashboard integration |
| quality.alert | data-quality-service | notification-service, catalog-service | Multi-channel alert, catalog annotation |
| tenant.provisioned | tenant-service | All data plane services | Service initialization for new tenant |
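
The fan-out in this table can be sketched with a small in-memory stand-in for the event bus. This is an illustration only, not a Kafka client: the `InMemoryBus` class, its `publish` method, and the payload fields are hypothetical, while the event names and consumer lists are copied from the table. `tenant.provisioned` is omitted because the table does not enumerate its consumers.

```python
from collections import defaultdict

# Event-to-consumer routing copied from the event table.
SUBSCRIPTIONS = {
    "query.completed": ["ai-service", "audit-service", "observability-api"],
    "pipeline.failed": ["notification-service", "data-quality-service"],
    "model.deployed": ["ai-service", "audit-service", "bi-service"],
    "quality.alert": ["notification-service", "catalog-service"],
}


class InMemoryBus:
    """Hypothetical stand-in for the event bus; no broker is involved."""

    def __init__(self, subscriptions):
        self.subscriptions = subscriptions
        # consumer name -> list of (event, payload) it has received
        self.delivered = defaultdict(list)

    def publish(self, event, payload):
        """Deliver the payload to every consumer subscribed to the event."""
        consumers = self.subscriptions.get(event, [])
        for consumer in consumers:
            self.delivered[consumer].append((event, payload))
        return consumers


if __name__ == "__main__":
    bus = InMemoryBus(SUBSCRIPTIONS)
    # The payload shape is an assumption made for this example.
    notified = bus.publish("pipeline.failed", {"pipeline_id": "daily_load"})
    print(notified)
```

One published event reaches several independent consumers, which is how a single pipeline failure can produce both an alert and a quality score update without the producer knowing about either.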

1.4 Capability Matrix

The following matrix maps which MATIH services contribute to each major capability:

| Capability | ai-service | query-engine | bi-service | ml-service | catalog-service | pipeline-service | governance-service |
|---|---|---|---|---|---|---|---|
| Natural language queries | Primary | Executes | -- | -- | Schema context | -- | -- |
| Dashboard creation | Assists | Executes | Primary | -- | Metadata | -- | Access policies |
| Model training | Orchestrates | -- | -- | Primary | Feature metadata | Data prep | -- |
| Data lineage | -- | Tracks | Consumes | Consumes | Primary | Contributes | Enforces |
| Data quality | Surfaces | -- | Surfaces | Consumes | Stores | Validates | Enforces |
| Pipeline orchestration | Assists | -- | -- | -- | -- | Primary | -- |
| Compliance | -- | Audited | Audited | Audited | Stores | Audited | Primary |
| Agent workflows | Primary | -- | -- | -- | -- | -- | -- |
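
For readers who want to query the matrix programmatically, the sketch below encodes it as a plain dictionary. The data is transcribed from the matrix above, with `None` standing in for a `--` cell. The helper names `roles_for`, `primary_service`, and `capabilities_of` are hypothetical and not part of any MATIH API.

```python
SERVICES = [
    "ai-service", "query-engine", "bi-service", "ml-service",
    "catalog-service", "pipeline-service", "governance-service",
]

# One row per capability, in SERVICES column order; None marks a "--" cell.
MATRIX = {
    "Natural language queries": ["Primary", "Executes", None, None, "Schema context", None, None],
    "Dashboard creation": ["Assists", "Executes", "Primary", None, "Metadata", None, "Access policies"],
    "Model training": ["Orchestrates", None, None, "Primary", "Feature metadata", "Data prep", None],
    "Data lineage": [None, "Tracks", "Consumes", "Consumes", "Primary", "Contributes", "Enforces"],
    "Data quality": ["Surfaces", None, "Surfaces", "Consumes", "Stores", "Validates", "Enforces"],
    "Pipeline orchestration": ["Assists", None, None, None, None, "Primary", None],
    "Compliance": [None, "Audited", "Audited", "Audited", "Stores", "Audited", "Primary"],
    "Agent workflows": ["Primary", None, None, None, None, None, None],
}


def roles_for(capability):
    """Map each contributing service to its role for one capability."""
    return {s: r for s, r in zip(SERVICES, MATRIX[capability]) if r is not None}


def primary_service(capability):
    """Return the service marked Primary, or None if the row has none."""
    for service, role in roles_for(capability).items():
        if role == "Primary":
            return service
    return None


def capabilities_of(service):
    """Map each capability a service contributes to onto its role."""
    column = SERVICES.index(service)
    return {cap: row[column] for cap, row in MATRIX.items() if row[column] is not None}


if __name__ == "__main__":
    print(primary_service("Data lineage"))
    print(capabilities_of("governance-service"))
```

Note that the Data quality row has no service marked Primary, so `primary_service("Data quality")` returns `None`.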

1.5 What Is Not in Scope

For clarity, the following capabilities are explicitly outside the scope of the MATIH Platform in its current version:

| Out of Scope | Rationale |
|---|---|
| Data warehouse storage engine | MATIH queries external storage (S3, ADLS, GCS) via Trino; it does not provide its own columnar storage engine |
| ETL/ELT transformation DSL | MATIH orchestrates pipelines but does not replace dedicated tools like dbt for transformation logic |
| Foundation model training | The AI Engine uses external LLM providers or self-hosted models via vLLM; it does not train foundation models |
| Spreadsheet functionality | MATIH is not a replacement for Excel or Google Sheets |
| Application development | MATIH is a data platform, not a general-purpose application development platform |
| Real-time OLTP workloads | MATIH is optimized for analytical workloads (OLAP), not transactional processing |

Detailed Capability Pages

Explore each capability pillar in detail:

  • Conversational Analytics -- How the multi-agent orchestrator transforms natural language into executed queries, analyses, and visualizations
  • BI and Dashboards -- Dashboard design, the widget system, the semantic layer, and report generation
  • ML Platform -- The full ML lifecycle from experiment tracking through model serving with drift detection
  • Data Engineering -- Pipeline orchestration, federated queries, stream processing, and data quality
  • Agent Workflows -- Creating custom AI agents, workflow automation, and the agent marketplace
  • Data Governance -- Data catalog, lineage, classification, masking, and compliance
  • Multi-Tenancy -- Tenant isolation, provisioning, customization, and resource management
  • Observability -- Metrics, logging, distributed tracing, alerting, and incident management