Platform Capabilities
The MATIH Platform organizes its capabilities into eight pillars. Each pillar represents a cohesive set of features that addresses a specific domain of the data lifecycle. While the pillars can be understood independently, the true power of MATIH lies in their deep integration through the Context Graph, the event-driven architecture, and the unified identity and configuration systems.
This section provides an overview. Each pillar has its own dedicated page with detailed feature descriptions, architecture diagrams, and service mappings.
1.1 Capability Pillars
1.2 Pillar Summary
| Pillar | Description | Key Services | Primary Persona |
|---|---|---|---|
| Conversational Analytics | Natural language to insights via multi-agent AI orchestration | ai-service, query-engine | Business Analyst, all users |
| BI and Dashboards | Dashboard design, visualization, semantic metrics, scheduled reports | bi-service, render-service, semantic-layer | BI Developer |
| ML Platform | Experiment tracking, model training, serving, and monitoring | ml-service, ai-service | ML Engineer |
| Data Engineering | Pipeline orchestration, federated queries, stream processing | pipeline-service, query-engine, catalog-service | Data Engineer |
| Agent Workflows | AI agent creation, workflow automation, agent marketplace | ai-service, ops-agent-service | Agentic User |
| Data Governance | Catalog, lineage, quality monitoring, compliance | catalog-service, governance-service, data-quality-service | Data Engineer, Platform Admin |
| Multi-Tenancy | Tenant isolation, provisioning, customization, resource management | tenant-service, infrastructure-service, iam-service | Platform Admin |
| Observability | Metrics, logging, tracing, alerting, and incident management | observability-api, ops-agent-service | Operations Engineer |
1.3 Cross-Pillar Integration
The pillars do not operate in isolation. The following integration points connect them into a cohesive platform experience:
Context Graph as the Connective Tissue
The Context Graph (powered by Neo4j) maintains a unified knowledge graph of all platform entities and their relationships. It connects:
```
User      --[EXECUTES]-->    Query    --[READS_FROM]-->        Table
Table     --[SOURCED_BY]-->  Pipeline --[PRODUCES]-->          Table
Dashboard --[DISPLAYS]-->    Query    --[USES_METRIC]-->       SemanticMetric
Model     --[TRAINED_ON]-->  Table    --[HAS_QUALITY_SCORE]--> QualityMetric
User      --[BELONGS_TO]-->  Tenant   --[HAS_QUOTA]-->         ResourceLimit
Agent     --[INVOKES]-->     Tool     --[QUERIES]-->           DataSource
```

This graph enables cross-pillar capabilities:
| Cross-Pillar Capability | Pillars Connected | Mechanism |
|---|---|---|
| "What breaks if I drop this column?" | Data Engineering + BI + ML | Graph traversal from Table through Query, Dashboard, and Model nodes |
| "Show me data quality for my model inputs" | ML + Data Governance | Graph traversal from Model through TRAINED_ON edges to Table quality scores |
| "Auto-suggest follow-up questions" | Conversational Analytics + Data Engineering | Graph traversal from current Query to related Tables and common query patterns |
| "Which dashboards use stale data?" | BI + Data Engineering | Graph traversal from Pipeline failures through Table to Dashboard nodes |
| "Alert me when this data changes" | Data Governance + Platform Operations | CDC events trigger quality checks, which trigger notification-service alerts |
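At its core, a capability like "What breaks if I drop this column?" is a reachability traversal over the Context Graph. The sketch below illustrates the idea in plain Python with a hypothetical in-memory edge list; in MATIH the graph lives in Neo4j, and the node names, edge directions, and relation labels here are illustrative stand-ins, not the platform's actual schema (the real READS_FROM and TRAINED_ON edges point from consumer to table, so a production query would traverse them in reverse):

```python
from collections import deque

# Hypothetical slice of the Context Graph as (source, relation, target)
# triples, oriented in the "downstream" direction for easy traversal.
EDGES = [
    ("table:orders",    "READ_BY",  "query:q1"),
    ("query:q1",        "SHOWN_ON", "dashboard:sales"),
    ("table:orders",    "TRAINS",   "model:churn"),
    ("table:customers", "READ_BY",  "query:q2"),
]

def downstream_impact(start: str) -> set[str]:
    """BFS over outgoing edges: every node reachable from `start`
    is something that could break if `start` were dropped."""
    adjacency: dict[str, list[str]] = {}
    for src, _rel, dst in EDGES:
        adjacency.setdefault(src, []).append(dst)
    seen: set[str] = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

print(sorted(downstream_impact("table:orders")))
# ['dashboard:sales', 'model:churn', 'query:q1']
```

The same traversal pattern, parameterized by which relation types to follow, covers the other rows in the table above (quality scores for model inputs, stale-data dashboards, and so on).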
Event Bus as the Integration Backbone
Apache Kafka carries domain events between pillars:
| Event | Producer | Consumers | Cross-Pillar Effect |
|---|---|---|---|
| query.completed | query-engine | ai-service, audit-service, observability-api | Analysis generation, audit logging, metric tracking |
| pipeline.failed | pipeline-service | notification-service, data-quality-service | Alert delivery, quality score update |
| model.deployed | ml-service | ai-service, audit-service, bi-service | Agent awareness, compliance logging, dashboard integration |
| quality.alert | data-quality-service | notification-service, catalog-service | Multi-channel alert, catalog annotation |
| tenant.provisioned | tenant-service | All data plane services | Service initialization for new tenant |
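The fan-out in the table above follows a standard publish/subscribe shape: each event is published once to a topic, and every subscribed service receives its own copy and reacts independently. A minimal in-process sketch, with plain Python callbacks standing in for Kafka topics and consumer groups (the handler bodies are hypothetical, only the event and service names come from the table):

```python
from collections import defaultdict
from typing import Callable

# Topic -> registered consumer callbacks; a toy stand-in for Kafka
# consumer groups, where each group gets every message on the topic.
_subscribers: defaultdict[str, list[Callable[[dict], None]]] = defaultdict(list)

def subscribe(topic: str, handler: Callable[[dict], None]) -> None:
    _subscribers[topic].append(handler)

def publish(topic: str, event: dict) -> list[str]:
    """Deliver the event to every subscriber; return who received it."""
    delivered = []
    for handler in _subscribers[topic]:
        handler(event)
        delivered.append(handler.__name__)
    return delivered

# Per the table, pipeline.failed fans out to two consumers.
def notification_service(event: dict) -> None:
    print(f"alerting on-call: pipeline {event['pipeline_id']} failed")

def data_quality_service(event: dict) -> None:
    print(f"updating quality score for outputs of {event['pipeline_id']}")

subscribe("pipeline.failed", notification_service)
subscribe("pipeline.failed", data_quality_service)

print(publish("pipeline.failed", {"pipeline_id": "daily_orders"}))
```

Because producers never call consumers directly, a new pillar can react to an existing event (for example, a new service subscribing to tenant.provisioned) without any change to the producer.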
1.4 Capability Matrix
The following matrix maps which MATIH services contribute to each major capability:
| Capability | ai-service | query-engine | bi-service | ml-service | catalog-service | pipeline-service | governance-service |
|---|---|---|---|---|---|---|---|
| Natural language queries | Primary | Executes | -- | -- | Schema context | -- | -- |
| Dashboard creation | Assists | Executes | Primary | -- | Metadata | -- | Access policies |
| Model training | Orchestrates | -- | -- | Primary | Feature metadata | Data prep | -- |
| Data lineage | -- | Tracks | Consumes | Consumes | Primary | Contributes | Enforces |
| Data quality | Surfaces | -- | Surfaces | Consumes | Stores | Validates | Enforces |
| Pipeline orchestration | Assists | -- | -- | -- | -- | Primary | -- |
| Compliance | -- | Audited | Audited | Audited | Stores | Audited | Primary |
| Agent workflows | Primary | -- | -- | -- | -- | -- | -- |
1.5 What Is Not in Scope
For clarity, the following capabilities are explicitly outside the scope of the MATIH Platform in its current version:
| Out of Scope | Rationale |
|---|---|
| Data warehouse storage engine | MATIH queries external storage (S3, ADLS, GCS) via Trino; it does not provide its own columnar storage engine |
| ETL/ELT transformation DSL | MATIH orchestrates pipelines but does not replace dedicated tools like dbt for transformation logic |
| Foundation model training | The AI Engine uses external LLM providers or self-hosted models via vLLM; it does not train foundation models |
| Spreadsheet functionality | MATIH is not a replacement for Excel or Google Sheets |
| Application development | MATIH is a data platform, not a general-purpose application development platform |
| Real-time OLTP workloads | MATIH is optimized for analytical workloads (OLAP), not transactional processing |
Detailed Capability Pages
Explore each capability pillar in detail:
- Conversational Analytics -- How the multi-agent orchestrator transforms natural language into executed queries, analyses, and visualizations
- BI and Dashboards -- Dashboard design, the widget system, the semantic layer, and report generation
- ML Platform -- The full ML lifecycle from experiment tracking through model serving with drift detection
- Data Engineering -- Pipeline orchestration, federated queries, stream processing, and data quality
- Agent Workflows -- Creating custom AI agents, workflow automation, and the agent marketplace
- Data Governance -- Data catalog, lineage, classification, masking, and compliance
- Multi-Tenancy -- Tenant isolation, provisioning, customization, and resource management
- Observability -- Metrics, logging, distributed tracing, alerting, and incident management