MATIH Platform is in active MVP development. Documentation reflects current implementation status.
Part I: Foundation

Chapter 1: Introduction to MATIH

A comprehensive introduction to the MATIH Enterprise Platform -- its vision, the problems it solves, the capabilities it delivers, the personas it serves, and the technology stack that powers it. This chapter establishes the conceptual foundation for every subsequent chapter in this documentation.

Learning Objectives

  • Understand the Intent to Insights paradigm and why it matters for modern data organizations
  • Identify the five systemic problems in the modern data stack that MATIH addresses
  • Navigate the six capability pillars: conversational analytics, data engineering, BI, ML, governance, and operations
  • Recognize the six primary personas and their journey maps across platform workbenches
  • Map the technology stack across backend, frontend, data, ML, and infrastructure layers
  • Use the terminology glossary as a reference throughout the rest of this documentation

Details

Estimated Read Time: 3-4 hours (full chapter)
Prerequisites:
  • No prior knowledge of the MATIH Platform required
  • Familiarity with cloud computing concepts is helpful but not required
Related Chapters:
  • Ch. 2: Architecture Deep Dive
  • Ch. 3: Security and Multi-Tenancy
  • Ch. 4: Installation Guide
  • Ch. 5: Quick Start
Production - Platform GA -- 34+ microservices, 55+ Helm charts, 8 frontend workbenches
[Figure: MATIH Platform Architecture. Animated overview showing all platform layers, 34+ services, and 80+ technology integrations. Layers shown: Frontend Workbenches (React / TypeScript / Vite); Control Plane (10 Java Spring Boot 3.2 microservices); Data Plane (13 polyglot microservices in Java, Python, and Node.js); Data & ML Infrastructure; Cloud (Azure, AWS, GCP); Infrastructure; Observability; Data Stores; AI / ML.]

Welcome to the MATIH Enterprise Platform documentation. This chapter provides a thorough introduction to the platform, from its founding vision to the detailed technology choices that bring that vision to life. Whether you are a business analyst evaluating the platform, a data engineer preparing to integrate it, or a platform administrator planning a deployment, this chapter gives you the context you need to proceed confidently.


1.1 What is MATIH?

MATIH is a cloud-agnostic, Kubernetes-native platform that unifies four traditionally siloed disciplines -- Data Engineering, Machine Learning, Artificial Intelligence, and Business Intelligence -- into a single system with a conversational interface at its core.

The platform's founding premise is captured in three words: Intent to Insights. A user expresses an intent (a question about their data, a request for an analysis, a hypothesis to test), and the platform transforms that intent into insights (validated, visualized, and contextualized answers) through a fully automated pipeline of AI agents, query engines, and visualization services.
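
The intent-to-insight flow described above can be pictured as a chain of stages. The following Python sketch is illustrative only: the function names (`plan_query`, `run_query`, `render_insight`), the `Insight` type, the canned SQL, and the sample rows are all hypothetical stand-ins and do not reflect the actual MATIH services or APIs.

```python
from dataclasses import dataclass


@dataclass
class Insight:
    """Hypothetical container for the final, contextualized answer."""
    question: str
    sql: str
    rows: list
    summary: str


def plan_query(intent: str) -> str:
    # Stand-in for the AI agent stage (text-to-SQL); returns canned SQL here.
    return "SELECT region, SUM(amount) AS total FROM sales GROUP BY region"


def run_query(sql: str) -> list:
    # Stand-in for the query engine stage; returns fixed sample rows.
    return [("EMEA", 120), ("APAC", 80)]


def render_insight(intent: str, sql: str, rows: list) -> Insight:
    # Stand-in for the visualization stage; summarizes the largest total.
    top = max(rows, key=lambda row: row[1])
    return Insight(intent, sql, rows, f"{top[0]} leads with {top[1]}")


def intent_to_insight(intent: str) -> Insight:
    # Chain the three stages: intent -> SQL -> rows -> insight.
    sql = plan_query(intent)
    rows = run_query(sql)
    return render_insight(intent, sql, rows)
```

Each stage takes only the output of the previous one, which is the property that lets a platform swap an agent, engine, or renderer without changing the others.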

MATIH is not a single application. It is a platform of platforms, composed of:

Dimension | Details
Control Plane | 10 Java/Spring Boot 3.2 microservices managing identity, tenants, configuration, billing, auditing, and observability
Data Plane | 13 polyglot microservices (Java, Python, Node.js) executing data queries, AI orchestration, ML workflows, pipeline management, and BI operations
Frontend | 8 purpose-built React/TypeScript/Vite workbench applications, each designed for a specific persona
Infrastructure | 55+ Helm charts, multi-cloud Terraform modules (Azure, AWS, GCP), fully Kubernetes-native
AI Engine | LangGraph multi-agent orchestrator with text-to-SQL generation, RAG, and WebSocket streaming
Observability | Full-stack monitoring with Prometheus, Grafana, Loki, Tempo, and OpenTelemetry

Frontend Workbenches: BI Workbench, ML Workbench, Data Workbench, Agentic Workbench, Control Plane UI, Data Plane UI, Ops Workbench, Onboarding UI

Control Plane (Java Spring Boot 3.2): IAM Service, Tenant Service, Config Service, Notification Service, Audit Service, Billing Service, Observability API, Infrastructure Service, API Gateway, Platform Registry

Data Plane (Polyglot): AI Service, Query Engine, BI Service, ML Service, Catalog Service, Pipeline Service, Semantic Layer, Data Quality Service, Render Service, Governance Service, Ontology Service, Ops Agent Service, Data Plane Agent

Data & ML Infrastructure: PostgreSQL 16, Redis 7, Kafka (Strimzi), Neo4j 5.x, Qdrant, Elasticsearch 8, MongoDB 7, Trino, Spark, Flink, Airflow, Ray, vLLM, MLflow, Feast

Cloud Infrastructure: Kubernetes, Helm, Terraform, cert-manager, NGINX Ingress, External Secrets Operator, Prometheus, Grafana, Loki, Tempo

1.2 Chapter Structure

This chapter is organized into nine sections: this overview and the eight sections listed below. Each section can be read independently, but reading them in order provides the most complete understanding.

Section | Description | Best For
Vision and Mission | The founding principles, the conversational paradigm, design philosophy, and strategic roadmap | Everyone -- start here
Problem Space | The five systemic problems in the modern data stack and the value MATIH delivers | Decision makers, architects, evaluators
Platform Capabilities | Deep dive into each of the six capability pillars with technical details | All technical roles
User Personas | Six detailed persona profiles with day-in-life workflows and journey maps | Product managers, UX designers, all users
Technology Stack | Every technology in the platform, organized by layer, with decision rationale | Developers, platform engineers, architects
Key Terminology | Comprehensive glossary of 100+ terms organized by domain | Everyone -- reference throughout
Architecture Preview | High-level architecture teaser connecting this chapter to the deep dive in Chapter 2 | Architects, senior engineers
Getting Started Guide | Role-based reading guide: what to read next based on who you are | Everyone -- read last

1.3 The MATIH Difference

Why Another Data Platform?

The modern data stack has given organizations excellent individual tools: Airflow for orchestration, dbt for transformations, Trino for federated queries, Metabase for BI, MLflow for model tracking. MATIH does not aim to replace these tools. It aims to unify the experience around them and add a conversational AI layer that makes the entire stack accessible to every user in the organization.

The three structural differentiators are:

1. Conversation as the Primary Interface

Most platforms add a chatbot as a secondary feature on top of a traditional UI. MATIH inverts this: the conversational AI interface is the primary interaction model. Every capability in the platform -- from data ingestion to model deployment -- is designed to be accessible through natural language. The workbench UIs exist as power-user tools for fine-grained control, but the conversational path is always the default.

2. Self-Hosted and Cloud-Agnostic

MATIH runs on any Kubernetes cluster, on any cloud provider, or on bare metal. The platform uses no proprietary cloud services in its core architecture. Organizations maintain full control over their data and infrastructure. Cloud-specific integrations (Azure Blob Storage, AWS S3, Google Cloud Storage) are implemented as pluggable adapters behind clean interfaces.
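
To make the adapter idea concrete, here is a minimal Python sketch of a storage interface with one interchangeable implementation. It is an assumption-laden illustration, not MATIH code: the `ObjectStore` protocol, the `InMemoryStore` adapter, the `archive_report` function, and the key layout are all hypothetical names chosen for this example.

```python
from typing import Protocol


class ObjectStore(Protocol):
    """Hypothetical interface that core services would depend on."""

    def put(self, key: str, data: bytes) -> None: ...

    def get(self, key: str) -> bytes: ...


class InMemoryStore:
    """Illustrative adapter; a cloud adapter would wrap a provider SDK instead."""

    def __init__(self) -> None:
        self._objects = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]


def archive_report(store: ObjectStore, tenant_id: str, report: bytes) -> str:
    # Core logic sees only the interface, never a provider-specific client.
    key = f"{tenant_id}/reports/latest"
    store.put(key, report)
    return key
```

Because `archive_report` accepts anything that satisfies `ObjectStore`, an adapter for a given cloud could be substituted without touching the calling code, which is the point of keeping provider integrations behind an interface.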

3. True Multi-Tenancy at Every Layer

Multi-tenancy is not an afterthought. It is woven into every service, every database schema, every API endpoint, and every infrastructure component. Each tenant gets its own Kubernetes namespace, database schemas, DNS zone, TLS certificate, resource quotas, and NGINX ingress controller.
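
One way to picture per-tenant isolation is as a set of resource names derived from a single tenant identifier. The Python sketch below is hypothetical: the `TenantResources` type, the `resources_for` function, the naming patterns, and the `example.com` base domain are assumptions made for illustration, and the platform's real naming rules may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantResources:
    """Hypothetical per-tenant resource names; real naming rules may differ."""
    namespace: str
    db_schema: str
    dns_zone: str
    tls_secret: str


def resources_for(tenant_id: str, base_domain: str = "example.com") -> TenantResources:
    # Normalize the id so every derived name is lowercase and hyphen-separated.
    slug = tenant_id.strip().lower().replace("_", "-")
    return TenantResources(
        namespace=f"tenant-{slug}",
        # SQL identifiers conventionally use underscores rather than hyphens.
        db_schema=f"tenant_{slug.replace('-', '_')}",
        dns_zone=f"{slug}.{base_domain}",
        tls_secret=f"{slug}-tls",
    )
```

Deriving every name from one identifier keeps the tenant boundary easy to audit: any resource whose name does not trace back to a known tenant id stands out.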


1.4 How This Chapter Connects

This introductory chapter establishes the foundation for everything that follows:

  • Chapter 2: Architecture builds directly on the design principles and separation of concerns introduced here, expanding them into detailed service interaction diagrams, data flow patterns, and deployment topologies
  • Chapter 3: Security implements the multi-tenancy and access control concepts described in the capabilities section, with JWT token structures, RBAC models, and network policies
  • Chapters 4-5: Installation and Quick Start deploy the technology stack documented here, using the Helm charts and Terraform modules described in the technology stack section
  • Chapters 6-14: Service Deep Dives provide implementation-level detail for each service referenced in the capabilities and technology sections
  • Chapters 15-20: Frontend, Kubernetes, CI/CD, and Observability cover the operational aspects previewed in this chapter

1.5 Prerequisites

No prior knowledge of the MATIH Platform is required for this chapter. The content is written to be accessible to anyone with a general understanding of software systems. The following background will enhance your understanding but is not strictly necessary:

Topic | Why It Helps | Where It Matters
Cloud computing basics | Understanding deployment models and infrastructure automation | Technology Stack, Architecture Preview
SQL fundamentals | Appreciating the text-to-SQL capability | Conversational Analytics, Data Engineering
Kubernetes concepts | Understanding the deployment and isolation model | Multi-Tenancy, Observability
Machine learning workflow | Following the ML lifecycle capabilities | ML Platform, ML Engineer persona
Microservices architecture | Understanding the service decomposition | All sections

1.6 Reading This Documentation

Conventions

Throughout this documentation, the following conventions are used:

Convention | Meaning
monospace text | Code, commands, file paths, configuration keys, and service names
Bold text | Terms being defined, emphasis, or important notes
Italic text | First use of a term, or emphasis in narrative context
service-name (port) | A MATIH service with its default port number
{tenant_id} | A placeholder for a tenant-specific value
Tables with "Planned" markers | Features that are designed but not yet implemented in the current release

Component Indicators

Each page in this documentation includes an Implementation Status indicator:

Production - Feature is live and stable
Beta - Feature is available but may change
Partially Implemented - Some parts of the feature are available; the rest is still in development
Planned - Feature is designed but not yet built

Cross-References

Links to other sections of this documentation follow the pattern [Chapter Title](/chapter-path). External links are clearly marked. Code references point to actual files in the MATIH codebase.


Let us begin with the Vision and Mission that drives the MATIH Platform.