MATIH Platform is in active MVP development. Documentation reflects current implementation status.

Vector Stores

Vector stores power the Retrieval-Augmented Generation (RAG) pipeline in the MATIH Platform. They store vector embeddings of schema metadata, SQL query examples, business terminology, and documentation, enabling AI services to retrieve relevant context when generating SQL and answering questions.

All vector operations are centralized through the embeddings-service (Java Spring Boot, port 8213), which provides a REST API abstraction over Qdrant with per-user RBAC and tenant isolation. Python AI services connect via EmbeddingsClient (commons-python), not directly to Qdrant.


Architecture

Python AI Services                    Java Service                     Vector DB
─────────────────                    ─────────────                    ──────────
context-graph-service ──┐
ai-service ─────────────┤  httpx     ┌──────────────────┐  REST      ┌────────┐
ml-service ─────────────┼──────────→ │ embeddings-service│ ────────→ │ Qdrant │
copilot-service ────────┤  (via      │ (Java, port 8213) │            │ :6333  │
search-service ─────────┘  commons)  │                    │            └────────┘
                                     │ • @PreAuthorize    │  JDBC      ┌────────┐
                                     │ • tenant isolation │ ────────→ │ PgSQL  │
                                     │ • audit logging    │            │metadata│
                                     │ • Kafka events     │            └────────┘
                                     └──────────────────┘

Vector Store Options

| Technology | Use Case | Deployment |
| --- | --- | --- |
| Qdrant (via embeddings-service) | Production vector search | Kubernetes (Helm chart) |
| LanceDB | Development and testing | Embedded (no server) |

Qdrant

Qdrant is the production vector database, accessed exclusively through the embeddings-service:

| Aspect | Details |
| --- | --- |
| Index type | HNSW (Hierarchical Navigable Small World) |
| Distance metric | Cosine similarity (configurable per collection) |
| Filtering | Payload-based filtering with mandatory tenant ID |
| API | REST and gRPC (accessed via embeddings-service) |
| Multi-tenancy | Mandatory tenant_id filter injected by embeddings-service on every query |
| RBAC | embeddings:read, embeddings:write, embeddings:delete, embeddings:admin |
| Audit | All operations logged via AuditLogger + Kafka events |
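
To illustrate the mandatory tenant filter described above, here is a minimal sketch in Python of how a tenant condition could be merged into a caller-supplied filter. The filter shape (a "must" list of key/match conditions) follows Qdrant's REST filter format; the function name with_tenant_filter is hypothetical, and the actual embeddings-service implements this logic in Java, so treat this as an illustration of the idea rather than the service's code.

```python
from copy import deepcopy
from typing import Any, Optional


def with_tenant_filter(user_filter: Optional[dict[str, Any]], tenant_id: str) -> dict[str, Any]:
    """Return a copy of user_filter that always requires the given tenant_id.

    Hypothetical helper: conditions in "must" are ANDed together, so adding
    the tenant condition restricts results to that tenant regardless of what
    the caller supplied.
    """
    if not tenant_id:
        raise ValueError("tenant_id is required for every vector query")

    # Work on a copy so the caller's filter object is never mutated.
    merged = deepcopy(user_filter) if user_filter else {}
    tenant_condition = {"key": "tenant_id", "match": {"value": tenant_id}}
    merged["must"] = [tenant_condition] + list(merged.get("must", []))
    return merged


caller_filter = {"must": [{"key": "source", "match": {"value": "catalog"}}]}
print(with_tenant_filter(caller_filter, "acme-corp"))
```

Because the tenant condition is added server-side to the ANDed "must" list, a caller cannot widen a query beyond its own tenant, even by supplying a conflicting tenant_id condition of its own.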

Embedding Sources

The RAG pipeline indexes the following content as vector embeddings:

| Source | Indexed Content | Update Frequency |
| --- | --- | --- |
| Catalog metadata | Table names, column names, descriptions, data types | On schema change |
| Query examples | Successful SQL queries with their natural language questions | After each successful query |
| Business terms | Ontology definitions, term relationships | On ontology update |
| Semantic model | Metric definitions, dimension descriptions | On model publish |
| Documentation | Platform and data documentation | On documentation update |

RAG Query Flow

User Question: "What was revenue last quarter?"
  |
  v
Embedding Model: Convert question to vector
  |
  v
Qdrant (via embeddings-service): Search for similar vectors
  | Filter: tenant_id = "acme-corp"
  | Top-K: 5 most similar results
  |
  v
Retrieved Context:
  - Table: orders (columns: amount, order_date, customer_id)
  - Similar query: "SELECT SUM(amount) FROM orders WHERE ..."
  - Metric: revenue = SUM(orders.amount)
  |
  v
SQLAgent: Generate SQL using retrieved context
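
The flow above can be sketched in plain Python. This is a toy, in-memory illustration, not the platform's implementation: the Entry class, the retrieve and build_prompt functions, and the three-dimensional vectors are all made up for the example, and the query vector stands in for the output of a real embedding model. It shows the two steps that matter for retrieval: filter by tenant first, then rank by cosine similarity and keep the top K.

```python
import math
from dataclasses import dataclass


@dataclass
class Entry:
    tenant_id: str
    text: str
    vector: list[float]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vector: list[float], tenant_id: str, entries: list[Entry], top_k: int = 5) -> list[Entry]:
    """Keep only the tenant's entries, then return the top_k most similar."""
    candidates = [e for e in entries if e.tenant_id == tenant_id]
    ranked = sorted(candidates, key=lambda e: cosine(query_vector, e.vector), reverse=True)
    return ranked[:top_k]


def build_prompt(question: str, context: list[Entry]) -> str:
    """Assemble the retrieved context and the question into one prompt string."""
    lines = [f"- {e.text}" for e in context]
    return "Context:\n" + "\n".join(lines) + f"\n\nQuestion: {question}"


# Toy data: three-dimensional vectors instead of real embeddings.
entries = [
    Entry("acme-corp", "Table: orders (columns: amount, order_date, customer_id)", [0.9, 0.1, 0.0]),
    Entry("acme-corp", "Metric: revenue = SUM(orders.amount)", [0.8, 0.2, 0.1]),
    Entry("acme-corp", "Table: employees (columns: name, hire_date)", [0.0, 0.1, 0.9]),
    Entry("other-tenant", "Table: invoices (columns: total)", [0.9, 0.1, 0.0]),
]
question = "What was revenue last quarter?"
query_vector = [0.85, 0.15, 0.05]  # stand-in for the embedding model's output
context = retrieve(query_vector, "acme-corp", entries, top_k=2)
print(build_prompt(question, context))
```

In production the similarity search runs inside Qdrant using its HNSW index rather than a linear scan; the sketch only mirrors the ordering of the steps.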

Collection Structure

| Collection | Content | Embedding Dimension |
| --- | --- | --- |
| schema_metadata | Table and column descriptions | 1536 |
| query_examples | Question-SQL pairs | 1536 |
| business_terms | Ontology definitions | 1536 |
| semantic_models | Metric definitions | 1536 |

Each vector entry includes a payload with tenant ID, creation timestamp, and source metadata.
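
As a rough sketch of what such an entry might look like, the Python below models a vector with its payload and checks it against the collection dimensions listed above. The class name VectorEntry, the payload field names (created_at, source), and the validate function are assumptions made for illustration; the page only states that the payload carries a tenant ID, a creation timestamp, and source metadata.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Dimensions taken from the collection table on this page.
COLLECTION_DIMENSIONS = {
    "schema_metadata": 1536,
    "query_examples": 1536,
    "business_terms": 1536,
    "semantic_models": 1536,
}


@dataclass
class VectorEntry:
    id: str
    vector: list[float]
    tenant_id: str
    source: dict  # hypothetical shape, e.g. {"type": "table", "name": "orders"}
    created_at: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())

    def payload(self) -> dict:
        """Payload stored alongside the vector (field names are assumed)."""
        return {"tenant_id": self.tenant_id, "created_at": self.created_at, "source": self.source}


def validate(collection: str, entry: VectorEntry) -> None:
    """Reject entries with a missing tenant or a wrong vector dimension."""
    if collection not in COLLECTION_DIMENSIONS:
        raise ValueError(f"unknown collection: {collection}")
    expected = COLLECTION_DIMENSIONS[collection]
    if len(entry.vector) != expected:
        raise ValueError(f"{collection} expects {expected} dimensions, got {len(entry.vector)}")
    if not entry.tenant_id:
        raise ValueError("tenant_id is required")


entry = VectorEntry("orders-table", [0.0] * 1536, "acme-corp", {"type": "table", "name": "orders"})
validate("schema_metadata", entry)
print(sorted(entry.payload()))
```

Checking the dimension before upserting catches a mismatched embedding model early, since vectors of the wrong length cannot be stored in a fixed-size collection.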


LanceDB (Development)

LanceDB provides an embedded vector store for development:

| Aspect | Details |
| --- | --- |
| Deployment | Embedded in AI Service process |
| Storage | Local filesystem |
| Index | IVF-PQ for approximate search |
| Multi-tenancy | Separate tables per tenant |

LanceDB requires no additional infrastructure, making it suitable for local development and testing.
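
Since LanceDB isolates tenants with separate tables, a development setup needs some rule for deriving a table name from a tenant and a collection. The sketch below shows one possible rule. The naming scheme and the function tenant_table_name are hypothetical and not taken from the platform; the point is only that the mapping should be deterministic and produce names that are safe to use as table identifiers.

```python
import re


def tenant_table_name(tenant_id: str, collection: str) -> str:
    """Derive a per-tenant table name (hypothetical naming scheme)."""
    # Lowercase the tenant ID and replace anything that is not a-z or 0-9.
    safe_tenant = re.sub(r"[^a-z0-9]+", "_", tenant_id.lower()).strip("_")
    if not safe_tenant:
        raise ValueError("tenant_id must contain at least one letter or digit")
    return f"{safe_tenant}__{collection}"


print(tenant_table_name("acme-corp", "schema_metadata"))
```

Note that this particular scheme maps "acme-corp" and "acme_corp" to the same name, so it is only suitable where tenant IDs are known not to collide after normalization.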


Embeddings Service REST API

The embeddings-service exposes these endpoints (all require JWT authentication):

| Endpoint | Method | Permission | Description |
| --- | --- | --- | --- |
| /api/v1/collections | POST | embeddings:write | Create collection |
| /api/v1/collections | GET | embeddings:read | List tenant collections |
| /api/v1/collections/{name} | DELETE | embeddings:delete | Delete collection |
| /api/v1/vectors/upsert | POST | embeddings:write | Batch upsert vectors |
| /api/v1/vectors/search | POST | embeddings:read | Similarity search |
| /api/v1/vectors/fetch | POST | embeddings:read | Fetch by IDs |
| /api/v1/vectors | DELETE | embeddings:delete | Delete vectors |
| /api/v1/admin/stats | GET | embeddings:admin | Cluster statistics |
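
For callers that cannot use the Python client, a request to the search endpoint might be assembled as in the sketch below. Only the path, the HTTP method, the port, and the JWT requirement come from this page. The host name, the request body field names (collection, vector, top_k), and the use of a Bearer Authorization header are assumptions for illustration; check the service's API contract for the real schema. The example builds the request without sending it.

```python
import json
import urllib.request

# Hypothetical host name; the port is the one documented for embeddings-service.
BASE_URL = "http://embeddings-service:8213"


def build_search_request(token: str, collection: str, vector: list[float], top_k: int = 5) -> urllib.request.Request:
    """Build (but do not send) a similarity search request."""
    # Assumed body field names; the real request schema may differ.
    body = {"collection": collection, "vector": vector, "top_k": top_k}
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/vectors/search",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",  # assumed JWT transport
            "Content-Type": "application/json",
        },
    )


request = build_search_request("<jwt>", "schema_metadata", [0.1, 0.2, 0.3])
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would send it; the caller's token needs the embeddings:read permission for this endpoint.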

Python Client

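import asyncio

from matih_commons.clients.embeddings_client import EmbeddingsClient


async def main(tenant_id: str) -> None:
    client = EmbeddingsClient()
    # Create a collection, upsert vectors into it, then run a similarity search.
    await client.create_collection(tenant_id, "my-coll", vector_size=768)
    await client.upsert(tenant_id, "my-coll", vectors=[...])
    results = await client.search(tenant_id, "my-coll", vector=[0.1, ...], top_k=10)


asyncio.run(main("acme-corp"))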
