MATIH Platform is in active MVP development. Documentation reflects current implementation status.

dbt Integration

The AI Service integrates with dbt (data build tool) to import semantic model definitions, metric specifications, and materialization metadata. This integration enriches the Text-to-SQL pipeline with business context from dbt projects, enabling the AI to understand metric definitions, relationships between models, and data freshness.


Integration Architecture

The dbt integration operates through a synchronization pipeline that reads dbt artifacts and loads them into the AI Service schema context:

dbt Project --> manifest.json + catalog.json --> dbt Sync Service --> Schema Context --> RAG Pipeline
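The flow above can be sketched in Python roughly as follows. This is a minimal sketch, not the dbt Sync Service's actual code: `load_artifact`, `build_schema_context`, and the returned dictionary shape are hypothetical, while the `nodes`, `resource_type`, `description`, `columns`, and `type` keys are the ones dbt writes into `manifest.json` and `catalog.json`.

```python
import json
from pathlib import Path


def load_artifact(path):
    """Read a dbt artifact (manifest.json or catalog.json) from disk."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def build_schema_context(manifest, catalog):
    """Merge model docs from the manifest with column types from the catalog.

    The returned shape is a hypothetical stand-in for the Schema Context.
    """
    context = {}
    catalog_nodes = catalog.get("nodes", {})
    for unique_id, node in manifest.get("nodes", {}).items():
        if node.get("resource_type") != "model":
            continue  # skip tests, seeds, snapshots and other node types
        typed_columns = catalog_nodes.get(unique_id, {}).get("columns", {})
        # The catalog may report column names in a different case than the manifest.
        types_by_name = {
            name.lower(): col.get("type") for name, col in typed_columns.items()
        }
        columns = {}
        for name, col in node.get("columns", {}).items():
            columns[name] = {
                "description": col.get("description", ""),
                "type": types_by_name.get(name.lower()),
            }
        context[node["name"]] = {
            "description": node.get("description", ""),
            "columns": columns,
        }
    return context
```

Calling `build_schema_context(load_artifact(manifest_path), load_artifact(catalog_path))` would yield one entry per dbt model, which a downstream indexing step could then embed for retrieval.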

Imported Artifacts

| Artifact | Source File | Usage in AI Service |
|---|---|---|
| Models | manifest.json | Table descriptions, column metadata |
| Metrics | manifest.json (semantic layer) | Business metric definitions |
| Sources | manifest.json | Raw data source mappings |
| Tests | manifest.json | Data quality expectations |
| Documentation | catalog.json | Column descriptions, business glossary |
| Freshness | sources.yml | Data staleness tracking |

Semantic Model Import

dbt semantic layer metrics are imported as first-class objects in the AI Service:

{
  "metric_name": "monthly_recurring_revenue",
  "description": "Sum of recurring revenue for active subscriptions",
  "type": "derived",
  "sql": "SUM(CASE WHEN status = 'active' THEN mrr ELSE 0 END)",
  "dimensions": ["plan_type", "region", "customer_segment"],
  "time_grains": ["day", "week", "month", "quarter"],
  "filters": [
    {"field": "status", "operator": "=", "value": "active"}
  ]
}
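To illustrate how such a metric object might be consumed, the sketch below loads the fields shown above into a dataclass and renders a grouped query. It is an assumption-laden example: the `Metric` class, `render_metric_query`, and the `subscriptions` table name used in the usage note are hypothetical and are not taken from the AI Service code.

```python
from dataclasses import dataclass, field


@dataclass
class Metric:
    """Mirrors the JSON fields of an imported metric."""
    metric_name: str
    description: str
    type: str
    sql: str
    dimensions: list = field(default_factory=list)
    time_grains: list = field(default_factory=list)
    filters: list = field(default_factory=list)


def render_metric_query(metric, table, group_by):
    """Build a SELECT for one metric, grouped by one of its declared dimensions."""
    if group_by not in metric.dimensions:
        raise ValueError(f"{group_by!r} is not a dimension of {metric.metric_name}")
    # Filter values are inlined as quoted literals for illustration only;
    # production code should use bound parameters instead.
    where = " AND ".join(
        f"{f['field']} {f['operator']} '{f['value']}'" for f in metric.filters
    )
    query = f"SELECT {group_by}, {metric.sql} AS {metric.metric_name} FROM {table}"
    if where:
        query += f" WHERE {where}"
    return query + f" GROUP BY {group_by}"
```

With the JSON above parsed into a dictionary `payload`, `render_metric_query(Metric(**payload), "subscriptions", "region")` would produce a query grouped by `region`, and asking for an undeclared dimension raises `ValueError` rather than generating SQL the metric definition does not support.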

Synchronization

The sync process runs on a configurable schedule or can be triggered manually:

Automatic Sync

# Periodic sync every 15 minutes
MATIH_DBT_SYNC_INTERVAL=900
MATIH_DBT_MANIFEST_PATH=/data/dbt/target/manifest.json
MATIH_DBT_CATALOG_PATH=/data/dbt/target/catalog.json
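A scheduler could read these settings and decide when the next sync is due roughly as sketched below. The variable names and defaults are copied from the snippet above; `DbtSyncConfig`, `load_sync_config`, and `sync_is_due` are hypothetical helpers, not part of the AI Service.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class DbtSyncConfig:
    interval_seconds: int
    manifest_path: str
    catalog_path: str


def load_sync_config(env=None):
    """Read sync settings from an environment mapping, falling back to the values shown above."""
    env = os.environ if env is None else env
    return DbtSyncConfig(
        interval_seconds=int(env.get("MATIH_DBT_SYNC_INTERVAL", "900")),
        manifest_path=env.get("MATIH_DBT_MANIFEST_PATH", "/data/dbt/target/manifest.json"),
        catalog_path=env.get("MATIH_DBT_CATALOG_PATH", "/data/dbt/target/catalog.json"),
    )


def sync_is_due(last_sync, now, config):
    """Return True if no sync has run yet or the interval has elapsed (times in seconds)."""
    return last_sync is None or now - last_sync >= config.interval_seconds
```

Passing the environment as a mapping keeps the helper easy to test; in a running service, `now` would typically come from `time.monotonic()`.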

Manual Sync

POST /api/v1/integrations/dbt/sync
{
  "manifest_url": "https://storage.example.com/dbt/manifest.json",
  "catalog_url": "https://storage.example.com/dbt/catalog.json",
  "tenant_id": "acme-corp"
}
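The manual trigger can be scripted with the Python standard library. The sketch below only builds the request; the base URL is a placeholder, `build_sync_request` is a hypothetical helper, and no authentication header is set because the request shown above does not specify one.

```python
import json
import urllib.request


def build_sync_request(base_url, manifest_url, catalog_url, tenant_id):
    """Build (but do not send) the POST request for a manual dbt sync."""
    body = json.dumps(
        {
            "manifest_url": manifest_url,
            "catalog_url": catalog_url,
            "tenant_id": tenant_id,
        }
    ).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/integrations/dbt/sync",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

The returned object can be sent with `urllib.request.urlopen(request)`. The response format is not documented here, so the sketch does not parse it.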

Schema Enrichment

Imported dbt metadata enriches the schema context used by the SQL generator:

| Enrichment | Source | Impact |
|---|---|---|
| Table descriptions | dbt model descriptions | Improves semantic matching in RAG |
| Column descriptions | dbt column docs | More accurate column selection |
| Metric definitions | dbt metrics | Enables metric-aware SQL generation |
| Relationships | dbt refs and relationships | Better JOIN inference |
| Data types | dbt catalog | Type-aware aggregation functions |

Configuration

| Environment Variable | Default | Description |
|---|---|---|
| DBT_INTEGRATION_ENABLED | true | Enable dbt integration |
| DBT_MANIFEST_PATH | /data/dbt/target/manifest.json | Path to manifest |
| DBT_CATALOG_PATH | /data/dbt/target/catalog.json | Path to catalog |
| DBT_SYNC_INTERVAL | 900 | Sync interval in seconds |
| DBT_CLOUD_API_TOKEN | none | dbt Cloud API token (optional) |
| DBT_CLOUD_ACCOUNT_ID | none | dbt Cloud account ID (optional) |

dbt Cloud Integration

For teams using dbt Cloud, the AI Service can pull artifacts directly via the dbt Cloud API:

GET https://cloud.getdbt.com/api/v2/accounts/:account_id/runs/:run_id/artifacts/manifest.json

This removes the need for local file system access. Because the endpoint is scoped to a single run, the AI Service gets the latest production manifest only when the request targets the most recent successful production run.
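As an illustration, the artifact request could be assembled as follows. This is a sketch under assumptions: `build_artifact_request` is a hypothetical helper, the `Token` authorization scheme should be checked against the dbt Cloud API documentation for your account, and how the AI Service picks the `run_id` is not covered here.

```python
import urllib.request
from urllib.parse import quote


def build_artifact_request(account_id, run_id, token,
                           artifact="manifest.json", host="cloud.getdbt.com"):
    """Build (but do not send) a GET request for one artifact of a dbt Cloud run."""
    url = (
        f"https://{host}/api/v2/accounts/{account_id}"
        f"/runs/{run_id}/artifacts/{quote(artifact)}"
    )
    return urllib.request.Request(
        url,
        headers={
            # Assumed auth scheme; the token would come from DBT_CLOUD_API_TOKEN.
            "Authorization": f"Token {token}",
            "Accept": "application/json",
        },
    )
```

Passing `artifact="catalog.json"` would fetch the catalog from the same run, which keeps the manifest and catalog consistent with each other.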