MATIH Platform is in active MVP development. Documentation reflects current implementation status.
14. Context Graph & Ontology
Semantic Layer

Semantic Layer (WrenAI)

The Semantic Layer is a Java/Spring Boot service (port 8086) that provides business-level abstractions over the physical data model. Integrating with WrenAI, it defines business metrics, dimensions, facts, and relationships that enable the AI Service to generate SQL using business terminology rather than raw table and column names. This section covers the semantic model architecture, WrenAI integration, metric definitions, and query generation.


Architecture

Semantic Layer (Port 8086)
    |
    +-- REST API
    |   +-- Metric definitions
    |   +-- Dimension management
    |   +-- Fact table configuration
    |   +-- Semantic query interface
    |
    +-- WrenAI Integration
    |   +-- MDL (Modeling Definition Language)
    |   +-- Semantic SQL generation
    |   +-- Join path resolution
    |
    +-- Storage
    |   +-- PostgreSQL (metric metadata)
    |   +-- WrenAI Engine (semantic model)
    |
    +-- Integration
        +-- AI Service (metric-aware SQL generation)
        +-- Ontology Service (entity enrichment)
        +-- Data Catalog (table metadata)

Semantic Model

The semantic model provides a business-friendly abstraction:

Metrics

Business metrics define calculated measures:

public class MetricDefinition {
    private String id;
    private String tenantId;
    private String name;              // "Total Revenue"
    private String description;       // "Sum of all order amounts"
    private String expression;        // "SUM(orders.amount)"
    private String factTable;         // "analytics.sales.orders"
    private List<String> dimensions;  // ["region", "product_category", "date"]
    private String timeGranularity;   // "day", "week", "month", "quarter", "year"
    private Map<String, String> filters;  // Default filters
    private String format;            // "currency", "percentage", "number"
    private MetricType type;          // ADDITIVE, SEMI_ADDITIVE, NON_ADDITIVE
}

Dimensions

Dimensions define the axes along which metrics can be analyzed:

public class DimensionDefinition {
    private String id;
    private String tenantId;
    private String name;              // "Region"
    private String description;       // "Geographic sales region"
    private String sourceTable;       // "analytics.geography.regions"
    private String sourceColumn;      // "region_name"
    private DimensionType type;       // CATEGORICAL, TEMPORAL, HIERARCHICAL
    private List<String> hierarchy;   // ["country", "region", "city"]
    private Map<String, String> members;  // Known dimension values
}

Facts

Fact tables define the central event/transaction tables:

public class FactTableDefinition {
    private String id;
    private String tenantId;
    private String name;              // "Orders"
    private String tableName;         // "analytics.sales.orders"
    private String timestampColumn;   // "order_date"
    private List<String> measureColumns;   // ["amount", "quantity", "discount"]
    private List<JoinDefinition> joins;    // Joins to dimension tables
    private String grainDescription;       // "One row per order line item"
}

WrenAI Integration

WrenAI provides the semantic modeling engine:

Modeling Definition Language (MDL)

WrenAI uses MDL to define the semantic model:

{
    "models": [
        {
            "name": "orders",
            "tableReference": {
                "catalog": "analytics",
                "schema": "sales",
                "table": "orders"
            },
            "columns": [
                {
                    "name": "order_id",
                    "type": "integer",
                    "isPrimaryKey": true
                },
                {
                    "name": "amount",
                    "type": "decimal",
                    "isCalculated": false
                },
                {
                    "name": "order_date",
                    "type": "date"
                }
            ],
            "relationships": [
                {
                    "name": "customer",
                    "model": "customers",
                    "joinType": "MANY_TO_ONE",
                    "condition": "orders.customer_id = customers.id"
                }
            ]
        }
    ],
    "metrics": [
        {
            "name": "total_revenue",
            "model": "orders",
            "expression": "SUM(amount)",
            "type": "SIMPLE",
            "timeGrain": "order_date"
        },
        {
            "name": "avg_order_value",
            "model": "orders",
            "expression": "AVG(amount)",
            "type": "SIMPLE"
        }
    ],
    "relationships": [
        {
            "name": "orders_to_customers",
            "models": ["orders", "customers"],
            "joinType": "MANY_TO_ONE",
            "condition": "orders.customer_id = customers.id"
        }
    ]
}

Semantic SQL Generation

WrenAI translates business-level queries into SQL:

Business Query: "Show me total revenue by region for the last quarter"

Semantic Translation:
  Metric: total_revenue (SUM(orders.amount))
  Dimension: region (via orders -> customers -> regions join path)
  Time Filter: last quarter (orders.order_date >= '2025-10-01')

Generated SQL:
  SELECT r.region_name, SUM(o.amount) as total_revenue
  FROM analytics.sales.orders o
  JOIN analytics.crm.customers c ON o.customer_id = c.id
  JOIN analytics.geography.regions r ON c.region_id = r.id
  WHERE o.order_date >= DATE '2025-10-01'
    AND o.order_date < DATE '2026-01-01'
  GROUP BY r.region_name
  ORDER BY total_revenue DESC

Join Path Resolution

The semantic layer automatically resolves join paths between tables:

public class JoinPathResolver {
    /**
     * Find the shortest join path between two tables.
     */
    public List<JoinDefinition> resolve(
        String sourceTable,
        String targetTable,
        SemanticModel model
    ) {
        // Use BFS to find shortest path in the relationship graph
        Queue<List<JoinDefinition>> queue = new LinkedList<>();
        Set<String> visited = new HashSet<>();
 
        // ... BFS traversal ...
 
        return shortestPath;
    }
}

Join Types

Join TypeDescriptionExample
MANY_TO_ONEForeign key relationshipOrders -> Customer
ONE_TO_MANYReverse foreign keyCustomer -> Orders
MANY_TO_MANYVia junction tableProducts to Categories (bidirectional)
ONE_TO_ONEUnique relationshipUser -> Profile

Metric Types

TypeDescriptionAggregation Behavior
ADDITIVECan be summed across all dimensionsRevenue, quantity
SEMI_ADDITIVECan be summed across some dimensionsAccount balance (not across time)
NON_ADDITIVECannot be directly summedAverage price, ratios
DERIVEDCalculated from other metricsProfit margin = revenue - cost

REST API

Metric Operations

POST   /api/v1/semantic/metrics              # Create metric
GET    /api/v1/semantic/metrics              # List metrics
GET    /api/v1/semantic/metrics/{id}          # Get metric
PUT    /api/v1/semantic/metrics/{id}          # Update metric
DELETE /api/v1/semantic/metrics/{id}          # Delete metric

Dimension Operations

POST   /api/v1/semantic/dimensions           # Create dimension
GET    /api/v1/semantic/dimensions           # List dimensions
GET    /api/v1/semantic/dimensions/{id}       # Get dimension

Semantic Query

POST /api/v1/semantic/query
Content-Type: application/json

Request:

{
    "metrics": ["total_revenue", "order_count"],
    "dimensions": ["region", "product_category"],
    "filters": [
        {"dimension": "order_date", "operator": ">=", "value": "2025-10-01"}
    ],
    "sort": [{"metric": "total_revenue", "direction": "DESC"}],
    "limit": 10
}

Response:

{
    "sql": "SELECT r.region_name, p.category, SUM(o.amount) as total_revenue, COUNT(*) as order_count ...",
    "columns": [
        {"name": "region", "type": "string"},
        {"name": "product_category", "type": "string"},
        {"name": "total_revenue", "type": "decimal"},
        {"name": "order_count", "type": "integer"}
    ]
}

Model Management

POST   /api/v1/semantic/models               # Deploy semantic model
GET    /api/v1/semantic/models               # List models
PUT    /api/v1/semantic/models/{id}           # Update model
GET    /api/v1/semantic/models/{id}/validate  # Validate model

AI Service Integration

The AI Service queries the Semantic Layer to enrich SQL generation:

class SemanticLayerClient:
    """Client for Semantic Layer integration."""
 
    async def get_metrics(
        self,
        tenant_id: str,
    ) -> list[dict[str, Any]]:
        """Get available business metrics."""
        response = await self._http.get(
            f"{self._base_url}/api/v1/semantic/metrics",
            headers={"X-Tenant-ID": tenant_id},
        )
        return response.json()
 
    async def resolve_metric(
        self,
        metric_name: str,
        dimensions: list[str],
        filters: list[dict],
        tenant_id: str,
    ) -> str:
        """Resolve a metric query to SQL."""
        response = await self._http.post(
            f"{self._base_url}/api/v1/semantic/query",
            json={
                "metrics": [metric_name],
                "dimensions": dimensions,
                "filters": filters,
            },
            headers={"X-Tenant-ID": tenant_id},
        )
        return response.json()["sql"]

This integration allows the AI Service to understand business terminology like "total revenue" or "customer lifetime value" and generate accurate SQL using the semantic model's metric definitions and join paths.


Multi-Tenancy

Each tenant has an isolated semantic model:

ResourceIsolation
MetricsScoped by tenant_id
DimensionsScoped by tenant_id
Fact tablesScoped by tenant_id
ModelsSeparate WrenAI deployment per tenant
Join pathsTenant-specific relationship definitions