MATIH Platform is in active MVP development. Documentation reflects current implementation status.
NLP Processing

Status: Production. Covers entity extraction, intent detection, ambiguity resolution, and normalization.

The NLP Processing layer parses natural language questions to extract structured metadata that guides SQL generation. It identifies the query intent, extracts entities (tables, columns, values), detects time references, and resolves ambiguities.


12.3.5.1 QuestionParser

class QuestionParser:
    def parse(self, question: str) -> ParsedQuestion:
        """Parse a natural language question into structured form."""
        return ParsedQuestion(
            intent=self._detect_intent(question),
            entities=self._extract_entities(question),
            confidence=self._calculate_confidence(question),
            time_references=self._extract_time_refs(question),
            aggregations=self._extract_aggregations(question),
            filters=self._extract_filters(question),
        )
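The keyword arguments passed to ParsedQuestion suggest a plain container type. The following is a minimal sketch of what such a container could look like, not the platform's actual definition: the `Entity` class, every type annotation, and the default values are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Hypothetical entity record; the real structure is not documented here."""
    kind: str  # assumed categories, e.g. "table", "column", "value"
    text: str  # the span of the question that matched


@dataclass
class ParsedQuestion:
    """Assumed shape of the parser output, mirroring the keyword arguments of parse()."""
    intent: str
    entities: list[Entity] = field(default_factory=list)
    confidence: float = 0.0
    time_references: list[str] = field(default_factory=list)
    aggregations: list[str] = field(default_factory=list)
    filters: list[str] = field(default_factory=list)


# Building a result by hand for "What is total revenue?"
parsed = ParsedQuestion(
    intent="aggregate",
    entities=[Entity(kind="column", text="revenue")],
    confidence=0.9,
    aggregations=["total"],
)
```

Using `field(default_factory=list)` gives each instance its own empty list, so a question with no filters or time references still yields a well-formed result.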

Intent Types

| Intent | Description | Example |
|---|---|---|
| select | Simple data retrieval | "Show me all customers" |
| aggregate | Aggregation query | "What is total revenue?" |
| compare | Comparison query | "Compare Q3 vs Q4 sales" |
| trend | Time-series analysis | "Show revenue trend for 2024" |
| filter | Filtered query | "Customers in the EMEA region" |
| join | Multi-table query | "Orders with customer details" |
| rank | Ranking query | "Top 10 products by sales" |
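To make the intent types concrete, here is a rough sketch of rule-based intent detection. How `_detect_intent` actually works is not described in this document, so the ordered keyword rules, the `INTENT_RULES` name, and the standalone `detect_intent` function below are all hypothetical; a production classifier may well use a model rather than patterns.

```python
import re

# Hypothetical keyword rules, checked in order; the first match wins.
# These patterns are illustrative and are not taken from the MATIH codebase.
INTENT_RULES = [
    ("rank", re.compile(r"\b(top|bottom)\s+\d+\b", re.I)),
    ("compare", re.compile(r"\b(compare|vs\.?|versus)\b", re.I)),
    ("trend", re.compile(r"\b(trend|over time)\b", re.I)),
    ("aggregate", re.compile(r"\b(total|average|count|maximum|minimum|sum)\b", re.I)),
    ("join", re.compile(r"\bwith\b.*\bdetails\b", re.I)),
    ("filter", re.compile(r"\b(in the|where)\b", re.I)),
]


def detect_intent(question: str) -> str:
    """Return the first intent whose pattern matches, falling back to "select"."""
    for intent, pattern in INTENT_RULES:
        if pattern.search(question):
            return intent
    return "select"


# The example questions listed for each intent type.
examples = {
    "Show me all customers": "select",
    "What is total revenue?": "aggregate",
    "Compare Q3 vs Q4 sales": "compare",
    "Show revenue trend for 2024": "trend",
    "Customers in the EMEA region": "filter",
    "Orders with customer details": "join",
    "Top 10 products by sales": "rank",
}
results = {q: detect_intent(q) for q in examples}
```

Rule order matters in this sketch: more specific intents such as rank and compare are tested before broader ones such as filter, and select acts as the default when nothing else matches.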

Entity Extraction

The parser identifies several entity types:

  • Table references: "orders", "customers", "sales"
  • Column references: "revenue", "order date", "customer name"
  • Value literals: "EMEA", "2024", "$1M"
  • Time references: "last quarter", "this year", "past 30 days"
  • Aggregations: "total", "average", "count", "maximum"
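The entity types above can be illustrated with a simple lexicon lookup. This is a sketch under stated assumptions: the lexicons are hard-coded from the examples in the list, whereas a real deployment would presumably draw table and column names from schema metadata, and the `extract_entities` function and its return shape are hypothetical. Value literals are left out to keep the example short.

```python
import re

# Hypothetical lexicons built from the examples in the list above.
TABLES = {"orders", "customers", "sales"}
COLUMNS = {"revenue", "order date", "customer name"}
AGGREGATIONS = {"total", "average", "count", "maximum"}
# Assumed pattern covering only the three time phrases shown above.
TIME_PATTERN = re.compile(r"\b(last quarter|this year|past \d+ days)\b", re.I)


def extract_entities(question: str) -> dict[str, list[str]]:
    """Find known terms in the question using whole-word, case-insensitive matching."""
    text = question.lower()

    def found(terms: set[str]) -> list[str]:
        # Word boundaries keep "order date" from matching inside "orders".
        return sorted(t for t in terms if re.search(rf"\b{re.escape(t)}\b", text))

    return {
        "tables": found(TABLES),
        "columns": found(COLUMNS),
        "aggregations": found(AGGREGATIONS),
        "time_references": [m.group(0).lower() for m in TIME_PATTERN.finditer(question)],
    }


result = extract_entities("Total revenue from orders in the past 30 days")
```

For this question the sketch finds the table "orders", the column "revenue", the aggregation "total", and the time reference "past 30 days". Exact-match lookup like this cannot handle synonyms or misspellings, which is where the ambiguity resolution step mentioned earlier would come in.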