MATIH Platform is in active MVP development. Documentation reflects current implementation status.
12. AI Service
12.3 Text-to-SQL Pipeline Overview

Status: Production - full NL-to-SQL pipeline with validation, auto-correction, and benchmarking

The Text-to-SQL pipeline converts natural language questions into executable SQL queries. It combines NLP-based question parsing, vector-based schema retrieval, LLM-powered SQL generation, and multi-layer validation with auto-correction to deliver accurate, safe SQL from conversational input.


12.3.1 Pipeline Architecture

Input Processing
    QuestionParser | Entity Extraction | Intent Detection | Ambiguity Resolution
Context Retrieval
    QdrantSchemaStore | Table Matching | Column Discovery | Sample Query Lookup
SQL Generation
    SQLGenerator | EnhancedSQLGenerator | QueryDecomposer | DialectConverter
Validation & Safety
    SQLValidator (sqlglot) | Security Checks | Schema Validation | Auto-Correction
Execution
    Query Engine Client | Result Formatting | Caching | Explanation Generation
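The security checks in the Validation & Safety stage can be illustrated with a deliberately simplified, stdlib-only sketch. The real SQLValidator parses SQL with sqlglot; the keyword blocklist and `basic_security_check` function below are illustrative assumptions, not the service's actual rules:

```python
import re

# Statements a read-only analytics pipeline should never emit:
# anything that mutates data or schema. Illustrative blocklist.
FORBIDDEN_KEYWORDS = {"INSERT", "UPDATE", "DELETE", "DROP", "ALTER", "TRUNCATE", "GRANT"}

def basic_security_check(sql: str) -> list[str]:
    """Return a list of security violations found in sql (empty list = passed)."""
    errors = []
    tokens = set(re.findall(r"[A-Za-z_]+", sql.upper()))
    for kw in sorted(FORBIDDEN_KEYWORDS & tokens):
        errors.append(f"forbidden statement keyword: {kw}")
    if not sql.lstrip().upper().startswith(("SELECT", "WITH")):
        errors.append("only SELECT/WITH queries are allowed")
    return errors
```

A real implementation would work on the parsed AST rather than tokens, so that keywords inside string literals or identifiers are not false positives.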

Pipeline Flow

Natural Language Question
    |
    v
Step 1: QuestionParser.parse()
    |   -> ParsedQuestion (intent, entities, confidence, time_refs, aggregations)
    v
Step 2: QdrantSchemaStore.retrieve_context()
    |   -> RetrievedContext (tables, columns, relationships, semantic_mappings, sample_queries)
    v
Step 3: LLMClient.generate_sql()
    |   -> Raw SQL string
    v
Step 4: SQLValidator.validate()
    |   -> ValidationResult (is_valid, errors, warnings, normalized_sql)
    |
    +-- If invalid and auto_correct:
    |     LLMClient.correct_sql() -> Retry validation (up to max_sql_retries)
    v
Step 5: GeneratedQuery
    |   -> (sql, confidence, tables_used, explanation, status)
    v
GenerationResult(success, query, validation, retries)
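The objects passed between steps can be sketched as dataclasses. Field names follow the flow diagram above; the exact types and defaults are assumptions, not the service's definitions:

```python
from __future__ import annotations
from dataclasses import dataclass, field

# Sketches of the data flowing between pipeline steps (field names from the
# diagram above; types assumed).

@dataclass
class ValidationResult:
    is_valid: bool
    errors: list[str] = field(default_factory=list)
    warnings: list[str] = field(default_factory=list)
    normalized_sql: str | None = None

@dataclass
class GeneratedQuery:
    sql: str
    confidence: float
    tables_used: list[str]
    explanation: str
    status: str

@dataclass
class GenerationResult:
    success: bool
    query: GeneratedQuery | None
    validation: ValidationResult | None
    retries: int = 0
```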

12.3.2 SQLGenerator

The core generator class in data-plane/ai-service/src/sql_generation/generator.py:

class SQLGenerator:
    def __init__(self):
        self.settings = get_settings()
        self.question_parser = QuestionParser()
        self.vector_store = QdrantSchemaStore()
        self.llm_client = LLMClient()
        self.validator = SQLValidator()
 
    async def generate(
        self,
        question: str,
        tenant_id: str,
        catalog: str | None = None,
        dialect: str = "trino",
        validate: bool = True,
        auto_correct: bool = True,
    ) -> GenerationResult:
        # Step 1: Parse question
        parsed_question = self.question_parser.parse(question)
 
        # Step 2: Retrieve schema context
        context = await self.vector_store.retrieve_context(
            tenant_id=tenant_id,
            question=question,
            top_k=self.settings.context_top_k,
        )
 
        # Step 3: Generate SQL via LLM
        sql = await self._generate_sql(question, parsed_question, context, dialect)
 
        # Step 4: Validate and auto-correct
        validation_result = None
        retries = 0
        if validate:
            for attempt in range(self.settings.max_sql_retries + 1):
                validation_result = self.validator.validate(sql, dialect, ...)
                if validation_result.is_valid:
                    break
                if auto_correct and attempt < self.settings.max_sql_retries:
                    error_msg = "; ".join(validation_result.errors)
                    sql = await self.llm_client.correct_sql(sql=sql, error=error_msg, ...)
                    retries += 1
 
        # Step 5: Build result
        return GenerationResult(success=True, query=generated_query, ...)
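The Step 4 validate/auto-correct loop can be exercised in isolation with stubbed collaborators. In this sketch, `validate_fn` and `correct_fn` stand in for SQLValidator.validate and LLMClient.correct_sql; the function name and tuple-based result are illustrative, not the service's API:

```python
# Standalone sketch of the validate/auto-correct retry loop from Step 4,
# with the validator and LLM corrector passed in as plain callables.

def validate_with_retries(sql, validate_fn, correct_fn, max_retries=3, auto_correct=True):
    retries = 0
    result = None
    for attempt in range(max_retries + 1):
        result = validate_fn(sql)               # returns (is_valid, error_msg)
        if result[0]:
            break
        if auto_correct and attempt < max_retries:
            sql = correct_fn(sql, result[1])    # ask the LLM for a corrected query
            retries += 1
    return sql, result, retries
```

With `max_retries = 3` the validator runs at most four times; the final attempt's result is returned even if still invalid, mirroring the loop in `generate()` above.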

12.3.3 Section Pages

Page               | Description
-------------------|----------------------------------------------------------------
Schema Extraction  | Database schema discovery, column metadata enrichment, vector embedding
SQL Generation     | NL-to-SQL conversion, prompt engineering, dialect support
SQL Validation     | Validation with sqlglot/sqlparse, security checks, auto-correction
Query Execution    | SQL execution via Query Engine, result formatting
NLP Processing     | Entity extraction, intent detection, ambiguity resolution
Autocomplete       | Query suggestions, similar queries, templates
Benchmarking       | Spider benchmark, accuracy metrics, regression testing
Conversational SQL | Multi-turn SQL refinement, follow-up queries

12.3.4 Configuration

# SQL Generation settings from src/config/settings.py
max_sql_retries: int = 3              # Auto-correction attempts
sql_validation_enabled: bool = True   # Enable validation
explain_query_enabled: bool = True    # Generate explanations
max_context_tokens: int = 8000        # Context window limit
max_schema_tables: int = 50           # Max tables in context
max_sample_rows: int = 5              # Sample rows per table
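These settings can be modeled as a simple frozen dataclass mirroring the defaults above. The real src/config/settings.py likely uses a settings framework with environment-variable overrides (e.g. pydantic); this stdlib-only sketch just fixes the listed defaults:

```python
from dataclasses import dataclass

# Stdlib sketch of the SQL-generation settings listed above; the actual
# settings module may load these from the environment.

@dataclass(frozen=True)
class SQLGenerationSettings:
    max_sql_retries: int = 3               # auto-correction attempts
    sql_validation_enabled: bool = True    # enable validation
    explain_query_enabled: bool = True     # generate explanations
    max_context_tokens: int = 8000         # context window limit
    max_schema_tables: int = 50            # max tables in context
    max_sample_rows: int = 5               # sample rows per table
```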

12.3.5 Confidence Scoring

The confidence score is calculated from four weighted components:

def _calculate_confidence(self, parsed_question, context, validation):
    # NLP parsing confidence (weight 0.3)
    confidence = parsed_question.confidence * 0.3
    # Schema relevance: mean retrieval score (weight 0.3)
    if context.relevance_scores:
        avg_relevance = sum(context.relevance_scores) / len(context.relevance_scores)
        confidence += avg_relevance * 0.3
    # Best-matching sample query (weight 0.2); score attribute assumed
    if context.sample_queries:
        top_score = context.sample_queries[0].score
        confidence += top_score * 0.2
    # Validation boost (weight 0.2)
    if validation and validation.is_valid:
        confidence += 0.2
    return min(confidence, 1.0)
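The weighting can be checked with representative numbers. The standalone function and the example inputs below are illustrative (the method above operates on the pipeline's own objects):

```python
def calculate_confidence(parse_conf, relevance_scores, top_sample_score, is_valid):
    """Standalone version of the weighted sum above (weights 0.3/0.3/0.2/0.2)."""
    confidence = parse_conf * 0.3
    if relevance_scores:
        confidence += (sum(relevance_scores) / len(relevance_scores)) * 0.3
    if top_sample_score is not None:
        confidence += top_sample_score * 0.2
    if is_valid:
        confidence += 0.2
    return min(confidence, 1.0)

# Example: strong parse (0.9), decent schema relevance (mean 0.8), good
# sample-query match (0.75), valid SQL:
#   0.9*0.3 + 0.8*0.3 + 0.75*0.2 + 0.2 = 0.27 + 0.24 + 0.15 + 0.2 = 0.86
```

Because each component is capped at its weight and the total is clamped to 1.0, a query can only approach full confidence when all four signals are strong.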