Text-to-SQL Pipeline Overview
Production - Full NL-to-SQL pipeline with validation, auto-correction, and benchmarking
The Text-to-SQL pipeline converts natural language questions into executable SQL queries. It combines NLP-based question parsing, vector-based schema retrieval, LLM-powered SQL generation, and multi-layer validation with auto-correction to deliver accurate, safe SQL from conversational input.
12.3.1 Pipeline Architecture
Input Processing: QuestionParser, Entity Extraction, Intent Detection, Ambiguity Resolution
Context Retrieval: QdrantSchemaStore, Table Matching, Column Discovery, Sample Query Lookup
SQL Generation: SQLGenerator, EnhancedSQLGenerator, QueryDecomposer, DialectConverter
Validation & Safety: SQLValidator (sqlglot), Security Checks, Schema Validation, Auto-Correction
Execution: Query Engine Client, Result Formatting, Caching, Explanation Generation
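As an illustration of how these five stages hand off to one another, the sketch below wires stubbed stand-ins together in the same order. The stage names mirror the list above, but every function body here is a placeholder, not the real implementation:

```python
from dataclasses import dataclass

# Hypothetical stand-ins for the real components; only the stage
# ordering mirrors the architecture above.

@dataclass
class GenerationResult:
    success: bool
    sql: str
    retries: int

def parse(question: str) -> dict:
    # Input Processing: intent detection, entity extraction (stubbed)
    return {"intent": "aggregate", "entities": ["orders"]}

def retrieve_context(question: str) -> dict:
    # Context Retrieval: table/column matching against a schema store (stubbed)
    return {"tables": ["orders"], "columns": ["orders.total"]}

def generate_sql(question: str, parsed: dict, context: dict) -> str:
    # SQL Generation: an LLM call in the real pipeline (stubbed)
    return f"SELECT count(*) FROM {context['tables'][0]}"

def validate(sql: str) -> bool:
    # Validation & Safety: syntax and security checks (stubbed)
    return sql.upper().startswith("SELECT")

def run_pipeline(question: str) -> GenerationResult:
    # Execution of the four preceding stages in order
    parsed = parse(question)
    context = retrieve_context(question)
    sql = generate_sql(question, parsed, context)
    return GenerationResult(success=validate(sql), sql=sql, retries=0)
```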
Pipeline Flow
Natural Language Question
|
v
Step 1: QuestionParser.parse()
| -> ParsedQuestion (intent, entities, confidence, time_refs, aggregations)
v
Step 2: QdrantSchemaStore.retrieve_context()
| -> RetrievedContext (tables, columns, relationships, semantic_mappings, sample_queries)
v
Step 3: LLMClient.generate_sql()
| -> Raw SQL string
v
Step 4: SQLValidator.validate()
| -> ValidationResult (is_valid, errors, warnings, normalized_sql)
|
+-- If invalid and auto_correct:
| LLMClient.correct_sql() -> Retry validation (up to max_sql_retries)
v
Step 5: GeneratedQuery
| -> (sql, confidence, tables_used, explanation, status)
v
GenerationResult(success, query, validation, retries)

12.3.2 SQLGenerator
The core generator class is defined in data-plane/ai-service/src/sql_generation/generator.py:
class SQLGenerator:
    def __init__(self):
        self.settings = get_settings()
        self.question_parser = QuestionParser()
        self.vector_store = QdrantSchemaStore()
        self.llm_client = LLMClient()
        self.validator = SQLValidator()

    async def generate(
        self,
        question: str,
        tenant_id: str,
        catalog: str | None = None,
        dialect: str = "trino",
        validate: bool = True,
        auto_correct: bool = True,
    ) -> GenerationResult:
        # Step 1: Parse question
        parsed_question = self.question_parser.parse(question)

        # Step 2: Retrieve schema context
        context = await self.vector_store.retrieve_context(
            tenant_id=tenant_id,
            question=question,
            top_k=self.settings.context_top_k,
        )

        # Step 3: Generate SQL via LLM
        sql = await self._generate_sql(question, parsed_question, context, dialect)

        # Step 4: Validate and auto-correct
        validation_result = None
        retries = 0
        if validate:
            for attempt in range(self.settings.max_sql_retries + 1):
                validation_result = self.validator.validate(sql, dialect, ...)
                if validation_result.is_valid:
                    break
                if auto_correct and attempt < self.settings.max_sql_retries:
                    error_msg = "; ".join(validation_result.errors)
                    sql = await self.llm_client.correct_sql(sql=sql, error=error_msg, ...)
                    retries += 1

        # Step 5: Build result
        return GenerationResult(success=True, query=generated_query, ...)

12.3.3 Section Pages
| Page | Description |
|---|---|
| Schema Extraction | Database schema discovery, column metadata enrichment, vector embedding |
| SQL Generation | NL-to-SQL conversion, prompt engineering, dialect support |
| SQL Validation | Validation with sqlglot/sqlparse, security checks, auto-correction |
| Query Execution | SQL execution via Query Engine, result formatting |
| NLP Processing | Entity extraction, intent detection, ambiguity resolution |
| Autocomplete | Query suggestions, similar queries, templates |
| Benchmarking | Spider benchmark, accuracy metrics, regression testing |
| Conversational SQL | Multi-turn SQL refinement, follow-up queries |
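As a flavor of the security checks listed under SQL Validation, here is a deliberately simplified, stdlib-only stand-in. The real validator parses the SQL with sqlglot rather than matching keywords, so treat this only as an illustration of the deny-list idea:

```python
import re

# Simplified stand-in for the security layer of SQLValidator.
# The actual implementation inspects a sqlglot AST; this sketch
# only demonstrates the deny-list concept with keyword matching.

FORBIDDEN = re.compile(
    r"\b(DROP|DELETE|UPDATE|INSERT|ALTER|TRUNCATE|GRANT|REVOKE)\b",
    re.IGNORECASE,
)

def security_check(sql: str) -> list[str]:
    """Return a list of security errors (empty means the query passed)."""
    errors = []
    if match := FORBIDDEN.search(sql):
        errors.append(f"Forbidden statement keyword: {match.group(1).upper()}")
    # Reject multi-statement input: any ';' other than a trailing one
    if ";" in sql.rstrip().rstrip(";"):
        errors.append("Multiple statements are not allowed")
    return errors
```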
12.3.4 Configuration
# SQL Generation settings from src/config/settings.py
max_sql_retries: int = 3 # Auto-correction attempts
sql_validation_enabled: bool = True # Enable validation
explain_query_enabled: bool = True # Generate explanations
max_context_tokens: int = 8000 # Context window limit
max_schema_tables: int = 50 # Max tables in context
max_sample_rows: int = 5 # Sample rows per table

Confidence Scoring
The confidence score is calculated from four weighted components:
def _calculate_confidence(self, parsed_question, context, validation):
    confidence = parsed_question.confidence * 0.3  # NLP parsing confidence
    if context.relevance_scores:
        avg_relevance = sum(context.relevance_scores) / len(context.relevance_scores)
        confidence += avg_relevance * 0.3          # Schema relevance
    if context.sample_queries:
        top_score = context.sample_queries[0].score  # Best-matching sample query
        confidence += top_score * 0.2              # Similar query match
    if validation and validation.is_valid:
        confidence += 0.2                          # Validation boost
    return min(confidence, 1.0)