AI Service Architecture
The AI Service is the conversational analytics engine and the largest single service in the platform. It implements a multi-agent orchestrator using a LangGraph-style state machine that coordinates specialized agents for intent classification, SQL generation, result analysis, and visualization recommendation. The service spans 65+ Python modules organized across agents, BI analytics, context graphs, LLM management, and storage layers.
2.4.B.1 Module Architecture
Key Module Areas
| Module Area | Files | Purpose |
|---|---|---|
| agents/ | Orchestrator, router, SQL, analysis, viz, docs, approval, drift detection, hallucination | Multi-agent state machine and specialized agents |
| bi/ | Dashboard routes, service, repository, orchestrator | BI analytics with AI-powered insights |
| context_graph/ | Dgraph storage, semantic features, thinking service, pattern/feedback/trace stores | Knowledge graph for contextual intelligence |
| llm/ | Router, cache, context intelligence, infrastructure, performance, validation | LLM provider abstraction and optimization |
| session/ | PostgreSQL-backed session store | Conversation state management |
| config/ | Settings, tenant configs | Service configuration |
| storage/ | Database pool, connection management | PostgreSQL connection management |
| prompt_ab_testing/ | Repository, experiment framework | A/B testing for prompt optimization |
2.4.B.2 Agent Orchestrator
The orchestrator coordinates six specialized agents through a state machine:
```python
# From orchestrator.py - Agent state machine
from enum import Enum

class AgentState(Enum):
    START = "start"
    ROUTE = "route"
    GENERATE_SQL = "generate_sql"
    EXECUTE_QUERY = "execute_query"
    ANALYZE = "analyze"
    VISUALIZE = "visualize"
    DOCUMENTATION = "documentation"
    APPROVAL = "approval"
    RESPOND = "respond"
    ERROR = "error"
    END = "end"
```

State Machine Flow
```
START
  |
  v
ROUTE (RouterAgent classifies intent)
  |
  +---> SQL_QUERY ------> GENERATE_SQL (SQLAgent)
  |                           |
  |                    [approval needed?]
  |                     yes --> APPROVAL (ApprovalAgent)
  |                     no ---> EXECUTE_QUERY (via query-engine)
  |                           |
  |                           v
  |                       ANALYZE (AnalysisAgent)
  |                           |
  |                           v
  |                       VISUALIZE (VizAgent)
  |                           |
  |                           v
  |                       RESPOND
  |
  +---> ANALYSIS -------> ANALYZE (AnalysisAgent)
  |                           |
  |                           v
  |                       RESPOND
  |
  +---> DOCUMENTATION --> DOCS (DocumentationAgent)
  |                           |
  |                           v
  |                       RESPOND
  |
  v
END (session updated, events published)
```

Agent Responsibilities
| Agent | Input | Output | LLM Usage |
|---|---|---|---|
| RouterAgent | User message + conversation history | Intent classification (SQL, analysis, docs) | Classification prompt |
| SQLAgent | User question + schema context + semantic model | SQL query + explanation | Text-to-SQL prompt with RAG |
| AnalysisAgent | Query results + user question | Statistical insights, trends, anomalies | Analytical reasoning prompt |
| VisualizationAgent | Data shape + analysis results | Chart type + configuration JSON | Visualization recommendation |
| DocumentationAgent | User question + schema metadata | Documentation response | Schema explanation prompt |
| ApprovalAgent | Generated SQL + governance policies | Approval/rejection decision | Policy evaluation |
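The routing step above can be sketched as a dispatch table: the RouterAgent's intent classification selects which sequence of states the orchestrator walks, with the approval gate inserted for governed SQL. This is a minimal illustration inferred from the flow diagram; the `Intent` values and `plan_states` helper are hypothetical, not the service's actual API.

```python
from enum import Enum

class Intent(Enum):
    SQL_QUERY = "sql_query"
    ANALYSIS = "analysis"
    DOCUMENTATION = "documentation"

# Each intent maps to the ordered list of states it walks through.
PIPELINES = {
    Intent.SQL_QUERY: ["generate_sql", "execute_query", "analyze", "visualize", "respond"],
    Intent.ANALYSIS: ["analyze", "respond"],
    Intent.DOCUMENTATION: ["documentation", "respond"],
}

def plan_states(intent: Intent, approval_needed: bool = False) -> list[str]:
    """Return the state sequence for a classified intent."""
    states = list(PIPELINES[intent])
    if intent is Intent.SQL_QUERY and approval_needed:
        # Governance gate: APPROVAL runs between SQL generation and execution.
        states.insert(1, "approval")
    return states
```

For example, `plan_states(Intent.SQL_QUERY, approval_needed=True)` yields the governed path through APPROVAL before EXECUTE_QUERY.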
2.4.B.3 Session Management
Conversation sessions are stored in PostgreSQL (with Redis caching for active sessions):
```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class ConversationSession:
    session_id: str
    tenant_id: str
    user_id: str | None
    state: AgentState          # Current state machine position
    messages: list[Message]    # Conversation history
    context: dict              # Accumulated context (schemas, results)
    created_at: datetime
    last_active: datetime
    message_count: int
    metadata: dict             # LLM tokens consumed, latencies
```

Session storage hierarchy:
- Redis -- Active session cache (TTL: 2 hours from last activity)
- PostgreSQL -- Persistent session storage (retained for analytics)
- In-memory fallback -- Used when both Redis and PostgreSQL are unavailable
2.4.B.4 LLM Provider Abstraction
The LLMRouter abstracts multiple LLM providers behind a unified interface:
| Provider | Models | Use Case |
|---|---|---|
| OpenAI | GPT-4o, GPT-4-turbo, GPT-3.5-turbo | Primary provider |
| Azure OpenAI | Same models via Azure endpoints | Enterprise compliance |
| vLLM | Self-hosted open-source models | On-premises deployment |
| Anthropic | Claude 3.5 Sonnet | Alternative provider |
The router selects the provider based on:
- Tenant configuration (preferred provider)
- Task type (SQL generation may use a different model than analysis)
- Cost optimization (route simple tasks to cheaper models)
- Fallback chain (if primary is unavailable, try secondary)
Prompt A/B Testing
The prompt_ab_testing module enables controlled experiments on prompt variations:
```python
# Experiment configuration
experiment = PromptExperiment(
    name="sql_generation_v3",
    variants=[
        PromptVariant("control", weight=0.5, template="prompts/sql_v2.txt"),
        PromptVariant("treatment", weight=0.5, template="prompts/sql_v3.txt"),
    ],
    metrics=["sql_accuracy", "execution_success", "user_satisfaction"],
    tenant_filter=None,  # All tenants
)
```

2.4.B.5 Context Graph Integration
The AI service integrates with Dgraph for contextual intelligence:
| Feature | Storage | Purpose |
|---|---|---|
| Schema context | Dgraph | Graph of tables, columns, relationships for RAG |
| Query patterns | PostgreSQL | Successful query patterns for few-shot learning |
| User feedback | PostgreSQL | Thumbs up/down on AI responses |
| Thinking traces | PostgreSQL | Agent reasoning chains for debugging |
| Semantic embeddings | Qdrant | Vector embeddings of schema and queries |
The context graph enables the AI service to understand not just the schema structure, but the semantic relationships between data entities, improving text-to-SQL accuracy.
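The retrieval step behind this can be illustrated with a toy similarity ranking: embed the user's question, score it against stored schema embeddings, and feed the top matches into the text-to-SQL prompt as RAG context. In the real service the vectors live in Qdrant and come from an embedding model; the pure-Python cosine ranking below is only a sketch of the idea, and both function names are hypothetical.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def top_schema_context(question_vec: list[float],
                       table_vecs: dict[str, list[float]],
                       k: int = 3) -> list[str]:
    """Return the k table names whose embeddings best match the question."""
    ranked = sorted(table_vecs,
                    key=lambda t: cosine(question_vec, table_vecs[t]),
                    reverse=True)
    return ranked[:k]
```

A vector store replaces the linear scan with an approximate-nearest-neighbor index, but the ranking contract is the same.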
2.4.B.6 Key APIs
| Endpoint | Method | Description |
|---|---|---|
| /api/v1/ai/chat | POST | Send a conversational message |
| /api/v1/ai/chat/stream | POST | Streaming chat response (SSE) |
| /api/v1/ai/sessions | GET | List active sessions |
| /api/v1/ai/sessions/{id} | GET/DELETE | Session management |
| /api/v1/ai/feedback | POST | Submit response feedback |
| /api/v1/ai/models | GET | Available LLM models |
| /api/v1/ai/quality/metrics | GET | Quality metrics dashboard |
| /api/v1/bi/dashboards | GET/POST | AI-powered dashboard management |
Related Sections
- Agent Flow -- End-to-end agent orchestration flow
- Query Architecture -- Query execution details
- Vector Stores -- Qdrant embeddings
- AI Service Deep Dive -- Complete AI service documentation