MATIH Platform is in active MVP development. Documentation reflects current implementation status.
12. AI Service
LLM Infrastructure
LLM Infrastructure Overview

Status: Production. Scope: multi-provider router, response cache, context management, validation, MCP, performance monitoring.

The LLM Infrastructure provides a comprehensive layer for managing Large Language Model interactions across the AI Service. It includes a multi-provider router with tenant-level configuration, response caching with cost tracking, context window optimization, input/output validation, Model Context Protocol (MCP) integration, and performance monitoring.


12.6.1 LLM Architecture

API Layer: Router API, Completion API, Cache API, Context API, MCP API, GraphQL API
Services: LLMRouter, ResponseCacheService, ContextOptimizer, ValidationService, MCPService, PerformanceService
Providers: OpenAI, Anthropic, Azure OpenAI, Vertex AI (Gemini), AWS Bedrock, vLLM (Self-hosted)
Storage: CacheStore (Redis), UsageRecords, ValidationHistory, PerformanceMetrics
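To illustrate how the layers above fit together, here is a minimal sketch of the request flow (API layer to router to provider). All class and method names here are hypothetical, chosen for illustration only; they are not the actual MATIH implementation.

```python
# Hypothetical sketch of tenant-aware provider routing.
# Names (Provider, LLMRouter, configure_tenant, route) are illustrative.

class Provider:
    def __init__(self, name):
        self.name = name

    def complete(self, prompt):
        # Stand-in for a real provider API call.
        return f"[{self.name}] reply to: {prompt}"


class LLMRouter:
    """Picks a provider per tenant, falling back to a platform default."""

    def __init__(self, providers, default="openai"):
        self.providers = providers
        self.default = default
        self.tenant_overrides = {}  # tenant_id -> provider name

    def configure_tenant(self, tenant_id, provider_name):
        self.tenant_overrides[tenant_id] = provider_name

    def route(self, tenant_id, prompt):
        name = self.tenant_overrides.get(tenant_id, self.default)
        return self.providers[name].complete(prompt)


providers = {n: Provider(n) for n in ("openai", "anthropic", "vllm")}
router = LLMRouter(providers)
router.configure_tenant("tenant-42", "anthropic")

print(router.route("tenant-42", "hello"))  # tenant override -> anthropic
print(router.route("tenant-7", "hello"))   # no override -> default (openai)
```

The per-tenant override map is the key idea: routing decisions live in configuration, so a tenant can be moved between providers without code changes.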

Provider Configuration

# From src/config/settings.py
llm_provider: str = "openai"           # Default provider
openai_api_key: str = ""               # OpenAI key
openai_model: str = "gpt-4o"           # Default model
anthropic_api_key: str = ""            # Anthropic key
anthropic_model: str = "claude-opus-4-5-20250120"
azure_openai_endpoint: str = ""        # Azure endpoint
vllm_base_url: str = "http://vllm.matih-data-plane.svc.cluster.local:8000/v1"
 
# Extended thinking support
extended_thinking_enabled: bool = False
extended_thinking_budget_tokens: int = 10000
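The fields above suggest a typed settings object with environment-variable overrides (in practice often a pydantic `BaseSettings`). Below is a stdlib-only sketch of that pattern; the class name, `from_env` helper, and environment-variable names are assumptions for illustration, not taken from the MATIH codebase.

```python
import os
from dataclasses import dataclass


@dataclass
class LLMSettings:
    # Mirrors a subset of the fields shown above; defaults are illustrative.
    llm_provider: str = "openai"
    openai_model: str = "gpt-4o"
    anthropic_model: str = "claude-opus-4-5-20250120"
    extended_thinking_enabled: bool = False
    extended_thinking_budget_tokens: int = 10000

    @classmethod
    def from_env(cls):
        # Environment variables override the class defaults (hypothetical names).
        return cls(
            llm_provider=os.environ.get("LLM_PROVIDER", cls.llm_provider),
            openai_model=os.environ.get("OPENAI_MODEL", cls.openai_model),
        )


settings = LLMSettings.from_env()
print(settings.llm_provider)  # "openai" unless LLM_PROVIDER is set
```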

12.6.2 Section Pages

Page                      Description
LLM Router                Multi-provider routing with 5 strategies
Response Cache            Tenant-scoped caching with cost tracking
Context Management        Context window optimization
Context Intelligence      Intelligent context analysis and truncation
Validation                Input/output validation and safety filters
Model Context Protocol    MCP integration for external tools
Performance               Latency tracking, throughput monitoring
Providers                 Provider-specific configuration
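Tenant-scoped caching, as described for the Response Cache page, typically hinges on building cache keys that include the tenant identifier so identical prompts from different tenants never collide. A minimal sketch of such a key scheme (the `llm:cache:` prefix and field layout are assumptions, not the actual MATIH scheme):

```python
import hashlib
import json


def cache_key(tenant_id, model, prompt, params=None):
    """Build a deterministic, tenant-scoped cache key.

    Including tenant_id in the hashed payload isolates tenants:
    the same prompt under two tenants yields two distinct keys.
    """
    payload = json.dumps(
        {
            "tenant": tenant_id,
            "model": model,
            "prompt": prompt,
            "params": params or {},
        },
        sort_keys=True,  # stable serialization -> stable key
    )
    return "llm:cache:" + hashlib.sha256(payload.encode()).hexdigest()


k1 = cache_key("tenant-42", "gpt-4o", "hello")
k2 = cache_key("tenant-7", "gpt-4o", "hello")
assert k1 != k2                                          # tenant isolation
assert k1 == cache_key("tenant-42", "gpt-4o", "hello")   # deterministic
```

Keys like these map directly onto the Redis-backed CacheStore listed in the architecture, where usage records can be attributed per tenant for cost tracking.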