MATIH Platform is in active MVP development. Documentation reflects current implementation status.
LLM Router Endpoints

The LLM Router API provides endpoints for managing multi-provider LLM routing, viewing routing decisions, configuring provider priorities, and monitoring provider health. The endpoints are backed by the MultiLLMRouterService, which wraps the core LLMRouter and adds persistent storage, analytics, and alert generation.


Route Request

Sends a completion request through the LLM router, which selects the optimal provider based on the active routing strategy.

| Property | Value |
|----------|-------|
| Method | POST |
| Path | /api/v1/llm/route |
| Auth | JWT required |

Request Body

{
  "messages": [
    {"role": "system", "content": "You are a data analyst."},
    {"role": "user", "content": "Explain this SQL query..."}
  ],
  "strategy": "cost_optimized",
  "max_tokens": 2048,
  "temperature": 0.7,
  "tenant_id": "acme-corp"
}

Response

{
  "content": "This SQL query performs...",
  "provider": "openai",
  "model": "gpt-4",
  "tokens_used": {
    "prompt": 150,
    "completion": 420,
    "total": 570
  },
  "latency_ms": 1230,
  "cost_estimate_usd": 0.0285,
  "routing_decision_id": "rd-abc123"
}
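
The request above could be assembled from a client roughly as follows. This is a minimal sketch using only the Python standard library; the host in `BASE_URL`, the `Bearer` authorization scheme, the helper names, and the choice to omit `tenant_id` when it is not supplied are all assumptions for illustration and are not specified by this page.

```python
import json
import urllib.request

# Hypothetical host; replace with the actual AI Service base URL.
BASE_URL = "https://matih.example.com"


def build_route_request(messages, token, strategy="cost_optimized",
                        max_tokens=2048, temperature=0.7, tenant_id=None):
    """Build (but do not send) a POST request for /api/v1/llm/route."""
    payload = {
        "messages": messages,
        "strategy": strategy,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    # Assumption: tenant_id may be left out when the caller has none.
    if tenant_id is not None:
        payload["tenant_id"] = tenant_id
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/llm/route",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the JWT is passed as a Bearer token.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Separating request construction from sending keeps the payload easy to inspect or log before any network call is made. The `routing_decision_id` in the response can be kept for later debugging.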

Get Router Configuration

Returns the current LLM router configuration including provider priorities and strategies.

| Property | Value |
|----------|-------|
| Method | GET |
| Path | /api/v1/llm/config |
| Auth | JWT required (admin) |

Response

{
  "default_strategy": "cost_optimized",
  "providers": [
    {
      "name": "openai",
      "enabled": true,
      "priority": 1,
      "models": ["gpt-4", "gpt-4-turbo", "gpt-3.5-turbo"],
      "rate_limit_rpm": 500,
      "max_tokens": 8192
    },
    {
      "name": "anthropic",
      "enabled": true,
      "priority": 2,
      "models": ["claude-3-opus", "claude-3-sonnet"],
      "rate_limit_rpm": 300,
      "max_tokens": 200000
    }
  ],
  "fallback_chain": ["openai", "anthropic", "vllm"]
}

Update Router Configuration

Updates the default routing strategy, the provider configuration, or both.

| Property | Value |
|----------|-------|
| Method | PUT |
| Path | /api/v1/llm/config |
| Auth | JWT required (admin) |

Request Body

{
  "default_strategy": "quality_optimized",
  "providers": [
    {
      "name": "openai",
      "priority": 1,
      "enabled": true
    }
  ]
}
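
A client may want to validate an update before sending it. The sketch below builds the JSON body for the PUT request and rejects strategy names that do not appear in this page's Routing Strategies list. The function name `build_config_update` is hypothetical, and the assumption that the endpoint accepts a partial body (only the fields being changed) is inferred from the example above rather than confirmed.

```python
import json

# Strategy names as listed under Routing Strategies on this page.
KNOWN_STRATEGIES = {
    "cost_optimized",
    "latency_optimized",
    "quality_optimized",
    "round_robin",
    "fallback",
}


def build_config_update(default_strategy=None, providers=None):
    """Return a JSON string for PUT /api/v1/llm/config.

    Only the fields that were supplied are included (assumed partial update).
    """
    payload = {}
    if default_strategy is not None:
        if default_strategy not in KNOWN_STRATEGIES:
            raise ValueError(f"unknown strategy: {default_strategy!r}")
        payload["default_strategy"] = default_strategy
    if providers:
        for provider in providers:
            # Every provider entry needs a name to identify what is updated.
            if "name" not in provider:
                raise ValueError("each provider entry needs a 'name'")
        payload["providers"] = providers
    if not payload:
        raise ValueError("nothing to update")
    return json.dumps(payload)
```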

Routing Strategies

| Strategy | Description |
|----------|-------------|
| cost_optimized | Routes to the cheapest available provider |
| latency_optimized | Routes to the provider with the lowest average latency |
| quality_optimized | Routes to the provider with the highest quality scores |
| round_robin | Distributes requests evenly across providers |
| fallback | Uses providers in priority order, falling back on failure |
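
To make the strategy descriptions concrete, here is one way the selection step could look. This is an illustrative sketch only, not the actual LLMRouter implementation: the `Provider` fields `cost_per_1k_tokens` and `quality_score`, the name-ordered round robin, and the health filter are assumptions chosen for the example.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Provider:
    name: str
    priority: int
    cost_per_1k_tokens: float  # hypothetical cost metric
    avg_latency_ms: float
    quality_score: float       # hypothetical quality metric
    healthy: bool = True


# Shared counter used to rotate through providers for round_robin.
_round_robin_counter = count()


def select_provider(providers, strategy):
    """Pick one healthy provider according to the named strategy."""
    candidates = [p for p in providers if p.healthy]
    if not candidates:
        raise RuntimeError("no healthy providers available")
    if strategy == "cost_optimized":
        return min(candidates, key=lambda p: p.cost_per_1k_tokens)
    if strategy == "latency_optimized":
        return min(candidates, key=lambda p: p.avg_latency_ms)
    if strategy == "quality_optimized":
        return max(candidates, key=lambda p: p.quality_score)
    if strategy == "round_robin":
        ordered = sorted(candidates, key=lambda p: p.name)
        return ordered[next(_round_robin_counter) % len(ordered)]
    if strategy == "fallback":
        # Lowest priority number first; skipping unhealthy providers
        # stands in for "falling back on failure".
        return min(candidates, key=lambda p: p.priority)
    raise ValueError(f"unknown strategy: {strategy!r}")
```

In this sketch the fallback strategy only chooses the first provider to try. A full implementation would also need to retry with the next provider when a call fails mid-request.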

Get Provider Health

Returns health status and latency metrics for all configured LLM providers.

| Property | Value |
|----------|-------|
| Method | GET |
| Path | /api/v1/llm/health |
| Auth | JWT required |

Response

{
  "providers": [
    {
      "name": "openai",
      "healthy": true,
      "avg_latency_ms": 850,
      "p99_latency_ms": 2400,
      "success_rate": 0.997,
      "requests_last_hour": 1250
    }
  ]
}
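
A monitoring script could scan this response for providers that look degraded. The sketch below is one possible approach; the function name and the thresholds (`min_success_rate`, `max_p99_ms`) are hypothetical values picked for illustration, not limits defined by the platform.

```python
def degraded_providers(health_response, min_success_rate=0.99, max_p99_ms=5000):
    """Return (name, reasons) pairs for providers that breach a threshold."""
    flagged = []
    for provider in health_response.get("providers", []):
        reasons = []
        if not provider.get("healthy", False):
            reasons.append("reported unhealthy")
        if provider.get("success_rate", 1.0) < min_success_rate:
            reasons.append("low success rate")
        if provider.get("p99_latency_ms", 0) > max_p99_ms:
            reasons.append("high p99 latency")
        if reasons:
            flagged.append((provider["name"], reasons))
    return flagged
```

Applied to the sample response above with the default thresholds, no provider is flagged, since `openai` is healthy with a success rate of 0.997 and a p99 latency of 2400 ms.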

List Routing Decisions

Returns recent routing decisions for analytics and debugging.

| Property | Value |
|----------|-------|
| Method | GET |
| Path | /api/v1/llm/decisions |
| Auth | JWT required (admin) |

Query Parameters

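| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| limit | integer | no | Maximum number of results (default 50) |
| provider | string | no | Filter by provider |
| strategy | string | no | Filter by strategy |
| outcome | string | no | Filter by outcome (success, failure, fallback) |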
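
Since every query parameter is optional, a client only needs to include the filters it actually uses. The following sketch builds the request URL with the standard library; the helper name `decisions_url` and the example host are assumptions.

```python
from urllib.parse import urlencode

# Outcome values as listed in the query parameter description.
VALID_OUTCOMES = {"success", "failure", "fallback"}


def decisions_url(base_url, limit=None, provider=None, strategy=None, outcome=None):
    """Build the URL for GET /api/v1/llm/decisions, skipping unset filters."""
    if outcome is not None and outcome not in VALID_OUTCOMES:
        raise ValueError(f"unknown outcome: {outcome!r}")
    params = {
        "limit": limit,
        "provider": provider,
        "strategy": strategy,
        "outcome": outcome,
    }
    # Drop parameters the caller did not set so server defaults apply.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    path = f"{base_url.rstrip('/')}/api/v1/llm/decisions"
    return f"{path}?{query}" if query else path


# Example with a hypothetical host: the 10 most recent fallback decisions.
url = decisions_url("https://matih.example.com", limit=10, outcome="fallback")
```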

Get Usage Summary

Returns token usage and cost summary across providers.

| Property | Value |
|----------|-------|
| Method | GET |
| Path | /api/v1/llm/usage |
| Auth | JWT required (admin) |

Query Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| period | string | no | Time period (hour, day, week, month) |
| tenant_id | string | no | Filter by tenant |

Response

{
  "period": "day",
  "total_requests": 5420,
  "total_tokens": 2150000,
  "total_cost_usd": 85.40,
  "by_provider": {
    "openai": {"requests": 4200, "tokens": 1700000, "cost_usd": 68.00},
    "anthropic": {"requests": 1220, "tokens": 450000, "cost_usd": 17.40}
  }
}
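
The per-provider breakdown can be turned into derived figures such as the effective cost per 1,000 tokens and each provider's share of requests. The sketch below shows one way to do this and to check that the provider rows add up to the reported totals; the function names and the rounding tolerance are assumptions.

```python
def summarize_usage(usage):
    """Compute cost per 1K tokens and request share for each provider."""
    summary = {}
    for name, stats in usage["by_provider"].items():
        summary[name] = {
            "cost_per_1k_tokens": stats["cost_usd"] / stats["tokens"] * 1000,
            "request_share": stats["requests"] / usage["total_requests"],
        }
    return summary


def totals_consistent(usage, cost_tolerance=0.005):
    """Check that provider rows sum to the top-level totals."""
    rows = usage["by_provider"].values()
    return (
        sum(r["requests"] for r in rows) == usage["total_requests"]
        and sum(r["tokens"] for r in rows) == usage["total_tokens"]
        # Costs are floats, so compare within a small tolerance.
        and abs(sum(r["cost_usd"] for r in rows) - usage["total_cost_usd"]) < cost_tolerance
    )
```

For the sample response above, the `openai` row works out to 68.00 / 1,700,000 × 1000 = 0.04 USD per 1K tokens, and the provider rows sum exactly to the reported totals.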