Reasoning Analytics

Reasoning analytics provides insight into agent thinking patterns, model performance, cost, and path frequency, and supports cost optimization. The metrics are computed by aggregating thinking traces stored in Dgraph and are exposed through the analytics API endpoints.

Overview

The analytics subsystem answers operational questions such as:

Which LLM models are most cost-effective for different task types?
What are the most common agent reasoning paths?
Where do agents spend the most time and tokens?
Which reasoning patterns lead to successful outcomes?

Analytics Endpoints

Endpoint	Method	Description
`/api/v1/context-graph/analytics/model-performance`	GET	Per-model token, cost, and quality statistics
`/api/v1/context-graph/analytics/path-analysis`	GET	Agent path frequency and outcome analysis
`/api/v1/context-graph/analytics/cost-breakdown`	GET	Cost breakdown by model, tenant, or session
`/api/v1/context-graph/analytics/latency-distribution`	GET	Latency percentiles per step type

Model Performance Statistics

The model performance endpoint returns per-model metrics:

Metric	Description
`total_calls`	Total number of calls to this model
`total_input_tokens`	Total input tokens consumed
`total_output_tokens`	Total output tokens produced
`total_thinking_tokens`	Total thinking/reasoning tokens
`total_cost_usd`	Total estimated cost in USD
`avg_latency_ms`	Average call latency in milliseconds
`avg_confidence`	Average confidence score
`p50_latency_ms`	Median (50th percentile) latency in milliseconds
`p95_latency_ms`	95th percentile latency in milliseconds
`p99_latency_ms`	99th percentile latency in milliseconds
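
The token and cost totals above can be combined client-side to compare models. The following is a minimal sketch, assuming each response row is a flat dictionary keyed by the metric names in the table. The `model_id` key, the sample values, and the `cost_per_1k_tokens` helper are illustrative assumptions, not part of the documented response.

```python
def cost_per_1k_tokens(row: dict) -> float:
    """Estimated USD per 1,000 tokens (input + output + thinking) for one model row."""
    total_tokens = (
        row["total_input_tokens"]
        + row["total_output_tokens"]
        + row["total_thinking_tokens"]
    )
    if total_tokens == 0:
        return 0.0  # avoid division by zero for models with no recorded tokens
    return row["total_cost_usd"] * 1000 / total_tokens


# Made-up rows for illustration; "model_id" is an assumed field name.
rows = [
    {"model_id": "model-b", "total_input_tokens": 400_000, "total_output_tokens": 80_000,
     "total_thinking_tokens": 20_000, "total_cost_usd": 4.0},
    {"model_id": "model-a", "total_input_tokens": 800_000, "total_output_tokens": 150_000,
     "total_thinking_tokens": 50_000, "total_cost_usd": 5.0},
]

# Cheapest model per 1,000 tokens first.
ranked = sorted(rows, key=cost_per_1k_tokens)
```

Cost per token alone ignores quality, so in practice a ranking like this would likely be read alongside `avg_confidence` and the latency percentiles.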

Path Analysis

Path analysis identifies the most common sequences of agent steps and correlates them with outcomes:

Metric	Description
`path`	Ordered list of step types taken
`count`	Number of traces following this path
`success_count`	Traces with successful outcome
`failure_count`	Traces with failed outcome
`success_rate`	Percentage of successful traces
`avg_duration_ms`	Average execution time for this path
`avg_cost_usd`	Average cost for this path
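
To make the metrics above concrete, here is a hedged sketch of how per-path statistics could be derived from a list of traces. The trace shape (`steps`, `succeeded`, `duration_ms`, `cost_usd`) and the `PLANNING` step type are assumptions made for illustration, and the sketch treats every trace as either a success or a failure. The real aggregation runs over traces in Dgraph and may differ.

```python
from collections import defaultdict


def summarize_paths(traces):
    """Group traces by their ordered step types and compute per-path statistics."""
    groups = defaultdict(list)
    for trace in traces:
        # A tuple keeps the step order and is hashable, so it can act as the group key.
        groups[tuple(trace["steps"])].append(trace)

    stats = []
    for path, members in groups.items():
        count = len(members)
        success = sum(1 for t in members if t["succeeded"])
        stats.append({
            "path": list(path),
            "count": count,
            "success_count": success,
            "failure_count": count - success,  # assumes no third outcome state
            "success_rate": 100.0 * success / count,
            "avg_duration_ms": sum(t["duration_ms"] for t in members) / count,
            "avg_cost_usd": sum(t["cost_usd"] for t in members) / count,
        })
    # Most frequent paths first.
    stats.sort(key=lambda s: s["count"], reverse=True)
    return stats


# Made-up traces for illustration.
traces = [
    {"steps": ["PLANNING", "SQL_GENERATION"], "succeeded": True, "duration_ms": 100, "cost_usd": 0.01},
    {"steps": ["PLANNING", "SQL_GENERATION"], "succeeded": False, "duration_ms": 300, "cost_usd": 0.03},
    {"steps": ["SQL_GENERATION"], "succeeded": True, "duration_ms": 50, "cost_usd": 0.005},
]
path_stats = summarize_paths(traces)
```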

Cost Breakdown

Cost breakdown can be grouped by different dimensions:

Dimension	Description
`by_model`	Cost per LLM model
`by_step_type`	Cost per thinking step type
`by_session`	Cost per user session
`by_time`	Cost over time periods
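
Each dimension amounts to summing step costs under a different grouping key. The sketch below illustrates the idea on in-memory records. The record fields (`model_id`, `step_type`, `session_id`, `timestamp`, `cost_usd`) and the daily bucket used for `by_time` are assumptions, since the bucket size and record schema are not specified here.

```python
from collections import defaultdict
from datetime import datetime

# One key function per dimension; field names are assumed for illustration.
KEY_FUNCS = {
    "by_model": lambda s: s["model_id"],
    "by_step_type": lambda s: s["step_type"],
    "by_session": lambda s: s["session_id"],
    "by_time": lambda s: s["timestamp"].date().isoformat(),  # daily buckets (assumed)
}


def cost_breakdown(steps, dimension):
    """Sum cost_usd over steps, grouped by the chosen dimension."""
    key_func = KEY_FUNCS[dimension]  # raises KeyError for an unknown dimension
    totals = defaultdict(float)
    for step in steps:
        totals[key_func(step)] += step["cost_usd"]
    return dict(totals)


# Made-up step records for illustration.
steps = [
    {"model_id": "model-a", "step_type": "SQL_GENERATION", "session_id": "s1",
     "timestamp": datetime(2024, 1, 1, 9, 0), "cost_usd": 0.5},
    {"model_id": "model-a", "step_type": "SQL_GENERATION", "session_id": "s2",
     "timestamp": datetime(2024, 1, 2, 9, 0), "cost_usd": 0.25},
    {"model_id": "model-b", "step_type": "SQL_GENERATION", "session_id": "s1",
     "timestamp": datetime(2024, 1, 2, 10, 0), "cost_usd": 0.25},
]
per_model = cost_breakdown(steps, "by_model")
per_day = cost_breakdown(steps, "by_time")
```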

Latency Distribution

Latency metrics are provided as percentile distributions per step type:

{
  "step_type": "SQL_GENERATION",
  "p50_ms": 450,
  "p75_ms": 780,
  "p90_ms": 1200,
  "p95_ms": 1800,
  "p99_ms": 3500,
  "count": 15234
}
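
For readers who want to reproduce a distribution like the one above from raw latency samples, the following sketch uses the nearest-rank percentile method. This method is an assumption chosen for simplicity. The percentile algorithm actually used by the service is not stated here, so values may differ slightly from the API output.

```python
import math


def percentile(sorted_values, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not sorted_values:
        raise ValueError("no samples")
    # p is an integer percentage, so p * n stays exact before the division.
    rank = max(1, math.ceil(p * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def latency_distribution(step_type, latencies_ms):
    """Build a record shaped like the latency-distribution example."""
    ordered = sorted(latencies_ms)
    return {
        "step_type": step_type,
        "p50_ms": percentile(ordered, 50),
        "p75_ms": percentile(ordered, 75),
        "p90_ms": percentile(ordered, 90),
        "p95_ms": percentile(ordered, 95),
        "p99_ms": percentile(ordered, 99),
        "count": len(ordered),
    }


# Made-up samples: 10 ms, 20 ms, ..., 1000 ms.
samples = list(range(10, 1010, 10))
dist = latency_distribution("SQL_GENERATION", samples)
```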

RBAC Protection

All analytics endpoints are RBAC-protected. Users need the `context_graph:metrics:read` permission to access analytics data. The visibility level determines how much detail is returned:

Permission	Visibility	Data Shown
`context_graph:traces:read`	Summary	Aggregate counts and rates
`context_graph:metrics:read`	Standard	Per-model and per-path breakdowns
`context_graph:thinking:read`	Detailed	Individual trace details
`context_graph:admin`	Full	Raw data including tokens and costs
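
The table above can be read as a lookup from permission to visibility level. Below is a minimal sketch of that lookup, assuming the levels are ordered from Summary to Full and that a user holding several permissions gets the highest applicable level. That precedence rule and the `resolve_visibility` helper are assumptions, not documented behavior.

```python
# Permission-to-visibility mapping, taken from the table.
VISIBILITY_BY_PERMISSION = {
    "context_graph:traces:read": "Summary",
    "context_graph:metrics:read": "Standard",
    "context_graph:thinking:read": "Detailed",
    "context_graph:admin": "Full",
}

# Assumed ordering, from least to most detail.
VISIBILITY_ORDER = ["Summary", "Standard", "Detailed", "Full"]


def resolve_visibility(permissions):
    """Return the highest visibility level granted by the given permissions, or None."""
    levels = [
        VISIBILITY_BY_PERMISSION[p]
        for p in permissions
        if p in VISIBILITY_BY_PERMISSION
    ]
    if not levels:
        return None  # no analytics-related permission held
    return max(levels, key=VISIBILITY_ORDER.index)


level = resolve_visibility({"context_graph:traces:read", "context_graph:thinking:read"})
```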

Query Parameters

All analytics endpoints accept common query parameters:

Parameter	Type	Description
`tenant_id`	string	Required. Tenant scope
`start_time`	datetime	Start of the time window
`end_time`	datetime	End of the time window
`model_id`	string	Filter by specific model
`session_id`	string	Filter by specific session
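
As an illustration of how these parameters combine with the endpoint paths, the sketch below builds a request URL using only the Python standard library. The host name is a placeholder, and serializing `datetime` values as ISO 8601 strings is an assumption. Authentication is omitted because it is not described here.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

BASE_URL = "https://analytics.example.com"  # placeholder host, not a real deployment
ANALYTICS_PREFIX = "/api/v1/context-graph/analytics"


def build_analytics_url(endpoint, tenant_id, start_time=None, end_time=None,
                        model_id=None, session_id=None):
    """Build a GET URL for an analytics endpoint such as 'model-performance'."""
    if not tenant_id:
        raise ValueError("tenant_id is required")
    params = {"tenant_id": tenant_id}
    # Optional filters are only sent when provided.
    if start_time is not None:
        params["start_time"] = start_time.isoformat()  # ISO 8601 format is assumed
    if end_time is not None:
        params["end_time"] = end_time.isoformat()
    if model_id is not None:
        params["model_id"] = model_id
    if session_id is not None:
        params["session_id"] = session_id
    return f"{BASE_URL}{ANALYTICS_PREFIX}/{endpoint}?{urlencode(params)}"


url = build_analytics_url(
    "model-performance",
    tenant_id="acme",
    start_time=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
```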
