Thinking Traces
A thinking trace is a structured record of an agent's complete reasoning process for a single request. It captures the goal, every intermediate thinking step, all LLM API calls made, token usage, cost, and the final outcome. Traces are persisted to Dgraph and streamed to Kafka for downstream analytics.
Trace Lifecycle
- Start -- A trace is created when an agent begins processing a request
- Record Steps -- Each reasoning step is appended to the trace
- Record API Calls -- Each LLM or external API call is logged
- Complete -- The trace is finalized with an outcome and total metrics
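
The four lifecycle stages can be pictured as a small state machine. The sketch below is illustrative only: `TraceSketch` and its methods are hypothetical names, not part of the service. It also assumes that a finalized trace rejects further updates and that a `failure` outcome maps to the `failed` status; neither is stated above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TraceSketch:
    """Hypothetical in-memory stand-in for a thinking trace (not the real model)."""

    goal: str
    status: str = "active"         # active, completed, failed, cancelled
    outcome: Optional[str] = None  # success, partial, failure
    steps: List[str] = field(default_factory=list)
    api_calls: List[str] = field(default_factory=list)

    def record_step(self, reasoning: str) -> None:
        self._require_active()
        self.steps.append(reasoning)

    def record_api_call(self, endpoint: str) -> None:
        self._require_active()
        self.api_calls.append(endpoint)

    def complete(self, outcome: str) -> None:
        self._require_active()
        self.outcome = outcome
        # Assumption: a "failure" outcome corresponds to the "failed" status.
        self.status = "failed" if outcome == "failure" else "completed"

    def _require_active(self) -> None:
        # Assumption: once finalized, a trace accepts no further updates.
        if self.status != "active":
            raise RuntimeError(f"trace is {self.status}; no further updates allowed")


trace = TraceSketch(goal="Show me total sales by region")  # Start
trace.record_step("User wants sales grouped by region")     # Record Steps
trace.record_api_call("llm:chat-completion")                # Record API Calls
trace.complete("success")                                   # Complete
```
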
Trace Structure
| Field | Type | Description |
|---|---|---|
| `trace_id` | string | Unique identifier for the trace |
| `parent_trace_id` | string | Parent trace ID for nested agent calls |
| `tenant_id` | string | Tenant scope |
| `session_id` | string | User session identifier |
| `actor_urn` | string | URN of the agent or user |
| `goal` | string | Natural language description of the task |
| `status` | string | `active`, `completed`, `failed`, `cancelled` |
| `outcome` | string | `success`, `partial`, `failure` |
| `total_thinking_tokens` | int | Total thinking/reasoning tokens consumed |
| `total_input_tokens` | int | Total input tokens across all steps |
| `total_output_tokens` | int | Total output tokens across all steps |
| `total_cost_usd` | float | Total estimated cost in USD |
| `total_duration_ms` | float | Total execution time in milliseconds |
| `model_ids_used` | list | List of model IDs used during execution |
| `path_taken` | list | Ordered list of agent steps taken |
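
Because `parent_trace_id` links nested agent calls, a flat list of traces can be rebuilt into a tree. The following is a minimal sketch of that idea using plain dictionaries. The helper names are hypothetical, and it assumes that root traces carry no `parent_trace_id` and that each trace's `total_cost_usd` covers only its own work, which the table above does not specify.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def build_children_index(traces: List[dict]) -> Dict[Optional[str], List[str]]:
    """Map each parent trace ID (None for roots) to its child trace IDs."""
    children: Dict[Optional[str], List[str]] = defaultdict(list)
    for t in traces:
        children[t.get("parent_trace_id")].append(t["trace_id"])
    return dict(children)


def rolled_up_cost(trace_id: str, by_id: Dict[str, dict],
                   children: Dict[Optional[str], List[str]]) -> float:
    """Sum a trace's own cost plus the cost of all nested traces."""
    cost = by_id[trace_id]["total_cost_usd"]
    for child_id in children.get(trace_id, []):
        cost += rolled_up_cost(child_id, by_id, children)
    return cost


# Hypothetical sample data: one root trace with two levels of nesting.
sample_traces = [
    {"trace_id": "t-root", "parent_trace_id": None, "total_cost_usd": 0.5},
    {"trace_id": "t-child", "parent_trace_id": "t-root", "total_cost_usd": 0.25},
    {"trace_id": "t-grandchild", "parent_trace_id": "t-child", "total_cost_usd": 0.125},
]
by_id = {t["trace_id"]: t for t in sample_traces}
children = build_children_index(sample_traces)
root_total = rolled_up_cost("t-root", by_id, children)  # 0.5 + 0.25 + 0.125
```
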
Usage
Start a Thinking Trace

    # Assumes `store` is an already-initialized Dgraph store and that this
    # code runs inside an async function.
    service = AgentThinkingCaptureService(dgraph_store=store)
    trace_id = await service.start_thinking_trace(
        tenant_id="acme",
        session_id="sess-123",
        actor_urn="urn:matih:agent:acme:bi-agent",
        goal="Show me total sales by region",
    )

Record a Thinking Step

    # Append one reasoning step to the trace started above.
    await service.record_thinking_step(
        trace_id,
        AgentThinkingStep(
            step_type=ThinkingStepType.INTENT_ANALYSIS,
            reasoning="User wants aggregated sales data grouped by region",
            confidence=0.95,  # confidence score on a 0-1 scale
            model_id="gpt-4",
            token_usage=TokenUsageRecord(
                input_tokens=150,
                output_tokens=50,
                thinking_tokens=200,
            ),
        ),
    )

Complete a Thinking Trace

    # Finalize the trace; outcome is one of: success, partial, failure.
    await service.complete_thinking_trace(
        trace_id,
        outcome="success",
    )

Thinking Step Fields
| Field | Type | Description |
|---|---|---|
| `step_id` | string | Unique step identifier |
| `step_type` | enum | Type of reasoning step |
| `sequence_number` | int | Order within the trace |
| `reasoning` | string | The agent's reasoning text |
| `reasoning_hash` | string | SHA-256 hash of the reasoning text, used for deduplication |
| `input_summary` | string | Summary of input to this step |
| `output_summary` | string | Summary of output from this step |
| `alternatives_considered` | list | Other options the agent considered |
| `selected_alternative` | string | Which alternative was chosen |
| `confidence` | float | Confidence score (0-1) |
| `duration_ms` | float | Step execution time in milliseconds |
| `model_id` | string | LLM model used for this step |
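
To illustrate how a `reasoning_hash` could support deduplication, here is a minimal sketch built on Python's standard `hashlib`. The function names are hypothetical, and collapsing whitespace before hashing is an assumption made for the example; the service may hash the raw text instead.

```python
import hashlib
from typing import Dict, List


def reasoning_hash(reasoning: str) -> str:
    """Return the SHA-256 hex digest of a reasoning string."""
    # Assumption: whitespace is collapsed so trivially different copies match.
    normalized = " ".join(reasoning.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def dedupe_steps(steps: List[Dict]) -> List[Dict]:
    """Keep the first step for each distinct reasoning hash, preserving order."""
    seen = set()
    unique = []
    for step in steps:
        digest = reasoning_hash(step["reasoning"])
        if digest not in seen:
            seen.add(digest)
            unique.append(step)
    return unique


# Hypothetical sample steps; the first two differ only in spacing.
sample_steps = [
    {"sequence_number": 1, "reasoning": "Group sales by region"},
    {"sequence_number": 2, "reasoning": "Group  sales by region"},
    {"sequence_number": 3, "reasoning": "Sum the sales column"},
]
unique_steps = dedupe_steps(sample_steps)
```
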
API Call Records
Each LLM or external API call is recorded with the following fields:
| Field | Type | Description |
|---|---|---|
| `call_id` | string | Unique call identifier |
| `api_type` | string | Type of API (e.g., `llm`, `database`, `tool`) |
| `endpoint` | string | API endpoint called |
| `method` | string | HTTP method |
| `status_code` | int | HTTP response status code |
| `latency_ms` | float | Call latency in milliseconds |
| `request_tokens` | int | Tokens in the request |
| `response_tokens` | int | Tokens in the response |
| `cost_usd` | float | Estimated cost in USD |
| `error` | string | Error message if the call failed |
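
As an example of what these records make possible, the sketch below summarizes a list of call records represented as plain dictionaries. `summarize_api_calls` is a hypothetical helper, and treating a call as failed when `error` is set or `status_code` is 400 or higher is an assumption, not documented behavior.

```python
from typing import Dict, List


def summarize_api_calls(calls: List[Dict]) -> Dict:
    """Aggregate count, failures, cost, tokens, and latency over call records."""
    # Assumption: a call failed if it has an error message or a 4xx/5xx status.
    failed = [
        c for c in calls
        if c.get("error") or (c.get("status_code") or 0) >= 400
    ]
    total_latency = sum(c.get("latency_ms", 0.0) for c in calls)
    return {
        "call_count": len(calls),
        "failed_count": len(failed),
        "total_cost_usd": sum(c.get("cost_usd", 0.0) for c in calls),
        "total_tokens": sum(
            c.get("request_tokens", 0) + c.get("response_tokens", 0) for c in calls
        ),
        "avg_latency_ms": total_latency / len(calls) if calls else 0.0,
    }


# Hypothetical sample records using the fields from the table above.
sample_calls = [
    {"call_id": "c1", "api_type": "llm", "status_code": 200, "latency_ms": 100.0,
     "request_tokens": 150, "response_tokens": 50, "cost_usd": 0.5, "error": None},
    {"call_id": "c2", "api_type": "database", "status_code": 200, "latency_ms": 200.0,
     "request_tokens": 0, "response_tokens": 0, "cost_usd": 0.0, "error": None},
    {"call_id": "c3", "api_type": "llm", "status_code": 500, "latency_ms": 300.0,
     "request_tokens": 120, "response_tokens": 0, "cost_usd": 0.25,
     "error": "upstream timeout"},
]
summary = summarize_api_calls(sample_calls)
```
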
Querying Traces

    # Fetch up to 50 completed traces for one session, starting at the first result.
    traces = await store.query_thinking_traces(
        tenant_id="acme",
        session_id="sess-123",
        status="completed",
        limit=50,
        offset=0,
    )
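
The `limit` and `offset` parameters suggest offset-based pagination. The sketch below shows one way to page through all matching traces. `FakeStore` is a hypothetical in-memory stand-in so the example runs on its own, and `fetch_all_traces` is an illustrative helper; it assumes the query returns a list and that a page shorter than `limit` marks the end of the results.

```python
import asyncio
from typing import Dict, List, Optional


class FakeStore:
    """Hypothetical in-memory stand-in for the real store, for illustration only."""

    def __init__(self, traces: List[Dict]) -> None:
        self._traces = traces

    async def query_thinking_traces(self, tenant_id: str,
                                    session_id: Optional[str] = None,
                                    status: Optional[str] = None,
                                    limit: int = 50, offset: int = 0) -> List[Dict]:
        matches = [
            t for t in self._traces
            if t["tenant_id"] == tenant_id
            and (session_id is None or t["session_id"] == session_id)
            and (status is None or t["status"] == status)
        ]
        return matches[offset:offset + limit]


async def fetch_all_traces(store, page_size: int = 50, **filters) -> List[Dict]:
    """Request successive pages until a short page signals the end."""
    results: List[Dict] = []
    offset = 0
    while True:
        page = await store.query_thinking_traces(
            limit=page_size, offset=offset, **filters
        )
        results.extend(page)
        if len(page) < page_size:
            break
        offset += page_size
    return results


# Hypothetical data: 120 completed traces for "acme", 5 for another tenant.
fake_store = FakeStore(
    [{"trace_id": f"t-{i}", "tenant_id": "acme", "session_id": "sess-123",
      "status": "completed"} for i in range(120)]
    + [{"trace_id": f"o-{i}", "tenant_id": "other", "session_id": "sess-9",
        "status": "completed"} for i in range(5)]
)
all_traces = asyncio.run(
    fetch_all_traces(fake_store, page_size=50, tenant_id="acme", status="completed")
)
```

With large result sets, a caller would likely process each page as it arrives rather than accumulating everything in memory.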