# Context Graph Architecture
The Context Graph is the knowledge infrastructure backbone of the MATIH platform. It captures agent reasoning, entity relationships, and decision lineage in a multi-backend storage system that combines graph databases, vector stores, and bi-temporal relational storage.
## High-Level Architecture
The Context Graph module resides within the AI Service and connects to multiple storage backends depending on the query type and data characteristics.
```
+-----------------------------------------------------+
|                AI Service (Port 8000)               |
|                                                     |
|  +-----------------------------------------------+  |
|  |             Context Graph Module              |  |
|  |                                               |  |
|  |  +------------------+  +-------------------+  |  |
|  |  |  Agent Thinking  |  |   Orchestrator    |  |  |
|  |  |     Service      |  |      Hooks        |  |  |
|  |  +--------+---------+  +---------+---------+  |  |
|  |           |                      |            |  |
|  |  +--------v---------+  +---------v---------+  |  |
|  |  |  Dgraph Context  |  | Kafka Producer /  |  |  |
|  |  |      Store       |  |     Consumer      |  |  |
|  |  +--------+---------+  +---------+---------+  |  |
|  |           |                      |            |  |
|  |  +--------v---------+  +---------v---------+  |  |
|  |  |   Hybrid Store   |  |  Metrics Bridge   |  |  |
|  |  |    (GraphRAG)    |  |                   |  |  |
|  |  +------------------+  +-------------------+  |  |
|  +-----------------------------------------------+  |
+-----------------------------------------------------+
         |               |               |
    +----v----+     +----v----+     +----v--------+
    | Dgraph  |     | Pinecone|     | PostgreSQL  |
    | (Graph) |     | (Vector)|     | (Bitemporal)|
    +---------+     +---------+     +-------------+
```

## Storage Backends
The Context Graph uses three complementary storage backends, each optimized for different query patterns.
| Backend | Technology | Purpose | Data Types |
|---|---|---|---|
| Graph Store | Dgraph | Entity relationships, thinking traces, traversals | Entities, relationships, traces |
| Vector Store | Pinecone / Qdrant | Semantic similarity, embedding search | Entity embeddings, decision rationale vectors |
| Bi-Temporal Store | PostgreSQL + TimescaleDB | Point-in-time queries, version history, decisions | Events, decisions, entity versions |
## Multi-Tenant Isolation

Every operation in the Context Graph requires a `tenant_id` parameter. Isolation is enforced at each storage layer:

- Dgraph: GraphQL queries filter on the `tenantId` field of all entity types
- Pinecone: Namespaces are prefixed with the tenant ID (format: `tenant_id:namespace`)
- PostgreSQL: All tables include a `tenant_id` column with indexed filters
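The three isolation rules above can be sketched as a small helper that threads `tenant_id` through each backend. `TenantScope` and its method names are illustrative, not actual MATIH APIs:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantScope:
    """Carries the tenant_id that every Context Graph operation must supply."""
    tenant_id: str

    def pinecone_namespace(self, namespace: str) -> str:
        # Pinecone namespaces are prefixed with the tenant ID (tenant_id:namespace).
        return f"{self.tenant_id}:{namespace}"

    def dgraph_filter(self) -> dict:
        # Dgraph GraphQL queries filter on the tenantId field of each entity type.
        return {"tenantId": {"eq": self.tenant_id}}

    def sql_predicate(self) -> tuple:
        # PostgreSQL tables carry an indexed tenant_id column.
        return ("tenant_id = %s", (self.tenant_id,))


scope = TenantScope("acme-corp")
print(scope.pinecone_namespace("decisions"))  # acme-corp:decisions
```

Passing an explicit scope object (rather than a bare string) makes it harder to forget the tenant filter on any one backend.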
## Data Flow

Agent interactions flow through the Context Graph in a pipeline pattern:

1. Capture -- Orchestrator hooks intercept agent thinking steps and LLM calls
2. Persist -- The Thinking Service stores traces in Dgraph with optional Kafka streaming
3. Embed -- The Thinking Embedding Service generates vectors for input, output, and reasoning
4. Index -- Vectors are upserted into Pinecone namespaces for similarity search
5. Query -- The Semantic Search Service combines vector and graph results for retrieval
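The Persist, Embed, and Index stages above can be sketched end to end with in-memory stand-ins for the real stores. The class and function names here are illustrative (the actual services are `DgraphContextStore`, the Thinking Embedding Service, and the Pinecone client):

```python
class InMemoryTraceStore:
    """Stand-in for the Dgraph trace store (Persist stage)."""
    def __init__(self):
        self.traces = {}

    def save_trace(self, tenant_id, trace):
        trace_id = f"trace-{len(self.traces)}"
        self.traces[trace_id] = {"tenant_id": tenant_id, **trace}
        return trace_id


class ToyEmbedder:
    """Stand-in for the Thinking Embedding Service (Embed stage)."""
    def embed(self, texts):
        # Toy 1-dimensional "embedding": the length of each text.
        return [[float(len(t))] for t in texts]


class InMemoryVectorIndex:
    """Stand-in for the Pinecone client (Index stage)."""
    def __init__(self):
        self.namespaces = {}

    def upsert(self, namespace, vectors):
        self.namespaces.setdefault(namespace, {}).update(dict(vectors))


def capture_to_index(step, tenant_id, store, embedder, index):
    """Run one captured thinking step through Persist -> Embed -> Index."""
    trace_id = store.save_trace(tenant_id, step)                        # Persist
    vectors = embedder.embed(
        [step["input"], step["output"], step["reasoning"]])             # Embed
    index.upsert(f"{tenant_id}:thinking",
                 [(f"{trace_id}:{i}", v) for i, v in enumerate(vectors)])  # Index
    return trace_id


store, embedder, index = InMemoryTraceStore(), ToyEmbedder(), InMemoryVectorIndex()
trace_id = capture_to_index(
    {"input": "user question", "output": "agent answer",
     "reasoning": "chain of thought"},
    "acme-corp", store, embedder, index)
```

Each stage only needs the previous stage's identifier, so any stage (e.g. Kafka streaming) can be toggled off without breaking the ones before it.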
## Key Components

| Component | Source Path | Description |
|---|---|---|
| `DgraphContextStore` | `storage/dgraph_context_store.py` | CRUD for thinking traces in Dgraph |
| `HybridStore` | `storage/hybrid_store.py` | Combined graph + vector search (GraphRAG) |
| `BiTemporalStore` | `storage/bitemporal_store.py` | PostgreSQL bi-temporal storage for decisions |
| `PineconeVectorStore` | `storage/pinecone_vector_store.py` | Production Pinecone vector operations |
| `AgentThinkingCaptureService` | `services/agent_thinking_service.py` | Captures agent reasoning during execution |
| `ContextGraphOrchestratorHooks` | `integration/orchestrator_hooks.py` | Non-invasive integration with orchestrators |
| `ContextGraphKafkaConsumer` | `integration/kafka_consumer.py` | Consumes events from Kafka topics |
| `MetricsBridge` | `integration/metrics_bridge.py` | Links observability metrics to thinking traces |
| `SemanticSearchService` | `services/semantic_search_service.py` | Unified search orchestrator |
| `ContextGraphAuthorizer` | `security/authorization.py` | RBAC enforcement for API endpoints |
## GraphRAG Retrieval Strategies

The Hybrid Store implements multiple retrieval strategies based on the Microsoft GraphRAG architecture:

| Strategy | Description | Use Case |
|---|---|---|
| `LOCAL` | Vector similarity + 1-hop graph neighbors | Quick entity lookup |
| `GLOBAL` | Community detection + summary embeddings | Broad topic exploration |
| `DRIFT` | Temporal trajectory patterns | Detecting changes over time |
| `HYBRID` | Combined vector + graph scoring with configurable weights | Default for most queries |
| `GRAPH_FIRST` | Graph traversal followed by vector reranking | When structure matters most |
| `VECTOR_FIRST` | Vector search followed by graph expansion | When semantics matter most |
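The `HYBRID` strategy's "configurable weights" can be sketched as a normalized weighted sum. The 0.6/0.4 defaults and the candidate fields below are illustrative, not the Hybrid Store's actual values:

```python
def hybrid_score(vector_score: float, graph_score: float,
                 vector_weight: float = 0.6, graph_weight: float = 0.4) -> float:
    """Combine a vector-similarity score and a graph-relevance score.

    Normalizing by the weight sum keeps the result in the same range as
    the inputs, whatever weights the caller configures.
    """
    total = vector_weight + graph_weight
    return (vector_weight * vector_score + graph_weight * graph_score) / total


# Rank hypothetical candidates: e1 wins on semantics, e2 on graph structure.
candidates = [
    {"id": "e1", "vector_score": 0.92, "graph_score": 0.30},
    {"id": "e2", "vector_score": 0.75, "graph_score": 0.80},
]
ranked = sorted(
    candidates,
    key=lambda c: hybrid_score(c["vector_score"], c["graph_score"]),
    reverse=True)
```

Shifting the weights toward `graph_weight` moves the behavior toward `GRAPH_FIRST`-like results; shifting toward `vector_weight` approximates `VECTOR_FIRST`.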
## Feature Flag Control

Context Graph features are rolled out per tenant using the Semantic Feature Flags service. Features can be toggled independently:

| Feature Flag | Description |
|---|---|
| `context_graph_thinking` | Enable agent thinking trace capture |
| `context_graph_kafka` | Enable Kafka streaming of context events |
| `context_graph_embeddings` | Enable embedding generation for traces |
| `context_graph_rbac` | Enable fine-grained RBAC on API endpoints |
Rollout modes include `disabled`, `canary` (hash-based percentage), `partial`, and `full`.
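Hash-based canary bucketing can be sketched as below. The SHA-256 scheme and the `flag:tenant_id` salt format are assumptions, not necessarily what the Semantic Feature Flags service uses:

```python
import hashlib


def in_canary(tenant_id: str, flag: str, percentage: int) -> bool:
    """Deterministically bucket a tenant into 0-99 and compare to the rollout %.

    Hashing flag and tenant together gives each flag an independent bucket
    assignment, so enabling one flag at 10% does not pick the same tenants
    as another flag at 10%.
    """
    digest = hashlib.sha256(f"{flag}:{tenant_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < percentage
```

Because the bucket is a pure function of the inputs, a tenant stays in (or out of) the canary consistently across requests and restarts.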
## Configuration

The Context Graph is configured through environment variables. No credentials are hardcoded.

| Variable | Description | Default |
|---|---|---|
| `DGRAPH_GRPC_URL` | Dgraph gRPC endpoint | `localhost:9080` |
| `DGRAPH_HTTP_URL` | Dgraph HTTP/GraphQL endpoint | `localhost:8080` |
| `PINECONE_API_KEY` | Pinecone API key (from secret) | -- |
| `PINECONE_ENVIRONMENT` | Pinecone cloud region | `us-east-1-aws` |
| `PINECONE_INDEX_PREFIX` | Prefix for Pinecone index names | `matih-context` |
| `BITEMPORAL_DATABASE_URL` | PostgreSQL connection string (from secret) | Falls back to `DATABASE_URL` |
| `KAFKA_BOOTSTRAP_SERVERS` | Kafka bootstrap servers | `localhost:29092` |
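The table above maps to a loader like this sketch. The function name and returned dict shape are illustrative; the defaults and the `BITEMPORAL_DATABASE_URL` -> `DATABASE_URL` fallback follow the table:

```python
import os


def load_context_graph_config(env=None) -> dict:
    """Read Context Graph settings, applying the documented defaults.

    `env` defaults to os.environ; pass a plain dict for testing.
    """
    env = os.environ if env is None else env
    return {
        "dgraph_grpc_url": env.get("DGRAPH_GRPC_URL", "localhost:9080"),
        "dgraph_http_url": env.get("DGRAPH_HTTP_URL", "localhost:8080"),
        # Secret-sourced values have no default; missing means "not configured".
        "pinecone_api_key": env.get("PINECONE_API_KEY"),
        "pinecone_environment": env.get("PINECONE_ENVIRONMENT", "us-east-1-aws"),
        "pinecone_index_prefix": env.get("PINECONE_INDEX_PREFIX", "matih-context"),
        # Bi-temporal store falls back to the service-wide DATABASE_URL.
        "bitemporal_database_url": env.get("BITEMPORAL_DATABASE_URL",
                                           env.get("DATABASE_URL")),
        "kafka_bootstrap_servers": env.get("KAFKA_BOOTSTRAP_SERVERS",
                                           "localhost:29092"),
    }
```

Reading every variable in one place makes it easy to fail fast at startup if a required secret (such as `PINECONE_API_KEY`) is absent.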