# Context Graph Architecture
The Context Graph is the knowledge infrastructure backbone of the MATIH platform. It captures agent reasoning, entity relationships, and decision lineage in a multi-backend storage system that combines graph databases, vector stores, and bi-temporal relational storage.
## High-Level Architecture
The Context Graph module resides within the AI Service and connects to multiple storage backends depending on the query type and data characteristics.
```
+-----------------------------------------------------+
|                AI Service (Port 8000)               |
|                                                     |
|  +-----------------------------------------------+  |
|  |             Context Graph Module              |  |
|  |                                               |  |
|  |  +------------------+  +-------------------+  |  |
|  |  |  Agent Thinking  |  |   Orchestrator    |  |  |
|  |  |     Service      |  |      Hooks        |  |  |
|  |  +--------+---------+  +---------+---------+  |  |
|  |           |                      |            |  |
|  |  +--------v---------+  +---------v---------+  |  |
|  |  |  Dgraph Context  |  | Kafka Producer /  |  |  |
|  |  |      Store       |  |     Consumer      |  |  |
|  |  +--------+---------+  +---------+---------+  |  |
|  |           |                      |            |  |
|  |  +--------v---------+  +---------v---------+  |  |
|  |  |   Hybrid Store   |  |  Metrics Bridge   |  |  |
|  |  |    (GraphRAG)    |  |                   |  |  |
|  |  +------------------+  +-------------------+  |  |
|  +-----------------------------------------------+  |
+-----------------------------------------------------+
         |               |               |
    +----v----+     +----v----+     +----v--------+
    | Dgraph  |     | Pinecone|     | PostgreSQL  |
    | (Graph) |     | (Vector)|     | (Bitemporal)|
    +---------+     +---------+     +-------------+
```

## Storage Backends
The Context Graph uses three complementary storage backends, each optimized for different query patterns.
| Backend | Technology | Purpose | Data Types |
|---|---|---|---|
| Graph Store | Dgraph | Entity relationships, thinking traces, traversals | Entities, relationships, traces |
| Vector Store | Pinecone / Qdrant | Semantic similarity, embedding search | Entity embeddings, decision rationale vectors |
| Bi-Temporal Store | PostgreSQL + TimescaleDB | Point-in-time queries, version history, decisions | Events, decisions, entity versions |
## Multi-Tenant Isolation

Every operation in the Context Graph requires a `tenant_id` parameter. Isolation is enforced at each storage layer:

- Dgraph: GraphQL queries filter on the `tenantId` field of all entity types
- Pinecone: Namespaces are prefixed with the tenant ID (format: `tenant_id:namespace`)
- PostgreSQL: All tables include a `tenant_id` column with indexed filters
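The three isolation rules above can be sketched as a small helper that threads `tenant_id` through each backend. `TenantScope` and its method names are illustrative, not actual MATIH APIs:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantScope:
    """Carries the tenant_id that every Context Graph operation must supply."""
    tenant_id: str

    def pinecone_namespace(self, namespace: str) -> str:
        # Pinecone namespaces are prefixed with the tenant ID (tenant_id:namespace).
        return f"{self.tenant_id}:{namespace}"

    def dgraph_filter(self) -> dict:
        # Dgraph GraphQL queries filter on the tenantId field of each entity type.
        return {"tenantId": {"eq": self.tenant_id}}

    def sql_predicate(self) -> tuple:
        # PostgreSQL tables carry an indexed tenant_id column.
        return ("tenant_id = %s", (self.tenant_id,))


scope = TenantScope("acme-corp")
print(scope.pinecone_namespace("decisions"))  # acme-corp:decisions
```

Passing an explicit scope object (rather than a bare string) makes it harder to forget the tenant filter on any one backend.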
## Data Flow

Agent interactions flow through the Context Graph in a pipeline pattern:

1. Capture -- Orchestrator hooks intercept agent thinking steps and LLM calls
2. Persist -- The Thinking Service stores traces in Dgraph with optional Kafka streaming
3. Embed -- The Thinking Embedding Service generates vectors for input, output, and reasoning
4. Index -- Vectors are upserted into Pinecone namespaces for similarity search
5. Query -- The Semantic Search Service combines vector and graph results for retrieval
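The Persist, Embed, and Index stages above can be sketched end to end with in-memory stand-ins for the real stores. The class and function names here are illustrative (the actual services are `DgraphContextStore`, the Thinking Embedding Service, and the Pinecone client):

```python
class InMemoryTraceStore:
    """Stand-in for the Dgraph trace store (Persist stage)."""
    def __init__(self):
        self.traces = {}

    def save_trace(self, tenant_id, trace):
        trace_id = f"trace-{len(self.traces)}"
        self.traces[trace_id] = {"tenant_id": tenant_id, **trace}
        return trace_id


class ToyEmbedder:
    """Stand-in for the Thinking Embedding Service (Embed stage)."""
    def embed(self, texts):
        # Toy 1-dimensional "embedding": the length of each text.
        return [[float(len(t))] for t in texts]


class InMemoryVectorIndex:
    """Stand-in for the Pinecone client (Index stage)."""
    def __init__(self):
        self.namespaces = {}

    def upsert(self, namespace, vectors):
        self.namespaces.setdefault(namespace, {}).update(dict(vectors))


def capture_to_index(step, tenant_id, store, embedder, index):
    """Run one captured thinking step through Persist -> Embed -> Index."""
    trace_id = store.save_trace(tenant_id, step)                        # Persist
    vectors = embedder.embed(
        [step["input"], step["output"], step["reasoning"]])             # Embed
    index.upsert(f"{tenant_id}:thinking",
                 [(f"{trace_id}:{i}", v) for i, v in enumerate(vectors)])  # Index
    return trace_id


store, embedder, index = InMemoryTraceStore(), ToyEmbedder(), InMemoryVectorIndex()
trace_id = capture_to_index(
    {"input": "user question", "output": "agent answer",
     "reasoning": "chain of thought"},
    "acme-corp", store, embedder, index)
```

Each stage only needs the previous stage's identifier, so any stage (e.g. Kafka streaming) can be toggled off without breaking the ones before it.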
## Key Components

| Component | Source Path | Description |
|---|---|---|
| `DgraphContextStore` | `storage/dgraph_context_store.py` | CRUD for thinking traces in Dgraph |
| `HybridStore` | `storage/hybrid_store.py` | Combined graph + vector search (GraphRAG) |
| `BiTemporalStore` | `storage/bitemporal_store.py` | PostgreSQL bi-temporal storage for decisions |
| `PineconeVectorStore` | `storage/pinecone_vector_store.py` | Production Pinecone vector operations |
| `AgentThinkingCaptureService` | `services/agent_thinking_service.py` | Captures agent reasoning during execution |
| `ContextGraphOrchestratorHooks` | `integration/orchestrator_hooks.py` | Non-invasive integration with orchestrators |
| `ContextGraphKafkaConsumer` | `integration/kafka_consumer.py` | Consumes events from Kafka topics |
| `MetricsBridge` | `integration/metrics_bridge.py` | Links observability metrics to thinking traces |
| `SemanticSearchService` | `services/semantic_search_service.py` | Unified search orchestrator |
| `ContextGraphAuthorizer` | `security/authorization.py` | RBAC enforcement for API endpoints |
## GraphRAG Retrieval Strategies

The Hybrid Store implements multiple retrieval strategies based on the Microsoft GraphRAG architecture:

| Strategy | Description | Use Case |
|---|---|---|
| `LOCAL` | Vector similarity + 1-hop graph neighbors | Quick entity lookup |
| `GLOBAL` | Community detection + summary embeddings | Broad topic exploration |
| `DRIFT` | Temporal trajectory patterns | Detecting changes over time |
| `HYBRID` | Combined vector + graph scoring with configurable weights | Default for most queries |
| `GRAPH_FIRST` | Graph traversal followed by vector reranking | When structure matters most |
| `VECTOR_FIRST` | Vector search followed by graph expansion | When semantics matter most |
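The `HYBRID` strategy's "configurable weights" can be sketched as a normalized weighted sum. The 0.6/0.4 defaults and the candidate fields below are illustrative, not the Hybrid Store's actual values:

```python
def hybrid_score(vector_score: float, graph_score: float,
                 vector_weight: float = 0.6, graph_weight: float = 0.4) -> float:
    """Combine a vector-similarity score and a graph-relevance score.

    Normalizing by the weight sum keeps the result in the same range as
    the inputs, whatever weights the caller configures.
    """
    total = vector_weight + graph_weight
    return (vector_weight * vector_score + graph_weight * graph_score) / total


# Rank hypothetical candidates: e1 wins on semantics, e2 on graph structure.
candidates = [
    {"id": "e1", "vector_score": 0.92, "graph_score": 0.30},
    {"id": "e2", "vector_score": 0.75, "graph_score": 0.80},
]
ranked = sorted(
    candidates,
    key=lambda c: hybrid_score(c["vector_score"], c["graph_score"]),
    reverse=True)
```

Shifting the weights toward `graph_weight` moves the behavior toward `GRAPH_FIRST`-like results; shifting toward `vector_weight` approximates `VECTOR_FIRST`.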
## Feature Flag Control

Context Graph features are rolled out per tenant using the Semantic Feature Flags service. Features can be toggled independently:

| Feature Flag | Description |
|---|---|
| `context_graph_thinking` | Enable agent thinking trace capture |
| `context_graph_kafka` | Enable Kafka streaming of context events |
| `context_graph_embeddings` | Enable embedding generation for traces |
| `context_graph_rbac` | Enable fine-grained RBAC on API endpoints |
Rollout modes include `disabled`, `canary` (hash-based percentage), `partial`, and `full`.
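Hash-based canary bucketing can be sketched as below. The SHA-256 scheme and the `flag:tenant_id` salt format are assumptions, not necessarily what the Semantic Feature Flags service uses:

```python
import hashlib


def in_canary(tenant_id: str, flag: str, percentage: int) -> bool:
    """Deterministically bucket a tenant into 0-99 and compare to the rollout %.

    Hashing flag and tenant together gives each flag an independent bucket
    assignment, so enabling one flag at 10% does not pick the same tenants
    as another flag at 10%.
    """
    digest = hashlib.sha256(f"{flag}:{tenant_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < percentage
```

Because the bucket is a pure function of the inputs, a tenant stays in (or out of) the canary consistently across requests and restarts.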
## Configuration

The Context Graph is configured through environment variables. No credentials are hardcoded.

| Variable | Description | Default |
|---|---|---|
| `DGRAPH_GRPC_URL` | Dgraph gRPC endpoint | `localhost:9080` |
| `DGRAPH_HTTP_URL` | Dgraph HTTP/GraphQL endpoint | `localhost:8080` |
| `PINECONE_API_KEY` | Pinecone API key (from secret) | -- |
| `PINECONE_ENVIRONMENT` | Pinecone cloud region | `us-east-1-aws` |
| `PINECONE_INDEX_PREFIX` | Prefix for Pinecone index names | `matih-context` |
| `BITEMPORAL_DATABASE_URL` | PostgreSQL connection string (from secret) | Falls back to `DATABASE_URL` |
| `KAFKA_BOOTSTRAP_SERVERS` | Kafka bootstrap servers | `localhost:29092` |
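The table above maps to a loader like this sketch. The function name and returned dict shape are illustrative; the defaults and the `BITEMPORAL_DATABASE_URL` -> `DATABASE_URL` fallback follow the table:

```python
import os


def load_context_graph_config(env=None) -> dict:
    """Read Context Graph settings, applying the documented defaults.

    `env` defaults to os.environ; pass a plain dict for testing.
    """
    env = os.environ if env is None else env
    return {
        "dgraph_grpc_url": env.get("DGRAPH_GRPC_URL", "localhost:9080"),
        "dgraph_http_url": env.get("DGRAPH_HTTP_URL", "localhost:8080"),
        # Secret-sourced values have no default; missing means "not configured".
        "pinecone_api_key": env.get("PINECONE_API_KEY"),
        "pinecone_environment": env.get("PINECONE_ENVIRONMENT", "us-east-1-aws"),
        "pinecone_index_prefix": env.get("PINECONE_INDEX_PREFIX", "matih-context"),
        # Bi-temporal store falls back to the service-wide DATABASE_URL.
        "bitemporal_database_url": env.get("BITEMPORAL_DATABASE_URL",
                                           env.get("DATABASE_URL")),
        "kafka_bootstrap_servers": env.get("KAFKA_BOOTSTRAP_SERVERS",
                                           "localhost:29092"),
    }
```

Reading every variable in one place makes it easy to fail fast at startup if a required secret (such as `PINECONE_API_KEY`) is absent.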