Structural Search
Structural search finds entities based on their position and relationships within the knowledge graph rather than their semantic content. It uses the graph store to discover entities by their connections, lineage paths, and structural patterns.
Overview
While semantic search answers "what entities have similar descriptions?", structural search answers "what entities are connected in similar ways?". This is powered by the ContextGraphStore (Dgraph) and graph traversal algorithms.
Search Patterns
| Pattern | Description | Example |
|---|---|---|
| Neighbor search | Find entities directly connected to a source entity | "What datasets feed into this model?" |
| Lineage traversal | Follow upstream or downstream lineage paths | "What is the full data pipeline for this dashboard?" |
| Structural similarity | Find entities with similar graph neighborhoods | "Find models with similar dependency structures" |
| Relationship filtering | Search by specific relationship types | "Find all TRAINED_ON relationships for this model" |
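The neighbor, lineage, and relationship-filtering patterns in the table can all be read as a bounded breadth-first traversal over typed edges. The sketch below illustrates that idea on an in-memory edge list. It is not the ContextGraphStore implementation: the `Edges` format, the `neighbors_within` helper, and the sample URNs are hypothetical stand-ins.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

# Hypothetical in-memory stand-in for the graph store:
# source URN -> list of (relationship_type, target URN).
Edges = Dict[str, List[Tuple[str, str]]]


def neighbors_within(
    edges: Edges,
    source: str,
    max_hops: int,
    relationship_types: Optional[List[str]] = None,
) -> Dict[str, int]:
    """Return {urn: hop_count} for entities reachable within max_hops."""
    allowed = set(relationship_types) if relationship_types else None
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for rel, target in edges.get(node, []):
            if allowed is not None and rel not in allowed:
                continue  # relationship filtering
            if target not in distances:
                distances[target] = distances[node] + 1
                queue.append(target)
    del distances[source]  # the source itself is not a result
    return distances


# Illustrative data only; these URNs and edges are made up.
edges: Edges = {
    "urn:model:churn": [("TRAINED_ON", "urn:dataset:features")],
    "urn:dataset:features": [
        ("DERIVED_FROM", "urn:dataset:sales"),
        ("OWNED_BY", "urn:user:alice"),
    ],
    "urn:dataset:sales": [("DERIVED_FROM", "urn:dataset:raw_orders")],
}

found = neighbors_within(
    edges, "urn:model:churn", max_hops=2,
    relationship_types=["DERIVED_FROM", "TRAINED_ON"],
)
```

With `max_hops=2`, the traversal reaches the feature dataset and the sales dataset but stops before `urn:dataset:raw_orders`, and the `OWNED_BY` edge is skipped by the relationship filter.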
Neighbor Search
Find entities within a specified number of hops from a source entity:
results = await service.search(SearchQuery(
    # For structural search, the query is the URN of the source entity.
    query="urn:matih:dataset:acme:sales_data",
    tenant_id="acme",
    mode=SearchMode.STRUCTURAL,
    scope=SearchScope.RELATIONSHIPS,
    filters=SearchFilters(
        max_hops=2,  # include entities up to two hops from the source
        relationship_types=["DERIVED_FROM", "TRAINED_ON"],
    ),
))

Lineage-Based Search
Structural search integrates with the graph traversal engine for lineage queries:
from context_graph.services.graph_traversal import (
    GraphTraversalEngine,
    TraversalDirection,
)

engine = get_traversal_engine()
result = await engine.traverse(
    start_urn="urn:matih:model:acme:churn_predictor",
    tenant_id="acme",
    direction=TraversalDirection.UPSTREAM,  # follow lineage upstream from the model
    max_depth=5,  # limit the traversal depth to five
)

Structural Embeddings
For structural similarity search, the Context Graph generates graph structure embeddings using node2vec:
- Dimension: 128
- Method: Random walks on the entity graph, then Word2Vec on walk sequences
- Namespace: ENTITY_STRUCTURAL
Entities with similar graph neighborhoods will have similar structural embeddings, even if their descriptions differ.
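To make the walk-generation step of the method above concrete, here is a minimal sketch of node2vec-style biased random walks using only the standard library. The adjacency-list format, the function names, and the `p`/`q` defaults are assumptions for illustration, and the Context Graph's actual walk parameters are not stated here. The resulting walk sequences would then be passed to a Word2Vec implementation configured for 128-dimensional vectors.

```python
import random
from typing import Dict, List, Optional

# Hypothetical toy format: undirected adjacency list keyed by entity id.
Graph = Dict[str, List[str]]


def node2vec_walk(
    graph: Graph,
    start: str,
    length: int,
    p: float = 1.0,
    q: float = 1.0,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Generate one biased walk of at most `length` nodes."""
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        current = walk[-1]
        neighbors = graph.get(current, [])
        if not neighbors:
            break  # dead end
        if len(walk) == 1:
            walk.append(rng.choice(neighbors))  # first step is uniform
            continue
        previous = walk[-2]
        prev_neighbors = set(graph.get(previous, []))
        weights = []
        for candidate in neighbors:
            if candidate == previous:
                weights.append(1.0 / p)  # return to the previous node
            elif candidate in prev_neighbors:
                weights.append(1.0)      # stay close to the previous node
            else:
                weights.append(1.0 / q)  # move further away
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


def generate_walks(
    graph: Graph,
    walks_per_node: int,
    length: int,
    p: float = 1.0,
    q: float = 1.0,
    seed: int = 0,
) -> List[List[str]]:
    """Start `walks_per_node` walks from every node, in shuffled order."""
    rng = random.Random(seed)
    nodes = list(graph)
    walks = []
    for _ in range(walks_per_node):
        rng.shuffle(nodes)
        for node in nodes:
            walks.append(node2vec_walk(graph, node, length, p, q, rng))
    return walks


# Illustrative graph only.
toy_graph: Graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}
walks = generate_walks(toy_graph, walks_per_node=3, length=5)
```

Setting `p` and `q` to 1.0 reduces the walk to a uniform random walk. Lower `q` favors exploring outward, while lower `p` favors revisiting the previous node.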
Relationship Categories
| Category | Relationship Types | Description |
|---|---|---|
| Data Lineage | DERIVED_FROM, PRODUCED_BY, CONSUMED_BY | Data flow relationships |
| Model Lineage | TRAINED_ON, EVALUATED_WITH, DEPLOYED_AS | ML lifecycle relationships |
| Ownership | OWNED_BY, MANAGED_BY, CREATED_BY | Responsibility relationships |
| Composition | PART_OF, CONTAINS, MEMBER_OF | Structural composition |
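One way to use these categories in practice is to expand them into the `relationship_types` list that a structural search filter accepts. The mapping below copies the table, and the `relationship_types_for` helper is a hypothetical convenience, not part of the documented API.

```python
from typing import Dict, List

# Category names and relationship types copied from the table above.
RELATIONSHIP_CATEGORIES: Dict[str, List[str]] = {
    "Data Lineage": ["DERIVED_FROM", "PRODUCED_BY", "CONSUMED_BY"],
    "Model Lineage": ["TRAINED_ON", "EVALUATED_WITH", "DEPLOYED_AS"],
    "Ownership": ["OWNED_BY", "MANAGED_BY", "CREATED_BY"],
    "Composition": ["PART_OF", "CONTAINS", "MEMBER_OF"],
}


def relationship_types_for(*categories: str) -> List[str]:
    """Flatten one or more categories into a list of relationship types.

    Raises KeyError if a category name is not in the table.
    """
    types: List[str] = []
    for category in categories:
        types.extend(RELATIONSHIP_CATEGORIES[category])
    return types


# All lineage relationships, suitable for a relationship_types filter.
lineage_types = relationship_types_for("Data Lineage", "Model Lineage")
```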
Combining with Semantic Search
Structural search results can be combined with semantic search using the HYBRID mode, which merges both result sets using configurable weights:
results = await service.search(SearchQuery(
    query="datasets similar to customer_events with lineage to ML models",
    tenant_id="acme",
    mode=SearchMode.HYBRID,
    top_k=10,  # return the ten best merged results
))

The Hybrid Store performs the merge, with default weights of 0.6 for semantic scores and 0.4 for structural scores.
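As a rough illustration of weighted merging, the sketch below combines two score maps with the default 0.6/0.4 weights. It assumes both score sets are already normalized to a comparable range and that an entity missing from one set contributes a score of 0 for that set. The Hybrid Store's actual merge logic may differ, and `merge_scores` is a hypothetical name.

```python
from typing import Dict, List, Tuple


def merge_scores(
    semantic: Dict[str, float],
    structural: Dict[str, float],
    semantic_weight: float = 0.6,
    structural_weight: float = 0.4,
    top_k: int = 10,
) -> List[Tuple[str, float]]:
    """Combine per-entity scores as a weighted sum and return the top_k."""
    combined: Dict[str, float] = {}
    for urn in set(semantic) | set(structural):
        combined[urn] = (
            semantic_weight * semantic.get(urn, 0.0)
            + structural_weight * structural.get(urn, 0.0)
        )
    # Sort by descending score, breaking ties by URN for stable output.
    ranked = sorted(combined.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


# Illustrative scores only.
merged = merge_scores(
    semantic={"a": 1.0, "b": 0.5},
    structural={"b": 1.0, "c": 1.0},
)
```

In this example, entity `b` appears in both result sets and ranks first, ahead of `a` (semantic only) and `c` (structural only).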