Entity Suggestions

The EntitySuggestionService provides auto-completion and typeahead suggestions for entity names, tags, properties, relationship types, and search queries. It uses a trie, populated from graph data, for fast prefix lookups, and combines those results with fuzzy matching and semantic suggestions.

Overview

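Entity suggestions power the search-as-you-type experience in the data catalog and agent workbenches. The service returns ranked suggestions by combining prefix matching with context-aware relevance scoring. Latency depends on the source: trie prefix matches return in under 1ms, while semantic matches take 50-100ms (see Suggestion Sources).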

Source: data-plane/ai-service/src/context_graph/services/entity_suggestion_service.py

Suggestion Types

Type	Description	Example
`ENTITY`	Entity names and URNs	"cust..." suggests "customer_events", "customer_features"
`TAG`	Entity tags	"pii..." suggests "pii-sensitive", "pii-masked"
`PROPERTY`	Entity property names	"sche..." suggests "schema_version", "schema_type"
`RELATIONSHIP`	Relationship type names	"DERI..." suggests "DERIVED_FROM"
`QUERY`	Past search queries	"show me..." suggests popular completions
`RECENT`	Recently accessed entities	Entities the user recently viewed
`POPULAR`	Most-accessed entities	Entities with highest access counts

Suggestion Sources

Source	Method	Latency
`PREFIX_MATCH`	Trie-based O(k) prefix lookup	Under 1ms
`FUZZY_MATCH`	Edit distance matching	2-5ms
`HISTORY`	User search history	Under 1ms
`POPULARITY`	Access frequency ranking	Under 1ms
`CONTEXTUAL`	Current entity context	5-10ms
`SEMANTIC`	Embedding similarity	50-100ms
`GRAPH_STORE`	Direct graph query	10-30ms

Configuration

from context_graph.services.entity_suggestion_service import SuggestionConfig

config = SuggestionConfig(
    max_suggestions=10,       # Maximum suggestions per request
    min_prefix_length=2,      # Minimum prefix length to trigger
    fuzzy_threshold=0.7,      # Minimum fuzzy match score
    history_weight=0.3,       # Weight for history-based suggestions
    popularity_weight=0.2,    # Weight for popularity-based suggestions
    recency_boost=0.1,        # Score boost for recently accessed entities
)
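
To illustrate how a setting like fuzzy_threshold might gate FUZZY_MATCH candidates, here is a minimal sketch. It assumes the fuzzy score is a Levenshtein edit distance normalized by the longer string's length; the service's actual scoring function is not documented here, and the helper names (levenshtein, fuzzy_score, fuzzy_matches) are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def fuzzy_score(query: str, candidate: str) -> float:
    """Similarity in [0, 1]; 1.0 means the strings are identical (assumed normalization)."""
    longest = max(len(query), len(candidate))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(query, candidate) / longest


def fuzzy_matches(query: str, candidates: list[str], threshold: float = 0.7):
    """Return (candidate, score) pairs at or above the threshold, best first."""
    scored = [(c, fuzzy_score(query, c)) for c in candidates]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)
```

Under this normalization, a single missing character in a long name still scores well above 0.7, while unrelated names fall below the threshold and are dropped.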

API Usage

REST Endpoint

GET /api/v1/context-graph/suggest?q=cust&tenant_id=acme&types=entity,tag&limit=10
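
As a rough sketch of how a client might assemble this request URL, the snippet below uses only the Python standard library. The host catalog.example.com and the helper name build_suggest_url are placeholders for illustration; authentication and response handling are not covered by this page and are omitted.

```python
from urllib.parse import urlencode

BASE_URL = "https://catalog.example.com"  # hypothetical host, replace with your deployment


def build_suggest_url(prefix: str, tenant_id: str, types: list[str], limit: int = 10) -> str:
    """Build the suggest endpoint URL with the query parameters shown above."""
    query = urlencode(
        {
            "q": prefix,
            "tenant_id": tenant_id,
            "types": ",".join(types),  # comma-separated list, as in the example request
            "limit": limit,
        },
        safe=",",  # keep the comma literal instead of encoding it as %2C
    )
    return f"{BASE_URL}/api/v1/context-graph/suggest?{query}"
```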

Python API

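# Assumption: these names are exported by the same module as SuggestionConfig.
from context_graph.services.entity_suggestion_service import (
    SuggestionRequest,
    SuggestionType,
    get_entity_suggestion_service,
)

# suggest() is awaited, so this must run inside an async function.
service = get_entity_suggestion_service()

suggestions = await service.suggest(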
    SuggestionRequest(
        prefix="cust",
        tenant_id="acme",
        suggestion_types=[SuggestionType.ENTITY, SuggestionType.TAG],
        max_results=10,
        context_entity_urn="urn:matih:dataset:acme:orders",
    ),
)

Ranking Algorithm

Suggestions are ranked by a composite score:

score = prefix_match_score * 0.4
      + popularity_score * 0.2
      + recency_score * 0.1
      + context_relevance * 0.2
      + fuzzy_bonus * 0.1

When a context_entity_urn is provided, entities that are graph-neighbors of the context entity receive a context relevance boost.
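
The composite score and the context boost can be sketched as follows. The weights come from the formula above; everything else is an assumption for illustration: that each signal is normalized to [0, 1], that the context boost is binary (1.0 for a graph neighbor, 0.0 otherwise), and the names Signals, composite_score, and context_relevance are hypothetical.

```python
from dataclasses import dataclass

# Weights taken from the ranking formula above.
WEIGHTS = {
    "prefix_match_score": 0.4,
    "popularity_score": 0.2,
    "recency_score": 0.1,
    "context_relevance": 0.2,
    "fuzzy_bonus": 0.1,
}


@dataclass
class Signals:
    """Per-candidate signals, each assumed to be normalized to [0, 1]."""
    prefix_match_score: float
    popularity_score: float
    recency_score: float
    context_relevance: float
    fuzzy_bonus: float


def composite_score(signals: Signals) -> float:
    """Weighted sum of the five signals."""
    return sum(weight * getattr(signals, name) for name, weight in WEIGHTS.items())


def context_relevance(candidate_urn: str, neighbor_urns: set[str]) -> float:
    """Assumed binary boost: 1.0 if the candidate neighbors the context entity."""
    return 1.0 if candidate_urn in neighbor_urns else 0.0
```

Because the weights sum to 1.0, a candidate that maxes out every signal scores 1.0, and an exact prefix match with no other signals scores 0.4.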

Trie Data Structure

The suggestion service maintains an in-memory trie populated from the graph store on startup and refreshed periodically. The trie provides O(k) prefix lookups where k is the prefix length, enabling sub-millisecond suggestion latency for prefix matches.
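
A minimal trie sketch shows where the O(k) bound comes from. This is an illustrative implementation, not the service's own: the class and method names are hypothetical, and it omits the ranking, tenant scoping, and periodic refresh described above.

```python
class TrieNode:
    __slots__ = ("children", "is_terminal")

    def __init__(self):
        self.children = {}        # maps one character to the child node
        self.is_terminal = False  # True if a stored name ends at this node


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word: str) -> None:
        """Add a name, creating one node per character as needed."""
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_terminal = True

    def suggest(self, prefix: str, limit: int = 10) -> list[str]:
        """Return up to `limit` stored names starting with `prefix`, in sorted order."""
        # Walk down the prefix: k dictionary lookups for a prefix of length k.
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        # Depth-first traversal of the subtree below the prefix node.
        results = []
        stack = [(node, prefix)]
        while stack and len(results) < limit:
            current, text = stack.pop()
            if current.is_terminal:
                results.append(text)
            # Push in reverse order so children are popped alphabetically.
            for ch in sorted(current.children, reverse=True):
                stack.append((current.children[ch], text + ch))
        return results
```

Locating the prefix node takes k steps regardless of how many names are stored. Collecting completions adds work proportional to the part of the subtree visited, which is why capping results with a limit keeps lookups fast.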
