MATIH Platform is in active MVP development. Documentation reflects current implementation status.

Feedback Scoring

The FeedbackScoringService provides multi-factor quality assessment for agent traces using both explicit user feedback and implicit behavioral signals. It computes composite scores for correctness, completeness, and efficiency to drive reinforcement learning, pattern quality ranking, and performance monitoring.


Overview

Feedback scoring closes the loop between agent execution and continuous improvement. By capturing both explicit signals (user ratings, corrections) and implicit signals (retries, downstream usage), the service builds a comprehensive quality picture for each trace.

Source: data-plane/ai-service/src/context_graph/services/feedback_scoring_service.py


Feedback Types

| Type | Description | Example |
|---|---|---|
| EXPLICIT | Direct user feedback | Thumbs up/down rating |
| IMPLICIT | Inferred from user behavior | User retried the query |
| COMPUTED | Calculated from metrics | Efficiency score from step count |
| AUTOMATED | System-generated | Pattern match confirmation |

Feedback Sources

| Source | Signal | Score Range |
|---|---|---|
| USER_RATING | User gave a rating | -1 to 1 |
| USER_CORRECTION | User corrected the result | -0.5 to 0 |
| USER_RETRY | User retried the task | -0.3 to 0 |
| USER_ABANDON | User abandoned the task | -1 to -0.5 |
| DOWNSTREAM_USE | Result was used downstream | 0.5 to 1 |
| OUTCOME_SUCCESS | Task achieved its goal | 0.5 to 1 |
| OUTCOME_FAILURE | Task failed | -1 to -0.5 |
| PATTERN_MATCH | Matched an expected pattern | 0 to 0.5 |
| EFFICIENCY | Efficiency computation | 0 to 1 |
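
The score ranges above can be read as per-source bounds on a raw signal. The following is a minimal sketch of that idea, not the service's actual code: the source names and ranges are copied from the table, while the enum values, the `SCORE_RANGES` mapping, and the `clamp_score` helper are hypothetical names introduced for illustration.

```python
from enum import Enum


class FeedbackSource(Enum):
    """Source names from the table; the string values are assumptions."""
    USER_RATING = "user_rating"
    USER_CORRECTION = "user_correction"
    USER_RETRY = "user_retry"
    USER_ABANDON = "user_abandon"
    DOWNSTREAM_USE = "downstream_use"
    OUTCOME_SUCCESS = "outcome_success"
    OUTCOME_FAILURE = "outcome_failure"
    PATTERN_MATCH = "pattern_match"
    EFFICIENCY = "efficiency"


# (low, high) bounds per source, copied from the Feedback Sources table.
SCORE_RANGES = {
    FeedbackSource.USER_RATING: (-1.0, 1.0),
    FeedbackSource.USER_CORRECTION: (-0.5, 0.0),
    FeedbackSource.USER_RETRY: (-0.3, 0.0),
    FeedbackSource.USER_ABANDON: (-1.0, -0.5),
    FeedbackSource.DOWNSTREAM_USE: (0.5, 1.0),
    FeedbackSource.OUTCOME_SUCCESS: (0.5, 1.0),
    FeedbackSource.OUTCOME_FAILURE: (-1.0, -0.5),
    FeedbackSource.PATTERN_MATCH: (0.0, 0.5),
    FeedbackSource.EFFICIENCY: (0.0, 1.0),
}


def clamp_score(source: FeedbackSource, score: float) -> float:
    """Hypothetical helper: clip a raw score into the range allowed for its source."""
    low, high = SCORE_RANGES[source]
    return max(low, min(high, score))
```

With these bounds, a retry can never pull a trace down by more than 0.3, whereas an abandoned task always contributes at least -0.5.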

Composite Trace Score

Each trace receives a composite score with three dimensions:

| Dimension | Description | How Computed |
|---|---|---|
| Correctness | Did the trace achieve its goal? | Weighted average of outcome and user feedback signals |
| Completeness | Did it follow the expected pattern? | Pattern match score from pattern mining |
| Efficiency | Was it optimal? | Inverse of step count relative to pattern average |
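
The sketch below shows one way the three dimensions could be combined into a single record. It is illustrative only: `TraceScore`, `efficiency_score`, `correctness_score`, and `score_trace` are hypothetical names, and the exact efficiency formula (pattern average divided by the trace's step count, capped at 1.0) is an assumption consistent with the description above rather than the service's confirmed implementation.

```python
from dataclasses import dataclass


@dataclass
class TraceScore:
    """Hypothetical container for the three score dimensions."""
    correctness: float
    completeness: float
    efficiency: float


def efficiency_score(step_count: int, pattern_avg_steps: float) -> float:
    """Assumed formula: pattern average over actual step count, capped at 1.0.

    A trace that uses twice the average number of steps scores 0.5;
    a trace at or below the average scores 1.0.
    """
    if step_count <= 0:
        return 0.0
    return min(1.0, pattern_avg_steps / step_count)


def correctness_score(signals: list[tuple[float, float]]) -> float:
    """Weighted average over (score, weight) pairs from outcome and user feedback."""
    total_weight = sum(weight for _, weight in signals)
    if total_weight == 0:
        return 0.0
    return sum(score * weight for score, weight in signals) / total_weight


def score_trace(
    signals: list[tuple[float, float]],
    pattern_match_score: float,
    step_count: int,
    pattern_avg_steps: float,
) -> TraceScore:
    """Assemble the composite score; completeness is the pattern match score as given."""
    return TraceScore(
        correctness=correctness_score(signals),
        completeness=pattern_match_score,
        efficiency=efficiency_score(step_count, pattern_avg_steps),
    )


# Example: two equally weighted correctness signals, a 10-step trace
# against a pattern that averages 5 steps.
example = score_trace([(1.0, 1.0), (0.5, 1.0)], 0.4, 10, 5.0)
```

Keeping the dimensions separate rather than collapsing them into one number lets a consumer such as a dashboard or ranking component weight them differently.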

Feedback Records

record = FeedbackRecord(
    trace_urn="urn:matih:trace:acme:trace-123",
    tenant_id="acme",
    feedback_type=FeedbackType.EXPLICIT,
    feedback_source=FeedbackSource.USER_RATING,
    score=0.8,
    confidence=1.0,
    actor_urn="urn:matih:user:acme:analyst-1",
    comment="Accurate results, good visualization",
)

Score Aggregation

When multiple feedback signals exist for a trace, they are aggregated using confidence-weighted averaging:

aggregate_score = sum(score_i * confidence_i) / sum(confidence_i)

More recent feedback is weighted higher using a temporal decay function.
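
As a rough illustration of how the confidence weighting and the temporal decay could be combined, the sketch below multiplies each record's confidence by an exponential decay factor before averaging. The function name `aggregate_score`, the tuple layout, the exponential form of the decay, and the 30-day half-life are all assumptions; the source states only that a temporal decay function is used.

```python
from datetime import datetime, timedelta, timezone


def aggregate_score(records, now, half_life_days=30.0):
    """Confidence-weighted average with an assumed exponential time decay.

    records: iterable of (score, confidence, created_at) tuples.
    Each record's weight is confidence * 0.5 ** (age_days / half_life_days),
    so feedback that is one half-life old counts half as much.
    Returns None when there is no usable feedback.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for score, confidence, created_at in records:
        age_days = max(0.0, (now - created_at).total_seconds() / 86400.0)
        weight = confidence * 0.5 ** (age_days / half_life_days)
        weighted_sum += score * weight
        weight_total += weight
    if weight_total == 0:
        return None
    return weighted_sum / weight_total


# Example: a fresh positive rating and a 30-day-old negative one.
now = datetime(2024, 1, 31, tzinfo=timezone.utc)
records = [
    (1.0, 1.0, now),                        # weight 1.0
    (-1.0, 1.0, now - timedelta(days=30)),  # weight 0.5 after one half-life
]
result = aggregate_score(records, now)  # (1.0 - 0.5) / 1.5
```

With a very large half-life the decay factor approaches 1 for every record and the result reduces to the plain confidence-weighted average given in the formula above.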


Use Cases

| Use Case | Description |
|---|---|
| Reinforcement learning | Feedback scores drive agent improvement signals |
| Pattern quality ranking | Higher-scored patterns are preferred for future routing |
| Agent performance dashboards | Track agent quality trends over time |
| User satisfaction tracking | Monitor explicit user satisfaction per tenant |
| A/B testing | Compare feedback scores between agent configurations |

Integration Points

| Component | Integration |
|---|---|
| Agent Orchestrator | Receives outcome signals after execution |
| Pattern Mining | Pattern match scores from discovered patterns |
| Decision Ranking | Outcome confidence from feedback signals |
| Analytics API | Feedback data accessible via analytics endpoints |