MATIH Platform is in active MVP development. Documentation reflects current implementation status.

Feedback and RLHF

The Feedback integration lets users rate AI responses, provide corrections, and submit detailed feedback. Feedback events flow through Kafka to a processing pipeline that generates learning signals for reinforcement learning from human feedback (RLHF), used for model improvement, and aggregates metrics for quality tracking.


Feedback Architecture

The feedback system is organized into four layers:

| Layer | Component | Location |
|---|---|---|
| Collection | Multi-source collectors | src/feedback/collectors/ |
| Transport | Kafka event streaming | src/feedback/integration/kafka/ |
| Processing | Feedback pipeline | src/feedback/pipeline/ |
| Learning | RLHF signal generation | src/feedback/learning/ |

Feedback Types

| Type | Trigger | Data Collected |
|---|---|---|
| thumbs_up | User clicks approve | Response ID, session context |
| thumbs_down | User clicks reject | Response ID, session context |
| correction | User edits SQL or response | Original, corrected version, diff |
| rating | User assigns 1-5 stars | Numeric score, optional comment |
| implicit_accept | User uses the generated SQL | Query execution event |
| implicit_reject | User reformulates question | Follow-up message analysis |
| escalation | User requests human help | Session context, frustration signals |

Feedback Event Schema

{
  "event_id": "fb-abc123",
  "event_type": "feedback.submitted",
  "tenant_id": "acme-corp",
  "user_id": "user-456",
  "session_id": "sess-xyz789",
  "response_id": "resp-001",
  "feedback_type": "correction",
  "feedback_source": "explicit",
  "data": {
    "original_sql": "SELECT * FROM sales",
    "corrected_sql": "SELECT * FROM sales WHERE region = 'US'",
    "comment": "Need to filter by US region"
  },
  "timestamp": "2025-03-15T10:05:00Z"
}
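
As an illustration of how a producer might build and check an event of this shape before publishing it, here is a minimal Python sketch. The `FeedbackEvent` class and its validation are hypothetical and not taken from the MATIH codebase. The field names mirror the sample event above, and the allowed `feedback_type` values come from the Feedback Types table.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone

# Feedback types listed in the Feedback Types table.
FEEDBACK_TYPES = {
    "thumbs_up", "thumbs_down", "correction", "rating",
    "implicit_accept", "implicit_reject", "escalation",
}


@dataclass
class FeedbackEvent:
    """Hypothetical container mirroring the sample event's fields."""
    event_id: str
    tenant_id: str
    user_id: str
    session_id: str
    response_id: str
    feedback_type: str
    feedback_source: str  # "explicit" in the sample; other values are not documented here
    data: dict = field(default_factory=dict)
    event_type: str = "feedback.submitted"
    timestamp: str = ""

    def __post_init__(self):
        # Reject types that are not in the documented table.
        if self.feedback_type not in FEEDBACK_TYPES:
            raise ValueError(f"unknown feedback_type: {self.feedback_type}")
        # Default to the current UTC time in the same format as the sample.
        if not self.timestamp:
            now = datetime.now(timezone.utc)
            self.timestamp = now.strftime("%Y-%m-%dT%H:%M:%SZ")

    def to_json(self) -> str:
        """Serialize the event to a JSON string."""
        return json.dumps(asdict(self))
```

Validating the type at construction time keeps malformed events out of the Kafka topic, which is one possible design choice; the actual service may validate elsewhere.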

Collection API

Feedback is collected through REST endpoints in src/feedback/api/:

POST /api/v1/feedback/submit
POST /api/v1/feedback/correction
POST /api/v1/feedback/rating
GET  /api/v1/feedback/summary?session_id=...
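
To show how a client might call one of these endpoints, the sketch below builds a POST request for the rating endpoint using only the Python standard library. The base URL, the request body fields (`response_id`, `session_id`, `rating`, `comment`), and the absence of authentication headers are all assumptions made for illustration; the request schema is not documented here.

```python
import json
import urllib.request

# Hypothetical host; replace with the address of the AI service.
BASE_URL = "https://matih.example.com"


def build_feedback_request(path, payload, base_url=BASE_URL):
    """Build (but do not send) a JSON POST request for a feedback endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=base_url + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed body shape for a 1-5 star rating with an optional comment.
request = build_feedback_request(
    "/api/v1/feedback/rating",
    {
        "response_id": "resp-001",
        "session_id": "sess-xyz789",
        "rating": 4,
        "comment": "Mostly correct",
    },
)

# Sending is left to the caller, for example:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```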

Processing Pipeline

The feedback pipeline processes raw events into actionable learning signals:

  1. Deduplication: Filters duplicate feedback events within a time window
  2. Enrichment: Attaches session context, query metadata, and user profile
  3. Classification: Categorizes feedback by type and severity
  4. Signal Generation: Produces learning signals for model fine-tuning
  5. Aggregation: Computes quality scores per agent, query type, and schema

Learning Signals

Learning signals are published to the ai-learning-signals Kafka topic:

{
  "signal_type": "sql_correction",
  "input": "What were sales last month?",
  "expected_output": "SELECT * FROM sales WHERE date >= ...",
  "actual_output": "SELECT * FROM sales",
  "reward": -0.5,
  "context": {
    "schema_tables": ["sales"],
    "tenant_id": "acme-corp"
  }
}
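
As a rough sketch of signal generation, the function below maps a `correction` feedback event onto the signal shape shown above. The function name, the fixed reward of -0.5 (copied from the sample), and the idea that the user question and schema tables arrive from the enrichment step are all assumptions; the actual reward scheme is not documented here.

```python
def correction_to_signal(event, question, schema_tables, reward=-0.5):
    """Build an sql_correction learning signal from a correction event.

    `question` and `schema_tables` are not part of the feedback event itself;
    this sketch assumes the enrichment step supplies them.
    """
    if event["feedback_type"] != "correction":
        raise ValueError("expected a correction feedback event")
    data = event["data"]
    return {
        "signal_type": "sql_correction",
        "input": question,
        "expected_output": data["corrected_sql"],  # what the user wanted
        "actual_output": data["original_sql"],     # what the model produced
        "reward": reward,                          # assumed fixed penalty
        "context": {
            "schema_tables": list(schema_tables),
            "tenant_id": event["tenant_id"],
        },
    }
```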

Quality Tracking

Feedback metrics are aggregated and exposed for monitoring:

| Metric | Description |
|---|---|
| feedback_positive_rate | Percentage of positive feedback |
| feedback_correction_rate | Percentage of responses requiring correction |
| feedback_response_time | Average time between response and feedback |
| feedback_escalation_rate | Percentage of sessions escalated to human |
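
One way these rates could be computed from a batch of feedback events is sketched below. The definitions are assumptions: "positive" is taken to mean `thumbs_up` or `implicit_accept`, and the denominators only cover responses and sessions that received some feedback, because the events alone do not reveal the total number of responses. `feedback_response_time` is omitted since it needs response timestamps that are not part of the feedback event.

```python
# Assumed set of feedback types that count as positive.
POSITIVE_TYPES = {"thumbs_up", "implicit_accept"}


def percentage(part, whole):
    """Return part/whole as a percentage, or 0.0 when whole is zero."""
    return 100.0 * part / whole if whole else 0.0


def quality_metrics(events):
    """Compute three of the feedback rates from a list of feedback events."""
    positive = sum(1 for e in events if e["feedback_type"] in POSITIVE_TYPES)

    responses = {e["response_id"] for e in events}
    corrected = {
        e["response_id"] for e in events if e["feedback_type"] == "correction"
    }

    sessions = {e["session_id"] for e in events}
    escalated = {
        e["session_id"] for e in events if e["feedback_type"] == "escalation"
    }

    return {
        "feedback_positive_rate": percentage(positive, len(events)),
        "feedback_correction_rate": percentage(len(corrected), len(responses)),
        "feedback_escalation_rate": percentage(len(escalated), len(sessions)),
    }
```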

Insights

The feedback insights module in src/feedback/insights/ generates periodic reports identifying:

  • Most common correction patterns by schema table
  • Agent types with lowest satisfaction scores
  • Query categories with highest escalation rates
  • Trending feedback themes across tenants