Feedback and RLHF
The Feedback integration lets users rate AI responses, provide corrections, and submit detailed comments, all of which feed a reinforcement learning from human feedback (RLHF) pipeline. Feedback events flow through Kafka to a processing pipeline that generates learning signals for model improvement and quality tracking.
Feedback Architecture
The feedback system is organized into four layers:
| Layer | Component | Location |
|---|---|---|
| Collection | Multi-source collectors | src/feedback/collectors/ |
| Transport | Kafka event streaming | src/feedback/integration/kafka/ |
| Processing | Feedback pipeline | src/feedback/pipeline/ |
| Learning | RLHF signal generation | src/feedback/learning/ |
Feedback Types
| Type | Trigger | Data Collected |
|---|---|---|
| thumbs_up | User clicks approve | Response ID, session context |
| thumbs_down | User clicks reject | Response ID, session context |
| correction | User edits SQL or response | Original, corrected version, diff |
| rating | User assigns 1-5 stars | Numeric score, optional comment |
| implicit_accept | User runs the generated SQL | Query execution event |
| implicit_reject | User reformulates question | Follow-up message analysis |
| escalation | User requests human help | Session context, frustration signals |
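The feedback types in the table above can be modeled as an enumeration. The following is a minimal sketch, not the project's actual code: the names `FeedbackType` and `feedback_source` are hypothetical, and the rule that only types prefixed with `implicit_` count as implicit is an assumption drawn from the type names.

```python
from enum import Enum


class FeedbackType(str, Enum):
    """The seven feedback types listed in the table above."""

    THUMBS_UP = "thumbs_up"
    THUMBS_DOWN = "thumbs_down"
    CORRECTION = "correction"
    RATING = "rating"
    IMPLICIT_ACCEPT = "implicit_accept"
    IMPLICIT_REJECT = "implicit_reject"
    ESCALATION = "escalation"


def feedback_source(feedback_type: FeedbackType) -> str:
    """Return "implicit" or "explicit" for a feedback type.

    Assumption: types whose name starts with "implicit_" are inferred from
    user behavior; every other type is treated as an explicit user action.
    """
    if feedback_type.value.startswith("implicit_"):
        return "implicit"
    return "explicit"
```

Because `FeedbackType` subclasses `str`, a raw string from an incoming payload can be parsed with `FeedbackType("correction")`, and an unknown value raises `ValueError`.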
Feedback Event Schema
{
  "event_id": "fb-abc123",
  "event_type": "feedback.submitted",
  "tenant_id": "acme-corp",
  "user_id": "user-456",
  "session_id": "sess-xyz789",
  "response_id": "resp-001",
  "feedback_type": "correction",
  "feedback_source": "explicit",
  "data": {
    "original_sql": "SELECT * FROM sales",
    "corrected_sql": "SELECT * FROM sales WHERE region = 'US'",
    "comment": "Need to filter by US region"
  },
  "timestamp": "2025-03-15T10:05:00Z"
}
Collection API
Feedback is collected through REST endpoints in src/feedback/api/:
POST /api/v1/feedback/submit
POST /api/v1/feedback/correction
POST /api/v1/feedback/rating
GET /api/v1/feedback/summary?session_id=...
Processing Pipeline
The feedback pipeline processes raw events into actionable learning signals:
- Deduplication: Filters duplicate feedback events within a time window
- Enrichment: Attaches session context, query metadata, and user profile
- Classification: Categorizes feedback by type and severity
- Signal Generation: Produces learning signals for model fine-tuning
- Aggregation: Computes quality scores per agent, query type, and schema
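Two of the stages listed above, deduplication and signal generation, can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline's real implementation: the class and function names, the five-minute window, and the deduplication key (user, response, feedback type) are all hypothetical. The signal fields and the reward of -0.5 follow the learning signal example elsewhere in this document.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class FeedbackEvent:
    # A subset of the event schema fields, flattened for brevity.
    event_id: str
    user_id: str
    response_id: str
    feedback_type: str
    timestamp: datetime
    original_sql: str = ""
    corrected_sql: str = ""


class Deduplicator:
    """Flags a repeat of (user, response, type) seen within `window`."""

    def __init__(self, window: timedelta = timedelta(minutes=5)) -> None:
        self.window = window  # hypothetical default
        self._last_seen: dict[tuple[str, str, str], datetime] = {}

    def is_duplicate(self, event: FeedbackEvent) -> bool:
        key = (event.user_id, event.response_id, event.feedback_type)
        previous = self._last_seen.get(key)
        self._last_seen[key] = event.timestamp
        if previous is None:
            return False
        # abs() tolerates events that arrive slightly out of order.
        return abs(event.timestamp - previous) <= self.window


def correction_signal(event: FeedbackEvent, question: str) -> Optional[dict]:
    """Build a learning signal for a correction; other types yield None here."""
    if event.feedback_type != "correction":
        return None
    return {
        "signal_type": "sql_correction",
        "input": question,
        "expected_output": event.corrected_sql,
        "actual_output": event.original_sql,
        # Value copied from the documented example; the real reward
        # function is not described in this document.
        "reward": -0.5,
    }


def process(
    events: list[FeedbackEvent],
    questions: dict[str, str],
    dedup: Deduplicator,
) -> list[dict]:
    """Deduplicate events in timestamp order, then emit correction signals."""
    signals = []
    for event in sorted(events, key=lambda e: e.timestamp):
        if dedup.is_duplicate(event):
            continue
        signal = correction_signal(event, questions.get(event.response_id, ""))
        if signal is not None:
            signals.append(signal)
    return signals
```

Here `questions` maps a response ID to the user's original question, standing in for the enrichment stage. Classification and aggregation are omitted from this sketch.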
Learning Signals
Learning signals are published to the ai-learning-signals Kafka topic:
{
  "signal_type": "sql_correction",
  "input": "What were sales last month?",
  "expected_output": "SELECT * FROM sales WHERE date >= ...",
  "actual_output": "SELECT * FROM sales",
  "reward": -0.5,
  "context": {
    "schema_tables": ["sales"],
    "tenant_id": "acme-corp"
  }
}
Quality Tracking
Feedback metrics are aggregated and exposed for monitoring:
| Metric | Description |
|---|---|
| feedback_positive_rate | Percentage of positive feedback |
| feedback_correction_rate | Percentage of responses requiring correction |
| feedback_response_time | Average time between response and feedback |
| feedback_escalation_rate | Percentage of sessions escalated to a human |
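As a rough illustration of how three of these rates might be derived from feedback events, consider the sketch below. It rests on several assumptions not stated in this document: that `thumbs_up` and `implicit_accept` count as positive, that the denominators are total feedback events, total responses, and total sessions respectively, and that events arrive as plain dictionaries. The function name `quality_metrics` is hypothetical.

```python
# Assumption: which feedback types count as "positive" is not documented.
POSITIVE_TYPES = {"thumbs_up", "implicit_accept"}


def quality_metrics(
    events: list[dict],
    total_responses: int,
    total_sessions: int,
) -> dict[str, float]:
    """Compute percentage metrics from feedback events.

    Each event needs "feedback_type", "response_id", and "session_id" keys.
    """

    def pct(part: int, whole: int) -> float:
        # Guard against division by zero when nothing has been recorded.
        return 100.0 * part / whole if whole else 0.0

    positive = sum(1 for e in events if e["feedback_type"] in POSITIVE_TYPES)
    # Sets ensure a response or session is counted once, even if it
    # received several corrections or escalations.
    corrected = {e["response_id"] for e in events if e["feedback_type"] == "correction"}
    escalated = {e["session_id"] for e in events if e["feedback_type"] == "escalation"}

    return {
        "feedback_positive_rate": pct(positive, len(events)),
        "feedback_correction_rate": pct(len(corrected), total_responses),
        "feedback_escalation_rate": pct(len(escalated), total_sessions),
    }
```

The `feedback_response_time` metric is left out because it needs the response timestamp, which the event schema shown earlier does not carry.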
Insights
The feedback insights module in src/feedback/insights/ generates periodic reports identifying:
- Most common correction patterns by schema table
- Agent types with lowest satisfaction scores
- Query categories with highest escalation rates
- Trending feedback themes across tenants
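The first report item, correction patterns by schema table, could start from a simple count of correction signals per table. The sketch below assumes the insights module reads learning signals shaped like the example in the Learning Signals section. The function name `corrections_by_table` is hypothetical, and a raw count is a simplification of whatever pattern analysis the module actually performs.

```python
from collections import Counter


def corrections_by_table(signals: list[dict]) -> list[tuple[str, int]]:
    """Count sql_correction signals per schema table, most frequent first."""
    counts: Counter = Counter()
    for signal in signals:
        if signal.get("signal_type") != "sql_correction":
            continue
        # A signal touching several tables counts once for each of them.
        for table in signal.get("context", {}).get("schema_tables", []):
            counts[table] += 1
    return counts.most_common()
```

Tables that appear at the top of the returned list are candidates for closer review, for example by inspecting the `expected_output` and `actual_output` pairs of their signals.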