MATIH Platform is in active MVP development. Documentation reflects current implementation status.
21. Industry Examples & Walkthroughs
SaaS & Technology

ML Engineer Journey: Intelligent Feature Recommendation Engine

Persona: Raj Patel, ML Engineer at CloudFlow
Goal: Build a recommendation system that suggests relevant product features to users based on their behavior patterns and similar user profiles, increasing feature adoption from 23% to 40% and reducing time-to-value for new users.

Primary Workbenches: ML Workbench, Data Workbench
Supporting Services: Ingestion Service, Catalog Service, Query Engine, Pipeline Service, ML Service (Ray Serve), AI Service


Business Context

CloudFlow has 10 core feature areas (kanban boards, gantt charts, timelines, automations, forms, dashboards, integrations, reports, API access, team management) -- but the average workspace only uses 3.4 of them. The product team ships features that users never discover. The onboarding flow is generic: every user sees the same getting-started wizard regardless of their role, team size, or industry.

Raj's mandate: build a recommendation engine that surfaces the right feature to the right user at the right time, driving adoption and stickiness. This directly feeds Zara's churn model -- feature adoption is the #2 predictor of retention.

  Current State                         Target State
  ┌────────────────────────┐            ┌────────────────────────┐
  │ Generic onboarding     │            │ Personalized feature   │
  │ Same flow for everyone │            │ recommendations        │
  │                        │            │                        │
  │ 3.4 features adopted   │  ────▶     │ 5.2 features adopted   │
  │ 23% adoption rate      │            │ 40% adoption rate      │
  │ 2.4 day time-to-value  │            │ < 1 day time-to-value  │
  └────────────────────────┘            └────────────────────────┘

Stage 1: Ingestion

Raj configures the data sources needed for the recommendation engine, focusing on user behavior signals and feature metadata.

Event Stream Configuration

The primary data source is the product event stream, already flowing through Kafka via Segment. Raj verifies the ingestion configuration:

{
  "source": "segment_kafka",
  "connector": "airbyte/source-kafka",
  "config": {
    "topic_pattern": "segment.cloudflow.events.*",
    "consumer_group": "matih-recommender-ingestion",
    "sync_mode": "streaming",
    "deserialization": "json",
    "event_filter": {
      "include_types": [
        "feature_used", "feature_discovered", "page_viewed",
        "task_created", "task_completed", "board_created",
        "automation_created", "integration_connected",
        "report_generated", "dashboard_created", "form_created",
        "gantt_viewed", "timeline_created", "api_key_generated"
      ],
      "exclude_types": ["heartbeat", "session_keepalive"]
    }
  }
}

Additional Sources

| Source | Purpose | Sync Mode | Frequency |
| --- | --- | --- | --- |
| Product PostgreSQL (users) | User profiles, roles, team size | CDC (incremental) | Every 15 min |
| Product PostgreSQL (feature_flags) | Feature flag configs, rollout state | CDC (incremental) | Every 15 min |
| Product PostgreSQL (workspaces) | Workspace metadata, plan, industry | CDC (incremental) | Every 15 min |
| A/B test assignments | Experiment cohort data | CDC (incremental) | Real-time |
| Zendesk (support tickets) | Feature-related support topics | Incremental | Every 15 min |

Event Schema

{
  "event_id": "evt_8f3a2b1c",
  "user_id": "usr_12345",
  "workspace_id": "ws_67890",
  "event_type": "feature_used",
  "properties": {
    "feature_category": "automations",
    "feature_name": "create_automation_rule",
    "context": "project_settings",
    "session_id": "sess_abc123",
    "session_sequence": 14,
    "time_on_page_seconds": 45,
    "is_first_use": true
  },
  "user_traits": {
    "role": "project_manager",
    "team_size": 12,
    "plan": "business",
    "signup_cohort": "2025-Q4"
  },
  "timestamp": "2026-02-28T14:23:17Z"
}

Stage 2: Discovery

Raj explores the data catalog to understand feature usage patterns, map the feature taxonomy, and identify cold-start challenges.

Event Volume Analysis

-- Profile event volume and feature coverage
SELECT
    DATE_TRUNC('week', timestamp)               AS week,
    COUNT(*)                                     AS total_events,
    COUNT(DISTINCT user_id)                      AS unique_users,
    COUNT(DISTINCT
        JSON_EXTRACT_SCALAR(properties, '$.feature_category')
    )                                            AS distinct_features,
    COUNT(CASE WHEN JSON_EXTRACT_SCALAR(properties, '$.is_first_use') = 'true'
          THEN 1 END)                            AS first_use_events
FROM events
WHERE event_type = 'feature_used'
  AND timestamp >= CURRENT_DATE - INTERVAL '84' DAY  -- 12 weeks (Trino intervals use DAY)
GROUP BY 1
ORDER BY 1;

Volume profile: 50M events/day across 347 distinct event types. Feature usage events account for approximately 12M events/day after filtering.

Feature-to-Capability Ontology

Raj builds a mapping from raw event types to feature capabilities, which becomes the item space for the recommendation engine:

-- Build feature ontology from event patterns and feature flags
SELECT
    ff.flag_id,
    ff.name                                      AS feature_name,
    ff.rollout_percentage,
    ff.target_segments,
    COUNT(DISTINCT e.user_id)                    AS users_30d,
    COUNT(DISTINCT e.workspace_id)               AS workspaces_30d,
    COUNT(*)                                     AS events_30d,
    MIN(e.timestamp)                             AS first_event,
    MAX(e.timestamp)                             AS last_event
FROM feature_flags ff
LEFT JOIN events e ON JSON_EXTRACT_SCALAR(e.properties, '$.feature_category')
    = ff.name
    AND e.timestamp >= CURRENT_DATE - INTERVAL '30' DAY
WHERE ff.enabled = true
GROUP BY ff.flag_id, ff.name, ff.rollout_percentage, ff.target_segments
ORDER BY users_30d DESC;

| Feature | Users (30d) | Workspaces | Rollout | Adoption Rate |
| --- | --- | --- | --- | --- |
| kanban_boards | 52,000 | 3,800 | 100% | 52% |
| task_management | 48,000 | 3,600 | 100% | 48% |
| file_sharing | 31,000 | 2,900 | 100% | 31% |
| automations | 8,200 | 1,400 | 100% | 8.2% |
| gantt_charts | 6,100 | 980 | 100% | 6.1% |
| custom_dashboards | 4,300 | 720 | 80% | 5.4% |
| api_access | 3,100 | 510 | 100% | 3.1% |
| forms | 2,800 | 480 | 100% | 2.8% |
| advanced_reports | 2,200 | 390 | 60% | 3.7% |
| timelines | 1,900 | 340 | 100% | 1.9% |

Cold-Start User Profiling

-- Profile users who signed up in the last 14 days (cold-start population)
SELECT
    u.role,
    w.plan,
    w.seat_count                                 AS team_size,
    COUNT(DISTINCT u.user_id)                    AS new_users,
    AVG(COALESCE(activity.event_count, 0))       AS avg_events,
    AVG(COALESCE(activity.features_used, 0))     AS avg_features_used
FROM users u
JOIN workspaces w ON u.workspace_id = w.workspace_id
LEFT JOIN (
    SELECT user_id,
           COUNT(*) AS event_count,
           COUNT(DISTINCT JSON_EXTRACT_SCALAR(properties, '$.feature_category'))
               AS features_used
    FROM events
    WHERE timestamp >= CURRENT_DATE - INTERVAL '14' DAY
    GROUP BY user_id
) activity ON u.user_id = activity.user_id
WHERE u.signup_date >= CURRENT_DATE - INTERVAL '14' DAY
GROUP BY u.role, w.plan, w.seat_count
ORDER BY new_users DESC;

Cold-start users (< 5 interactions) represent 35% of the active user base at any given time. The recommendation engine must handle these users using contextual features (role, team size, plan) rather than behavioral history.
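The routing implied here -- behavioral history for warm users, contextual features for cold-start users -- can be sketched as follows. The threshold value and function names are illustrative assumptions, not platform APIs:

```python
# Illustrative sketch: route sparse-history users to the contextual model.
COLD_START_THRESHOLD = 5  # interactions, per the profiling above

def choose_scorer(interaction_count: int) -> str:
    """Pick the scoring path based on how much history a user has."""
    if interaction_count < COLD_START_THRESHOLD:
        return "contextual"   # role, team size, plan
    return "collaborative"    # learned from behavioral history

def cold_start_share(interaction_counts: list[int]) -> float:
    """Fraction of users who would take the contextual path."""
    cold = sum(1 for c in interaction_counts if c < COLD_START_THRESHOLD)
    return cold / len(interaction_counts) if interaction_counts else 0.0

counts = [0, 2, 3, 12, 40, 7, 1, 88, 5, 60]
print(choose_scorer(2))          # -> contextual
print(cold_start_share(counts))  # 4 of 10 users -> 0.4
```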


Stage 3: Query

Raj builds the user-feature interaction matrix and contextual feature vectors that will feed the recommendation models.

User-Feature Interaction Matrix

-- Build interaction matrix: users x features with interaction strength
CREATE TABLE ml.user_feature_interactions AS
WITH interactions AS (
    SELECT
        e.user_id,
        JSON_EXTRACT_SCALAR(e.properties, '$.feature_category') AS feature,
        COUNT(*)                                     AS use_count,
        COUNT(DISTINCT DATE(e.timestamp))            AS active_days,
        MAX(e.timestamp)                             AS last_used,
        MIN(e.timestamp)                             AS first_used,
        -- Recency-weighted interaction score
        SUM(1.0 / (1 + DATE_DIFF('day', DATE(e.timestamp), CURRENT_DATE)))
                                                     AS recency_weighted_score
    FROM events e
    WHERE e.event_type = 'feature_used'
      AND e.timestamp >= CURRENT_DATE - INTERVAL '90' DAY
    GROUP BY e.user_id,
             JSON_EXTRACT_SCALAR(e.properties, '$.feature_category')
)
SELECT
    user_id,
    feature,
    use_count,
    active_days,
    last_used,
    first_used,
    recency_weighted_score,
    -- Normalize to a 0-1 interaction strength (each term capped at 1)
    ROUND(
        (0.4 * LEAST(use_count / 100.0, 1.0)) +
        (0.3 * LEAST(active_days / 30.0, 1.0)) +
        (0.3 * LEAST(recency_weighted_score / 10.0, 1.0)),
    3) AS interaction_strength
FROM interactions;
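As a sanity check, the scoring formula can be mirrored in Python. This is a sketch; capping every term (including the recency term) so the blend stays in [0, 1] is an assumption about intent:

```python
def recency_weight(days_ago: int) -> float:
    """Per-event recency term from the SQL: 1 / (1 + age_in_days)."""
    return 1.0 / (1 + days_ago)

def interaction_strength(use_count: int, active_days: int,
                         recency_weighted_score: float) -> float:
    """Blend capped usage volume, frequency, and recency into [0, 1]."""
    return round(
        0.4 * min(use_count / 100.0, 1.0)
        + 0.3 * min(active_days / 30.0, 1.0)
        + 0.3 * min(recency_weighted_score / 10.0, 1.0),
        3,
    )

print(recency_weight(0))                    # today's event -> 1.0
print(interaction_strength(250, 45, 18.0))  # heavy recent user -> 1.0
print(interaction_strength(10, 3, 0.5))     # light user, low score
```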

Session-Level Feature Usage Sequences

Understanding the order in which users discover features helps predict what they should try next:

-- Feature usage sequences within sessions
WITH feature_sequences AS (
    SELECT
        e.user_id,
        JSON_EXTRACT_SCALAR(e.properties, '$.session_id')       AS session_id,
        JSON_EXTRACT_SCALAR(e.properties, '$.feature_category') AS feature,
        ROW_NUMBER() OVER (
            PARTITION BY JSON_EXTRACT_SCALAR(e.properties, '$.session_id')
            ORDER BY e.timestamp)                    AS seq_num
    FROM events e
    WHERE e.event_type = 'feature_used'
      AND e.timestamp >= CURRENT_DATE - INTERVAL '30' DAY
)
-- Find common feature transitions (feature A -> feature B)
SELECT
    a.feature                                        AS from_feature,
    b.feature                                        AS to_feature,
    COUNT(*)                                         AS transition_count,
    COUNT(DISTINCT a.user_id)                        AS unique_users,
    ROUND(COUNT(*) * 1.0 /
        SUM(COUNT(*)) OVER (PARTITION BY a.feature), 3) AS transition_prob
FROM feature_sequences a
JOIN feature_sequences b
    ON a.session_id = b.session_id
    AND b.seq_num = a.seq_num + 1
    AND a.feature != b.feature
GROUP BY a.feature, b.feature
HAVING COUNT(*) >= 50
ORDER BY transition_count DESC
LIMIT 20;

| From Feature | To Feature | Transitions | Users | Probability |
| --- | --- | --- | --- | --- |
| kanban_boards | automations | 12,400 | 3,200 | 0.18 |
| task_management | gantt_charts | 8,900 | 2,800 | 0.14 |
| kanban_boards | custom_dashboards | 7,200 | 2,100 | 0.11 |
| file_sharing | forms | 5,800 | 1,900 | 0.09 |
| automations | advanced_reports | 4,100 | 1,200 | 0.15 |

Collaborative Filtering Signals

-- Users similar to you also use these features
-- Compute user similarity based on feature overlap (Jaccard index)
WITH user_features AS (
    SELECT DISTINCT user_id, feature
    FROM ml.user_feature_interactions
    WHERE interaction_strength >= 0.2
),
user_counts AS (
    SELECT user_id, COUNT(DISTINCT feature) AS feature_count
    FROM user_features
    GROUP BY user_id
),
user_pairs AS (
    SELECT
        a.user_id AS user_a,
        b.user_id AS user_b,
        COUNT(*)  AS shared_features
    FROM user_features a
    JOIN user_features b ON a.feature = b.feature AND a.user_id < b.user_id
    GROUP BY a.user_id, b.user_id
)
SELECT
    p.user_a,
    p.user_b,
    p.shared_features,
    ROUND(p.shared_features * 1.0 /
        (ca.feature_count + cb.feature_count - p.shared_features), 3)
                                                     AS jaccard_similarity
FROM user_pairs p
JOIN user_counts ca ON p.user_a = ca.user_id
JOIN user_counts cb ON p.user_b = cb.user_id
WHERE p.shared_features >= 3
ORDER BY jaccard_similarity DESC;
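The same similarity measure in Python, useful for spot-checking individual pairs (an illustrative helper, not part of the platform):

```python
def jaccard(features_a: set[str], features_b: set[str]) -> float:
    """Jaccard index: |shared| / |union|, matching the SQL above."""
    if not features_a and not features_b:
        return 0.0
    shared = len(features_a & features_b)
    return round(shared / (len(features_a) + len(features_b) - shared), 3)

a = {"kanban_boards", "automations", "gantt_charts", "forms"}
b = {"kanban_boards", "automations", "gantt_charts", "dashboards"}
print(jaccard(a, b))  # 3 shared of 5 in the union -> 0.6
```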

Context Features

-- User context vector for cold-start and contextual bandit
SELECT
    u.user_id,
    u.role,
    w.plan,
    w.seat_count,
    CASE
        WHEN w.seat_count <= 5   THEN 'small'
        WHEN w.seat_count <= 20  THEN 'medium'
        WHEN w.seat_count <= 100 THEN 'large'
        ELSE 'enterprise'
    END                                              AS team_size_bucket,
    DATE_DIFF('day', u.signup_date, CURRENT_DATE)    AS account_age_days,
    COALESCE(fi.features_used, 0)                    AS features_used_count,
    COALESCE(fi.total_events, 0)                     AS total_events_7d
FROM users u
JOIN workspaces w ON u.workspace_id = w.workspace_id
LEFT JOIN (
    SELECT user_id,
           COUNT(DISTINCT JSON_EXTRACT_SCALAR(properties, '$.feature_category'))
               AS features_used,
           COUNT(*) AS total_events
    FROM events
    WHERE timestamp >= CURRENT_DATE - INTERVAL '7' DAY
    GROUP BY user_id
) fi ON u.user_id = fi.user_id;

Stage 4: Orchestration

Raj builds a two-phase pipeline: daily batch processing for model training data and ANN index updates, plus a real-time serving path.

Training Pipeline

{
  "pipeline": {
    "name": "recommender-training-daily",
    "schedule": "0 4 * * *",
    "description": "Daily retraining of recommendation models and ANN index refresh",
    "stages": [
      {
        "name": "build_interaction_matrix",
        "type": "sql_transform",
        "query_ref": "interaction_matrix_v2.sql",
        "output_table": "ml.user_feature_interactions",
        "timeout_minutes": 45
      },
      {
        "name": "compute_transition_probs",
        "type": "sql_transform",
        "depends_on": ["build_interaction_matrix"],
        "query_ref": "feature_transitions.sql",
        "output_table": "ml.feature_transition_probs"
      },
      {
        "name": "build_context_vectors",
        "type": "sql_transform",
        "depends_on": ["build_interaction_matrix"],
        "query_ref": "user_context_vectors.sql",
        "output_table": "ml.user_context_vectors"
      },
      {
        "name": "quality_gate",
        "type": "data_quality",
        "depends_on": ["build_interaction_matrix", "compute_transition_probs", "build_context_vectors"],
        "checks": [
          {
            "name": "interaction_matrix_coverage",
            "type": "custom_sql",
            "query": "SELECT COUNT(DISTINCT user_id) FROM ml.user_feature_interactions",
            "min_value": 60000,
            "severity": "critical"
          },
          {
            "name": "feature_coverage",
            "type": "custom_sql",
            "query": "SELECT COUNT(DISTINCT feature) FROM ml.user_feature_interactions",
            "min_value": 8,
            "severity": "critical"
          }
        ]
      },
      {
        "name": "train_matrix_factorization",
        "type": "model_training",
        "depends_on": ["quality_gate"],
        "model_type": "als_matrix_factorization",
        "input_table": "ml.user_feature_interactions",
        "hyperparameters": {
          "factors": 64,
          "regularization": 0.01,
          "iterations": 15,
          "alpha": 40
        },
        "output": "models:/feature-recommender-mf/staging"
      },
      {
        "name": "train_contextual_bandit",
        "type": "model_training",
        "depends_on": ["quality_gate"],
        "model_type": "contextual_bandit",
        "input_tables": ["ml.user_context_vectors", "ml.recommendation_feedback"],
        "hyperparameters": {
          "exploration_rate": 0.1,
          "decay_factor": 0.995
        },
        "output": "models:/feature-recommender-bandit/staging"
      },
      {
        "name": "rebuild_ann_index",
        "type": "custom",
        "depends_on": ["train_matrix_factorization"],
        "command": "build_ann_index",
        "config": {
          "embedding_source": "models:/feature-recommender-mf/staging",
          "index_type": "hnsw",
          "metric": "cosine",
          "ef_construction": 200,
          "M": 16
        }
      },
      {
        "name": "evaluate_and_promote",
        "type": "model_evaluation",
        "depends_on": ["train_matrix_factorization", "train_contextual_bandit"],
        "metrics": ["ndcg@5", "hit_rate@3", "diversity_score"],
        "promotion_criteria": {
          "ndcg@5": "> 0.35",
          "hit_rate@3": "> 0.40"
        },
        "promote_to": "production"
      }
    ]
  }
}

Pipeline Execution Flow

Pipeline Run: 2026-02-28 04:00 UTC
  ├── build_interaction_matrix ......... PASSED (22m 18s, 4.2M rows)
  ├── compute_transition_probs ......... PASSED (8m 45s)
  ├── build_context_vectors ............ PASSED (5m 12s)
  ├── quality_gate
  │   ├── interaction_matrix_coverage .. PASSED (72,400 users)
  │   └── feature_coverage ............. PASSED (10 features)
  ├── train_matrix_factorization ....... PASSED (34m 22s)
  ├── train_contextual_bandit .......... PASSED (18m 05s)
  ├── rebuild_ann_index ................ PASSED (6m 44s)
  └── evaluate_and_promote
      ├── ndcg@5 ....................... 0.41 (threshold: 0.35) PASSED
      ├── hit_rate@3 ................... 0.47 (threshold: 0.40) PASSED
      ├── diversity_score .............. 0.72
      └── promotion .................... Promoted to production
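The evaluate_and_promote stage gates promotion on ranking metrics. A minimal sketch of ndcg@k and hit_rate@k over a single user's ranked list (relevance 1 = the user later adopted that feature; the platform's actual implementation is not shown here):

```python
import math

def dcg_at_k(relevances: list[int], k: int) -> float:
    """Discounted cumulative gain over the top-k ranked items."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances: list[int], k: int) -> float:
    """NDCG@k: DCG normalized by the ideal (best possible) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

def hit_rate_at_k(relevances: list[int], k: int) -> float:
    """1.0 if any of the top-k recommendations was relevant, else 0.0."""
    return 1.0 if any(relevances[:k]) else 0.0

# One user: the features ranked 2nd and 4th were later adopted.
rels = [0, 1, 0, 1, 0]
print(round(ndcg_at_k(rels, 5), 3))  # -> 0.651
print(hit_rate_at_k(rels, 3))        # -> 1.0
```

In production these are averaged across a held-out set of users before being compared against the promotion thresholds.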

Stage 5: Analysis

Raj analyzes recommendation quality, identifies biases, and validates that the system provides genuine value rather than just recommending popular features.

Popularity Bias Check

-- Check if recommendations are biased toward already-popular features
WITH recommendation_distribution AS (
    SELECT
        recommended_feature,
        COUNT(*)                                     AS times_recommended,
        COUNT(DISTINCT user_id)                      AS unique_users_shown
    FROM ml.recommendation_log
    WHERE generated_at >= CURRENT_DATE - INTERVAL '7' DAY
    GROUP BY recommended_feature
),
usage_distribution AS (
    SELECT
        feature,
        COUNT(DISTINCT user_id)                      AS current_users
    FROM ml.user_feature_interactions
    GROUP BY feature
)
SELECT
    r.recommended_feature,
    r.times_recommended,
    u.current_users,
    ROUND(r.times_recommended * 1.0 /
        SUM(r.times_recommended) OVER (), 3)         AS rec_share,
    ROUND(u.current_users * 1.0 /
        SUM(u.current_users) OVER (), 3)             AS usage_share,
    ROUND(r.times_recommended * 1.0 /
        SUM(r.times_recommended) OVER () -
        u.current_users * 1.0 /
        SUM(u.current_users) OVER (), 3)             AS popularity_bias
FROM recommendation_distribution r
JOIN usage_distribution u ON r.recommended_feature = u.feature
ORDER BY popularity_bias DESC;

| Feature | Rec Share | Usage Share | Bias | Assessment |
| --- | --- | --- | --- | --- |
| automations | 0.22 | 0.08 | +0.14 | Good: recommending underadopted, high-value feature |
| gantt_charts | 0.15 | 0.06 | +0.09 | Good: targeting relevant users |
| custom_dashboards | 0.14 | 0.05 | +0.09 | Good: pushing discovery |
| kanban_boards | 0.08 | 0.52 | -0.44 | Good: NOT over-recommending already-popular feature |
| task_management | 0.06 | 0.48 | -0.42 | Good: NOT over-recommending already-popular feature |

The model correctly avoids recommending features that users likely already know about, and promotes discovery of underadopted features with high value.
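The bias column above is simply recommendation share minus usage share. A small sketch for reproducing it (the dictionaries and counts are illustrative):

```python
def popularity_bias(times_recommended: dict[str, int],
                    current_users: dict[str, int]) -> dict[str, float]:
    """Recommendation share minus usage share per feature; positive means
    the model pushes the feature beyond its organic popularity."""
    total_recs = sum(times_recommended.values())
    total_users = sum(current_users.values())
    return {
        f: round(times_recommended[f] / total_recs
                 - current_users.get(f, 0) / total_users, 3)
        for f in times_recommended
    }

recs = {"automations": 22, "kanban_boards": 8}    # times recommended
usage = {"automations": 8, "kanban_boards": 52}   # current users
print(popularity_bias(recs, usage))
```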

Cold-Start User Performance

-- Recommendation quality for cold-start vs warm users
SELECT
    CASE
        WHEN user_event_count < 5   THEN 'Cold Start (< 5 events)'
        WHEN user_event_count < 50  THEN 'Warm (5-49 events)'
        ELSE 'Active (50+ events)'
    END                                              AS user_segment,
    COUNT(*)                                         AS recommendations,
    ROUND(AVG(CASE WHEN clicked = 1 THEN 1.0 ELSE 0 END), 3)
                                                     AS click_through_rate,
    ROUND(AVG(CASE WHEN adopted_7d = 1 THEN 1.0 ELSE 0 END), 3)
                                                     AS adoption_rate_7d
FROM ml.recommendation_feedback
WHERE generated_at >= CURRENT_DATE - INTERVAL '30' DAY
GROUP BY 1
ORDER BY 1;

| Segment | Recommendations | CTR | 7d Adoption |
| --- | --- | --- | --- |
| Cold Start (< 5 events) | 48,200 | 0.12 | 0.04 |
| Warm (5-49 events) | 82,400 | 0.19 | 0.09 |
| Active (50+ events) | 34,600 | 0.24 | 0.14 |

Cold-start users have lower CTR as expected, but the contextual bandit still outperforms the random baseline (0.03 CTR) by 4x. The role and team_size context features carry most of the signal for new users.

Recommendation Diversity Analysis

Raj ensures recommendations are diverse -- showing the same 3 features repeatedly does not help adoption:

-- Measure recommendation diversity per user (7-day window)
SELECT
    ROUND(unique_recs * 1.0 / total_recs, 2)         AS diversity_score,
    COUNT(*)                                         AS user_count
FROM (
    SELECT
        user_id,
        COUNT(DISTINCT recommended_feature)          AS unique_recs,
        COUNT(*)                                     AS total_recs
    FROM ml.recommendation_log
    WHERE generated_at >= CURRENT_DATE - INTERVAL '7' DAY
    GROUP BY user_id
) per_user
GROUP BY 1
ORDER BY 1;

Average diversity score: 0.72 -- a user shown 10 recommendations over a week sees roughly 7 distinct features. Target: > 0.60 to ensure sufficient exploration.
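The diversity score reduces to unique-over-total per user; a sketch:

```python
def diversity_score(recommended: list[str]) -> float:
    """Unique features shown / total recommendations in the window."""
    if not recommended:
        return 0.0
    return round(len(set(recommended)) / len(recommended), 2)

week = ["automations", "gantt_charts", "automations", "dashboards",
        "forms", "automations", "timelines", "api_access", "forms",
        "reports"]
print(diversity_score(week))  # 7 unique features in 10 recs -> 0.7
```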


Stage 6: Productionization

Raj deploys a two-stage recommendation system: candidate generation (ANN index lookup) followed by ranking (contextual bandit), served via Ray Serve.

System Architecture

                  Recommendation Serving Architecture

  User Login / Page Load
           │
           ▼
  ┌──────────────────┐     ┌────────────────────┐
  │  API Gateway     │────▶│  Ray Serve         │
  │  /api/v1/recs    │     │  (2 replicas)      │
  └──────────────────┘     │                    │
                           │  Stage 1: Candidate│
                           │  Generation        │
                           │  ┌──────────────┐  │
                           │  │ ANN Index    │  │
                           │  │ (HNSW)       │  │
                           │  │ Top 20       │  │
                           │  │ candidates   │  │
                           │  └──────┬───────┘  │
                           │         │          │
                           │         ▼          │
                           │  Stage 2: Ranking  │
                           │  ┌──────────────┐  │
                           │  │ Contextual   │  │
                           │  │ Bandit       │  │
                           │  │ Top 3        │  │
                           │  │ ranked       │  │
                           │  └──────┬───────┘  │
                           └─────────┼──────────┘
                                     │
                                     ▼
                           ┌───────────────────┐
                           │  In-App Display   │
                           │  - Tooltip        │
                           │  - Sidebar        │
                           │  - Onboarding     │
                           └───────────────────┘

Deployment Configuration

{
  "deployment": {
    "name": "feature-recommender-v2",
    "serving": {
      "runtime": "ray_serve",
      "endpoint": "/api/v1/recommendations",
      "num_replicas": 2,
      "max_concurrent_queries": 100,
      "autoscaling": {
        "min_replicas": 2,
        "max_replicas": 8,
        "target_qps_per_replica": 200
      }
    },
    "models": {
      "candidate_generator": {
        "model_uri": "models:/feature-recommender-mf/production",
        "ann_index": "indexes:/feature-recommender-hnsw/latest",
        "top_k": 20
      },
      "ranker": {
        "model_uri": "models:/feature-recommender-bandit/production",
        "output_k": 3,
        "exploration_rate": 0.1
      }
    },
    "performance_sla": {
      "p50_latency_ms": 35,
      "p99_latency_ms": 100,
      "availability": 0.999
    },
    "response_format": {
      "recommendations": [
        {
          "feature": "automations",
          "score": 0.87,
          "reason": "Teams similar to yours increased productivity 23% with automations",
          "cta_text": "Try Automations",
          "cta_url": "/features/automations/getting-started"
        }
      ]
    }
  }
}
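A toy end-to-end sketch of the two-stage flow: brute-force cosine similarity stands in for the HNSW index, and an epsilon-greedy re-rank stands in for the contextual bandit. All embeddings and scores below are made up for illustration:

```python
import math
import random

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def candidates(user_vec: list[float],
               feature_vecs: dict[str, list[float]], k: int) -> list[str]:
    """Stage 1: top-k features by embedding similarity
    (brute force stands in for the HNSW index)."""
    ranked = sorted(feature_vecs,
                    key=lambda f: cosine(user_vec, feature_vecs[f]),
                    reverse=True)
    return ranked[:k]

def rank(cands: list[str], bandit_scores: dict[str, float],
         output_k: int, exploration_rate: float,
         rng: random.Random) -> list[str]:
    """Stage 2: epsilon-greedy re-rank -- usually exploit the bandit's
    scores, occasionally explore a random ordering."""
    if rng.random() < exploration_rate:
        shuffled = list(cands)
        rng.shuffle(shuffled)
        return shuffled[:output_k]
    return sorted(cands, key=lambda f: bandit_scores.get(f, 0.0),
                  reverse=True)[:output_k]

features = {
    "automations":       [0.9, 0.1],
    "custom_dashboards": [0.7, 0.3],
    "timelines":         [0.1, 0.9],
}
cands = candidates([0.8, 0.2], features, k=2)
top = rank(cands, {"custom_dashboards": 0.8, "automations": 0.6},
           output_k=2, exploration_rate=0.1, rng=random.Random(0))
print(cands)  # -> ['automations', 'custom_dashboards']
print(top)    # -> ['custom_dashboards', 'automations']
```

The real system retrieves 20 candidates and returns 3, but the shape is the same: cheap similarity search to narrow the item space, then a learned ranker that balances exploitation and exploration.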

API Response Example

{
  "user_id": "usr_12345",
  "recommendations": [
    {
      "feature": "automations",
      "score": 0.87,
      "reason": "Teams your size with Kanban boards typically adopt automations next",
      "cta_text": "Automate your workflow",
      "cta_url": "/features/automations/getting-started",
      "position": 1
    },
    {
      "feature": "custom_dashboards",
      "score": 0.72,
      "reason": "Track your project metrics in one view",
      "cta_text": "Create a Dashboard",
      "cta_url": "/features/dashboards/templates",
      "position": 2
    },
    {
      "feature": "integrations",
      "score": 0.68,
      "reason": "Connect CloudFlow with the tools your team already uses",
      "cta_text": "Browse Integrations",
      "cta_url": "/settings/integrations",
      "position": 3
    }
  ],
  "model_version": "feature-recommender-v2",
  "generated_at": "2026-02-28T14:23:17Z",
  "latency_ms": 28
}

Stage 7: Feedback

Raj configures a comprehensive feedback loop to track recommendation quality and business impact in real time.

Feedback Metrics

{
  "monitoring": {
    "model": "feature-recommender-v2",
    "realtime_metrics": [
      {
        "name": "click_through_rate",
        "type": "engagement",
        "definition": "recommendations clicked / recommendations shown",
        "window": "1h",
        "alert_threshold": "< 0.08",
        "target": 0.18
      },
      {
        "name": "feature_adoption_after_rec",
        "type": "business_outcome",
        "definition": "users who used recommended feature within 7 days / users shown recommendation",
        "window": "7d_rolling",
        "target": 0.12
      },
      {
        "name": "time_to_first_use",
        "type": "business_outcome",
        "definition": "median time from recommendation shown to first use of recommended feature",
        "window": "30d",
        "target_hours": 48
      }
    ],
    "quality_metrics": [
      {
        "name": "recommendation_diversity",
        "type": "quality",
        "definition": "unique features recommended / total recommendations per user per week",
        "target": 0.65
      },
      {
        "name": "novelty_score",
        "type": "quality",
        "definition": "avg inverse popularity of recommended features",
        "target": 0.40,
        "note": "Higher novelty means recommending less-obvious features"
      },
      {
        "name": "coverage",
        "type": "quality",
        "definition": "features that appear in at least 1% of recommendations / total features",
        "target": 0.80
      }
    ],
    "serving_metrics": [
      {
        "name": "p99_latency",
        "threshold_ms": 100,
        "alert": "pagerduty://ml-oncall"
      },
      {
        "name": "error_rate",
        "threshold": 0.001,
        "alert": "slack://ml-alerts"
      }
    ]
  }
}
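The novelty_score definition ("avg inverse popularity") can be made concrete as follows. The adoption rates are taken from the discovery table; the helper itself is an assumption about how the metric is computed:

```python
def novelty(recommended: list[str],
            adoption_rate: dict[str, float]) -> float:
    """Average inverse popularity (1 - adoption rate) of shown features."""
    if not recommended:
        return 0.0
    return round(sum(1 - adoption_rate.get(f, 0.0) for f in recommended)
                 / len(recommended), 2)

rates = {"kanban_boards": 0.52, "automations": 0.08, "gantt_charts": 0.06}
print(novelty(["automations", "gantt_charts"], rates))     # -> 0.93
print(novelty(["kanban_boards", "kanban_boards"], rates))  # -> 0.48
```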

Feedback Collection Pipeline

-- Track recommendation feedback (runs every hour)
INSERT INTO ml.recommendation_feedback
SELECT
    r.recommendation_id,
    r.user_id,
    r.recommended_feature,
    r.score,
    r.position,
    r.generated_at,
    -- Did user click the recommendation?
    CASE WHEN c.click_timestamp IS NOT NULL THEN 1 ELSE 0 END AS clicked,
    c.click_timestamp,
    -- Did user adopt the feature within 7 days of seeing the recommendation?
    CASE WHEN e.first_use_timestamp IS NOT NULL
         AND e.first_use_timestamp >= r.generated_at
         AND e.first_use_timestamp <= r.generated_at + INTERVAL '7' DAY
         THEN 1 ELSE 0 END AS adopted_7d,
    e.first_use_timestamp
FROM ml.recommendation_log r
LEFT JOIN ml.recommendation_clicks c
    ON r.recommendation_id = c.recommendation_id
LEFT JOIN (
    SELECT user_id,
           JSON_EXTRACT_SCALAR(properties, '$.feature_category') AS feature,
           MIN(timestamp) AS first_use_timestamp
    FROM events
    WHERE event_type = 'feature_used'
      AND JSON_EXTRACT_SCALAR(properties, '$.is_first_use') = 'true'
    GROUP BY user_id, JSON_EXTRACT_SCALAR(properties, '$.feature_category')
) e ON r.user_id = e.user_id AND r.recommended_feature = e.feature
WHERE r.generated_at >= CURRENT_TIMESTAMP - INTERVAL '8' DAY
  AND r.recommendation_id NOT IN (SELECT recommendation_id FROM ml.recommendation_feedback);

Stage 8: Experimentation

Raj runs structured experiments to optimize the recommendation strategy, placement, and personalization approach.

Experiment 1: Personalization Strategy

{
  "experiment": {
    "name": "rec-personalization-strategy",
    "hypothesis": "Hybrid collaborative + content-based filtering outperforms either alone",
    "variants": [
      {
        "name": "collaborative_only",
        "description": "Matrix factorization based on user-feature interactions only",
        "allocation": 0.25
      },
      {
        "name": "content_based",
        "description": "Recommend based on user role, team size, and industry similarity",
        "allocation": 0.25
      },
      {
        "name": "hybrid",
        "description": "Two-stage: MF candidates + contextual bandit ranking (current model)",
        "allocation": 0.25
      },
      {
        "name": "sequence_aware",
        "description": "Hybrid + feature transition probabilities as additional signal",
        "allocation": 0.25
      }
    ],
    "primary_metric": "feature_adoption_7d",
    "secondary_metrics": ["click_through_rate", "diversity_score", "p99_latency"],
    "duration_weeks": 4,
    "min_sample_per_variant": 5000
  }
}
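For equal 25% allocations, a common implementation is deterministic hash-based bucketing, so assignment is stable without storing state. A sketch (not necessarily how the platform assigns cohorts):

```python
import hashlib

VARIANTS = ["collaborative_only", "content_based", "hybrid", "sequence_aware"]

def assign_variant(user_id: str, experiment: str,
                   variants: list[str] = VARIANTS) -> str:
    """Deterministically bucket a user: hash (experiment, user_id) so the
    same user always sees the same variant within an experiment, while
    buckets stay independent across experiments."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]

v1 = assign_variant("usr_12345", "rec-personalization-strategy")
v2 = assign_variant("usr_12345", "rec-personalization-strategy")
assert v1 == v2  # stable across calls
print(v1)
```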

Results (week 4):

| Strategy | CTR | 7d Adoption | Diversity | P99 Latency |
| --- | --- | --- | --- | --- |
| Collaborative only | 0.16 | 0.08 | 0.58 | 42 ms |
| Content-based | 0.14 | 0.07 | 0.71 | 22 ms |
| Hybrid (current) | 0.19 | 0.11 | 0.72 | 68 ms |
| Sequence-aware | 0.21 | 0.13 | 0.69 | 74 ms |

Sequence-aware hybrid shows the best adoption rate. Raj plans to promote it to production after validating latency at higher traffic volumes.

Experiment 2: Recommendation Placement

{
  "experiment": {
    "name": "rec-placement-test",
    "hypothesis": "In-context tooltips near related features outperform sidebar suggestions",
    "variants": [
      {
        "name": "onboarding_wizard",
        "description": "Recommendations shown during initial onboarding flow only",
        "allocation": 0.25
      },
      {
        "name": "sidebar_panel",
        "description": "Persistent sidebar with 'Recommended for You' section",
        "allocation": 0.25
      },
      {
        "name": "in_context_tooltip",
        "description": "Contextual tooltips appearing near related features during use",
        "allocation": 0.25
      },
      {
        "name": "email_digest",
        "description": "Weekly email with personalized feature recommendations",
        "allocation": 0.25
      }
    ],
    "primary_metric": "feature_adoption_30d",
    "secondary_metrics": ["user_satisfaction", "dismissal_rate"],
    "duration_weeks": 6
  }
}

Results (week 6):

| Placement | 30d Adoption | CTR | Dismissal Rate | User Satisfaction |
| --- | --- | --- | --- | --- |
| Onboarding wizard | 0.15 | 0.22 | 0.08 | 4.1/5 |
| Sidebar panel | 0.09 | 0.11 | 0.34 | 3.4/5 |
| In-context tooltip | 0.27 | 0.31 | 0.12 | 4.3/5 |
| Email digest | 0.06 | 0.04 | N/A | 3.8/5 |

In-context tooltips achieve 27% feature adoption -- 3x better than the sidebar and nearly 2x better than the onboarding wizard. Users respond best when they see recommendations at the moment they are doing related work.

Combined Impact

After deploying the optimized recommendation engine (sequence-aware hybrid model + in-context tooltip placement):

| Metric | Before | After | Change |
| --- | --- | --- | --- |
| Average features adopted per workspace | 3.4 | 4.8 | +41% |
| Feature adoption rate (30d) | 23% | 34% | +48% |
| Time to value (new users) | 2.4 days | 1.1 days | -54% |
| Churn rate (accounts with > 5 features) | 3.5% | 3.1% | -11% |

Summary

| Stage | Key Action | Platform Component | Outcome |
| --- | --- | --- | --- |
| 1. Ingestion | Configured event stream, user profiles, feature flags, A/B assignments | Ingestion Service (Airbyte + Kafka) | 50M events/day flowing with real-time and batch sources |
| 2. Discovery | Mapped 347 event types, built feature ontology, profiled cold-start users | Data Workbench Catalog | Identified 10 feature categories, 35% cold-start population |
| 3. Query | Built interaction matrix, feature sequences, collaborative signals, context vectors | Query Engine (Trino) | 4.2M interaction rows, transition probability graph |
| 4. Orchestration | Daily pipeline for model training, ANN index refresh, evaluation, promotion | Pipeline Service (Temporal) | Automated training with quality gates and auto-promotion |
| 5. Analysis | Validated no popularity bias, analyzed cold-start performance, checked diversity | Data Workbench + Query Engine | Model recommends underadopted features, 4x better than random for cold-start |
| 6. Productionization | Two-stage system (ANN + contextual bandit) on Ray Serve, P99 < 100ms | ML Service (Ray Serve) | Real-time recommendations via API, in-app tooltip delivery |
| 7. Feedback | CTR, adoption rate, diversity, novelty tracking; hourly feedback pipeline | ML Workbench Model Registry | Automated monitoring with retraining triggers |
| 8. Experimentation | Tested 4 personalization strategies, 4 placements; 27% adoption with in-context tooltips | ML Workbench Experiments | +41% feature adoption, -54% time to value |

Related Walkthroughs