Cache Warming

The CacheWarmingService proactively populates the cache with results for frequently-executed queries. This ensures that when users arrive in the morning or after a deployment, the most popular queries are already cached.

Warming Trigger

Cache warming runs on a schedule (see Warming Configuration) or can be triggered manually through the REST API:

Manual Trigger

# Trigger cache warming for the current tenant
curl -X POST http://query-engine:8080/v1/cache/warm \
  -H "Authorization: Bearer $JWT_TOKEN"

{
  "status": "started",
  "message": "Cache warming started",
  "tenantId": "550e8400-e29b-41d4-a716-446655440000"
}
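For callers that are not using curl, the same request can be built with the JDK's java.net.http client. This is a minimal sketch: the class and method names are hypothetical, and the status-code mapping assumes the handler shown under Warming Process (202 when a run starts, 200 when one is already in progress).

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical client helper; not part of the query engine itself.
final class WarmingClientSketch {

    // Builds the same request as the curl command: a POST with a bearer token and no body.
    static HttpRequest triggerRequest(String baseUrl, String jwtToken) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/v1/cache/warm"))
                .header("Authorization", "Bearer " + jwtToken)
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    // Maps the HTTP status code to the "status" value the handler is expected to return.
    static String interpret(int statusCode) {
        return switch (statusCode) {
            case 202 -> "started";      // a new warming run was launched
            case 200 -> "in_progress";  // a run was already active for the tenant
            default -> "unexpected";    // e.g. an authentication or authorization failure
        };
    }
}
```

The request can then be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()), passing the response's statusCode() to interpret.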

Warm a Specific Query

curl -X POST http://query-engine:8080/v1/cache/warm/query \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $JWT_TOKEN" \
  -d '{
    "queryHash": "a3f2b9c1d4e5f6...",
    "sql": "SELECT region, SUM(revenue) FROM sales GROUP BY region"
  }'

Warming Configuration

query:
  cache:
    warming:
      enabled: true
      cron-expression: "0 0 6 * * ?"     # Daily at 6 AM
      max-queries-per-run: 100             # Warm top 100 queries
      min-hit-count: 5                     # Only warm queries with 5+ hits
      lookback-period: P7D                 # Look at last 7 days of history
      delay-between-queries: PT1S          # 1 second delay between queries
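The two duration settings use ISO-8601 notation, which java.time.Duration parses directly. The sketch below mirrors the example values in a hypothetical WarmingSettings record (the service's real configuration class is not shown in this document) and estimates how much time the inter-query delay alone adds to a run.

```java
import java.time.Duration;

// Hypothetical holder for the query.cache.warming.* settings; names are illustrative.
record WarmingSettings(
        boolean enabled,
        String cronExpression,
        int maxQueriesPerRun,
        long minHitCount,
        Duration lookbackPeriod,
        Duration delayBetweenQueries) {

    // Values copied from the YAML example above.
    static WarmingSettings fromExample() {
        return new WarmingSettings(
                true,
                "0 0 6 * * ?",
                100,
                5,
                Duration.parse("P7D"),   // 7 days
                Duration.parse("PT1S")); // 1 second
    }

    // Time spent only on pauses between queries: one delay between each consecutive pair.
    Duration minimumPacingTime(int candidateCount) {
        int queries = Math.min(candidateCount, maxQueriesPerRun);
        return delayBetweenQueries.multipliedBy(Math.max(0, queries - 1));
    }
}
```

With the example values, a full run of 100 queries spends at least 99 seconds on pauses alone, in addition to query execution time.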

Warming Process

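The trigger endpoint runs warming asynchronously on a virtual thread and returns immediately. It responds with 202 Accepted when a new run starts and with 200 OK (status in_progress) when a run is already active for the tenant: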

@PostMapping("/warm")
@PreAuthorize("hasRole('ADMIN') or hasRole('QUERY_ADMIN')")
public ResponseEntity<Map<String, Object>> triggerWarming() {
    UUID tenantId = SecurityUtils.getCurrentTenantId();

    // Do not start a second run while one is already active for this tenant
    if (warmingService.isWarmingInProgress(tenantId)) {
        return ResponseEntity.ok(Map.of(
                "status", "in_progress",
                "message", "Cache warming already in progress"
        ));
    }

    // Run warming in the background so the request returns immediately
    Thread.startVirtualThread(() -> warmingService.warmTenantCache(tenantId));

    // 202 Accepted: the run has started but has not finished
    return ResponseEntity.accepted().body(Map.of(
            "status", "started",
            "message", "Cache warming started"
    ));
}
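In the handler above, the in-progress check and the thread start are separate steps, so two simultaneous requests could both pass the check unless warmTenantCache guards against this itself. One way to make the check and the start a single atomic step is sketched below. WarmingGuard and tryStart are hypothetical names, not part of the documented service.

```java
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;

// Hypothetical guard; illustrates an atomic "check and mark" per tenant.
final class WarmingGuard {
    private final Set<UUID> inProgress = ConcurrentHashMap.newKeySet();
    private final Executor executor;

    WarmingGuard(Executor executor) {
        this.executor = executor;
    }

    boolean isWarmingInProgress(UUID tenantId) {
        return inProgress.contains(tenantId);
    }

    // Returns true if this call started a run, false if one was already active.
    boolean tryStart(UUID tenantId, Runnable warmingTask) {
        if (!inProgress.add(tenantId)) { // add() is atomic: only one caller wins
            return false;
        }
        try {
            executor.execute(() -> {
                try {
                    warmingTask.run();
                } finally {
                    inProgress.remove(tenantId); // always clear the marker when the run ends
                }
            });
        } catch (RuntimeException e) {
            inProgress.remove(tenantId); // the task was never scheduled
            throw e;
        }
        return true;
    }
}
```

On Java 21, the executor could be Thread::startVirtualThread, in which case tryStart would replace both the isWarmingInProgress check and the explicit thread start in the handler.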

The warming process:

1. Queries the cache for the entries with the highest hit counts (using getTopQueriesForWarming)
2. Filters entries that meet the minimum hit count threshold (default: 5)
3. Re-executes each query to refresh the cache
4. Inserts a configurable delay between queries to avoid overloading the execution engines
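The loop described above can be sketched as follows. This is an illustration under stated assumptions, not the service's actual code: Entry is a hypothetical stand-in for the cache entry type, executeQuery stands for whatever re-runs a query and repopulates the cache, and skipping a failed query rather than aborting the run is an assumed policy.

```java
import java.time.Duration;
import java.util.List;
import java.util.function.Consumer;

final class WarmingLoopSketch {
    // Hypothetical stand-in for a cache entry: the SQL to replay and its hit count.
    record Entry(String sql, long hitCount) {}

    // Re-executes candidates that meet the threshold, pausing between queries.
    // Returns the number of queries that completed without an exception.
    static int warm(List<Entry> candidates, long minHitCount,
                    Duration delay, Consumer<String> executeQuery) {
        int warmed = 0;
        boolean first = true;
        for (Entry entry : candidates) {
            if (entry.hitCount() < minHitCount) {
                continue; // below the minimum hit count
            }
            if (!first) {
                try {
                    Thread.sleep(delay.toMillis()); // pace load on the execution engines
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    break; // stop the run if the thread is interrupted
                }
            }
            first = false;
            try {
                executeQuery.accept(entry.sql()); // assumed to repopulate the cache
                warmed++;
            } catch (RuntimeException ex) {
                // Assumption: one failing query should not abort the whole run.
            }
        }
        return warmed;
    }
}
```

Because the pause is applied only between attempted queries, entries that are filtered out do not lengthen the run.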

Warming Status

curl http://query-engine:8080/v1/cache/warming/status \
  -H "Authorization: Bearer $JWT_TOKEN"

{
  "inProgress": false,
  "lastWarmingAt": "2026-02-12T06:00:00Z",
  "queriesWarmed": 87,
  "queriesSkipped": 13,
  "warmingDurationMs": 145000,
  "nextScheduledAt": "2026-02-13T06:00:00Z"
}

Query Selection for Warming

The MultiLevelCacheService.getTopQueriesForWarming() method selects queries based on hit count:

public List<CacheEntry> getTopQueriesForWarming(UUID tenantId, int limit) {
    // Elided here: scan Redis for the tenant's cache entries and keep those
    // with the warmable flag set and at least the minimum hit count.
    // `entries` holds that filtered result.

    // Sort by hit count descending and return the top N entries
    return entries.stream()
            .sorted(Comparator.comparingLong(CacheEntry::getHitCount).reversed())
            .limit(limit)
            .collect(Collectors.toList());
}

Only entries with warmable: true and at least minHitCount hits (min-hit-count in the warming configuration) are considered. Entries that were manually invalidated or that expired naturally are re-executed so that their results are cached again.
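A self-contained version of this selection, including the filtering that the method excerpt leaves out, might look like the sketch below. The Entry record and its field names are assumptions made for illustration; the real CacheEntry type and the Redis scan are not shown in this document.

```java
import java.util.Comparator;
import java.util.List;

final class WarmingSelectionSketch {
    // Hypothetical stand-in for CacheEntry with only the fields the selection needs.
    record Entry(String queryHash, long hitCount, boolean warmable) {}

    // Keeps warmable entries that meet the threshold, ordered by hit count (highest first).
    static List<Entry> selectTop(List<Entry> entries, long minHitCount, int limit) {
        return entries.stream()
                .filter(Entry::warmable)
                .filter(e -> e.hitCount() >= minHitCount)
                .sorted(Comparator.comparingLong(Entry::hitCount).reversed())
                .limit(limit)
                .toList();
    }
}
```

Filtering before sorting keeps the sort limited to eligible entries, and limit is applied last so that it counts only entries that passed both filters.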
