Cache Warming
The CacheWarmingService proactively populates the cache with results for frequently-executed queries. This ensures that when users arrive in the morning or after a deployment, the most popular queries are already cached.
Warming Trigger
Cache warming can be triggered manually or on a schedule:
Manual Trigger
# Trigger cache warming for the current tenant
curl -X POST http://query-engine:8080/v1/cache/warm \
  -H "Authorization: Bearer $JWT_TOKEN"

Response:

{
  "status": "started",
  "message": "Cache warming started",
  "tenantId": "550e8400-e29b-41d4-a716-446655440000"
}

Warm a Specific Query
curl -X POST http://query-engine:8080/v1/cache/warm/query \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $JWT_TOKEN" \
  -d '{
    "queryHash": "a3f2b9c1d4e5f6...",
    "sql": "SELECT region, SUM(revenue) FROM sales GROUP BY region"
  }'

Warming Configuration
query:
  cache:
    warming:
      enabled: true
      cron-expression: "0 0 6 * * ?"   # Daily at 6 AM
      max-queries-per-run: 100         # Warm top 100 queries
      min-hit-count: 5                 # Only warm queries with 5+ hits
      lookback-period: P7D             # Look at last 7 days of history
      delay-between-queries: PT1S      # 1 second delay between queries

Warming Process
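The warming endpoint starts the warming job on a virtual thread (Java 21+) and returns immediately, so the HTTP request is not held open while queries re-execute. If warming is already running for the tenant, the endpoint reports that instead of starting a second run: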
@PostMapping("/warm")
@PreAuthorize("hasRole('ADMIN') or hasRole('QUERY_ADMIN')")
public ResponseEntity<Map<String, Object>> triggerWarming() {
    UUID tenantId = SecurityUtils.getCurrentTenantId();
    if (warmingService.isWarmingInProgress(tenantId)) {
        return ResponseEntity.ok(Map.of(
            "status", "in_progress",
            "message", "Cache warming already in progress"
        ));
    }
    Thread.startVirtualThread(() -> warmingService.warmTenantCache(tenantId));
    return ResponseEntity.accepted().body(Map.of(
        "status", "started",
        "message", "Cache warming started"
    ));
}

The warming process:
- Queries the cache for entries with the highest hit counts (using getTopQueriesForWarming)
- Filters entries that meet the minimum hit count threshold (default: 5)
- Re-executes each query to refresh the cache
- Inserts a configurable delay between queries to avoid overloading the execution engines
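The steps above can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not the actual CacheWarmingService code: the CachedQuery record, the warm method, and the executor callback are hypothetical names, and the input list stands in for whatever getTopQueriesForWarming returns.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class WarmingLoopSketch {

    // Hypothetical stand-in for a cache entry; the real CacheEntry type is not shown here.
    record CachedQuery(String sql, long hitCount) {}

    /**
     * Re-executes every query that meets the hit-count threshold, pausing
     * between executions. Returns the SQL of the queries that were warmed.
     */
    static List<String> warm(List<CachedQuery> topQueries, long minHitCount,
                             Duration delay, Consumer<String> executor) {
        List<String> warmed = new ArrayList<>();
        for (CachedQuery query : topQueries) {
            if (query.hitCount() < minHitCount) {
                continue; // below the threshold, so it is not re-executed
            }
            if (!warmed.isEmpty()) {
                // Pause before every query except the first to spare the engines
                try {
                    Thread.sleep(delay.toMillis());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // keep the interrupt flag and stop early
                    break;
                }
            }
            executor.accept(query.sql()); // re-run the query so its result is cached again
            warmed.add(query.sql());
        }
        return warmed;
    }

    public static void main(String[] args) {
        List<CachedQuery> topQueries = List.of(
                new CachedQuery("SELECT region, SUM(revenue) FROM sales GROUP BY region", 40),
                new CachedQuery("SELECT COUNT(*) FROM orders", 12),
                new CachedQuery("SELECT * FROM audit_log LIMIT 10", 3));

        // A 10 ms delay keeps the demo fast; the documented setting is PT1S.
        List<String> warmed = warm(topQueries, 5, Duration.ofMillis(10),
                sql -> System.out.println("re-executing: " + sql));
        System.out.println(warmed.size() + " queries warmed"); // prints "2 queries warmed"
    }
}
```

Passing the executor as a callback keeps the loop independent of any particular query engine, which also makes the delay and threshold behavior easy to test in isolation.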
Warming Status
curl http://query-engine:8080/v1/cache/warming/status \
  -H "Authorization: Bearer $JWT_TOKEN"

Response:

{
  "inProgress": false,
  "lastWarmingAt": "2026-02-12T06:00:00Z",
  "queriesWarmed": 87,
  "queriesSkipped": 13,
  "warmingDurationMs": 145000,
  "nextScheduledAt": "2026-02-13T06:00:00Z"
}

Query Selection for Warming
The MultiLevelCacheService.getTopQueriesForWarming() method selects queries based on hit count:
public List<CacheEntry> getTopQueriesForWarming(UUID tenantId, int limit) {
    // Scan Redis for tenant cache entries
    // Filter by warmable flag and minimum hit count
    // Sort by hit count descending
    // Return top N entries
    // (`entries` below is the scanned and filtered list; that code is omitted in this excerpt)
    return entries.stream()
        .sorted((a, b) -> Long.compare(b.getHitCount(), a.getHitCount()))
        .limit(limit)
        .collect(Collectors.toList());
}

Only entries with warmable: true and at least minHitCount hits are considered. Entries that were manually invalidated or expired naturally are re-executed to refresh the cache.
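The selection rule can be illustrated with a self-contained sketch. It assumes an in-memory list in place of the Redis scan, and the CandidateEntry record and its field names are hypothetical, chosen only to mirror the warmable flag and hit count described above.

```java
import java.util.Comparator;
import java.util.List;

public class WarmingSelectionSketch {

    // Hypothetical entry shape; the real CacheEntry may carry different fields.
    record CandidateEntry(String queryHash, boolean warmable, long hitCount) {}

    /** Keeps warmable entries with enough hits, most-hit first, capped at limit. */
    static List<CandidateEntry> topQueriesForWarming(List<CandidateEntry> entries,
                                                     long minHitCount, int limit) {
        return entries.stream()
                .filter(CandidateEntry::warmable)
                .filter(e -> e.hitCount() >= minHitCount)
                .sorted(Comparator.comparingLong(CandidateEntry::hitCount).reversed())
                .limit(limit)
                .toList();
    }

    public static void main(String[] args) {
        List<CandidateEntry> entries = List.of(
                new CandidateEntry("h1", true, 7),
                new CandidateEntry("h2", false, 90), // excluded: not warmable
                new CandidateEntry("h3", true, 2),   // excluded: fewer than 5 hits
                new CandidateEntry("h4", true, 31));

        // Prints "h4" then "h1"
        topQueriesForWarming(entries, 5, 100)
                .forEach(e -> System.out.println(e.queryHash()));
    }
}
```

Filtering before sorting means only eligible entries are ranked, so a heavily hit entry that is not warmable never takes a slot within the limit.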