Query Engine Chart
The Query Engine routes SQL queries to the appropriate execution backend (Trino, Spark Connect, ClickHouse) based on query complexity, data size, and catalog type.
Chart Configuration
query-engine:
  enabled: true
  replicaCount: 3
  resources:
    requests:
      cpu: "500m"
      memory: "1Gi"
    limits:
      cpu: "2"
      memory: "4Gi"
  autoscaling:
    enabled: true
    minReplicas: 2
    maxReplicas: 20
    targetCPUUtilizationPercentage: 70
  config:
    execution:
      defaultTimeout: 300
      maxTimeout: 3600
      defaultResultLimit: 10000
      maxResultLimit: 1000000
    router:
      largeScanThresholdBytes: 107374182400 # 100GB
      complexityThreshold: 5
    cache:
      enabled: true
      ttlSeconds: 3600

Query Routing
The query engine inspects each SQL query and routes it to the optimal backend:
| Criteria | Backend | Reason |
|---|---|---|
| Simple SELECT, small data | Trino | Low latency, federated SQL |
| Large table scans (100GB+) | Spark Connect | Distributed processing |
| OLAP aggregations | ClickHouse | Columnar engine |
| Complex joins with ML features | Spark Connect | DataFrame operations |
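
The routing rules in the table can be sketched roughly as follows. This is a minimal illustration, not the engine's actual implementation: the `QueryProfile` type, the `choose_backend` function, the backend identifiers, the meaning of the complexity score, and the order in which the rules are checked are all assumptions. Only the two threshold values come from the chart configuration.

```python
from dataclasses import dataclass

# Threshold values taken from the chart's router config.
LARGE_SCAN_THRESHOLD_BYTES = 107_374_182_400
COMPLEXITY_THRESHOLD = 5


@dataclass
class QueryProfile:
    """Hypothetical summary of a parsed query (not the engine's real type)."""
    estimated_scan_bytes: int
    complexity: int                    # assumed score, e.g. number of joins
    is_olap_aggregation: bool = False  # assumed flag set by a query analyzer


def choose_backend(q: QueryProfile) -> str:
    """Pick a backend name; the rule precedence here is an assumption."""
    # Large scans go to Spark Connect for distributed processing.
    if q.estimated_scan_bytes >= LARGE_SCAN_THRESHOLD_BYTES:
        return "spark-connect"
    # Queries at or above the complexity threshold also go to Spark Connect.
    if q.complexity >= COMPLEXITY_THRESHOLD:
        return "spark-connect"
    # OLAP-style aggregations go to the columnar engine.
    if q.is_olap_aggregation:
        return "clickhouse"
    # Everything else is treated as a simple, small query.
    return "trino"
```

For example, `choose_backend(QueryProfile(estimated_scan_bytes=10**9, complexity=1))` returns `"trino"` under these assumptions, while the same query with an estimated scan of 200 GiB would be sent to Spark Connect.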
Autoscaling Profile
The chart uses the data HPA profile to absorb query bursts:
autoscaling:
  profile: balanced
  maxReplicas: 20 # Higher than default for query bursts
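
To see what the `maxReplicas` ceiling means in practice, the sketch below applies the standard Kubernetes HPA scaling formula, `ceil(currentReplicas * currentMetric / targetMetric)`, clamped to the configured bounds. The function name `desired_replicas` is hypothetical, the defaults are the values from the Chart Configuration section, and the sketch ignores the HPA's tolerance band and stabilization windows. Whether the `balanced` profile changes any of this behavior is not stated here, so treat the numbers as illustrative only.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_cpu_percent: float,
    target_cpu_percent: float = 70,  # targetCPUUtilizationPercentage
    min_replicas: int = 2,           # minReplicas
    max_replicas: int = 20,          # maxReplicas
) -> int:
    """Approximate the HPA's replica target for a CPU utilization metric."""
    # Core HPA formula: scale replicas in proportion to metric / target.
    raw = math.ceil(current_replicas * current_cpu_percent / target_cpu_percent)
    # Clamp the result to the configured bounds.
    return max(min_replicas, min(max_replicas, raw))
```

With 3 replicas running at 140% average CPU utilization, this gives 6 replicas; with 10 replicas at 210%, the raw result of 30 is capped at the `maxReplicas` value of 20.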