Query Engine Chart

The Query Engine routes SQL queries to the appropriate execution backend (Trino, Spark Connect, ClickHouse) based on query complexity, data size, and catalog type.

Chart Configuration

query-engine:
  enabled: true
  replicaCount: 3
 
  resources:
    requests:
      cpu: "500m"
      memory: "1Gi"
    limits:
      cpu: "2"
      memory: "4Gi"
 
  autoscaling:
    enabled: true
    minReplicas: 2
    maxReplicas: 20
    targetCPUUtilizationPercentage: 70
 
  config:
    execution:
      defaultTimeout: 300
      maxTimeout: 3600
      defaultResultLimit: 10000
      maxResultLimit: 1000000
    router:
      largeScanThresholdBytes: 107374182400  # 100 GiB (100 * 1024^3 bytes)
      complexityThreshold: 5
    cache:
      enabled: true
      ttlSeconds: 3600
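
The chart does not spell out how the default and maximum values in `config.execution` interact. A common pattern, sketched below in Python as an assumption rather than the engine's confirmed behavior, is to use the default when the caller requests nothing and to cap any explicit request at the maximum. The function name `effective_value` is hypothetical, and the timeout unit is assumed to be seconds (by analogy with `ttlSeconds`).

```python
from typing import Optional

# Values copied from the chart's config.execution block.
DEFAULT_TIMEOUT = 300          # assumed to be seconds
MAX_TIMEOUT = 3600             # assumed to be seconds
DEFAULT_RESULT_LIMIT = 10000
MAX_RESULT_LIMIT = 1000000


def effective_value(requested: Optional[int], default: int, maximum: int) -> int:
    """Hypothetical helper: fall back to the default, cap at the maximum."""
    if requested is None:
        # Caller did not ask for anything specific.
        return default
    if requested <= 0:
        raise ValueError("requested value must be positive")
    # Explicit requests are honored up to the configured ceiling.
    return min(requested, maximum)


# A query asking for a 2-hour timeout would be capped at maxTimeout.
timeout = effective_value(7200, DEFAULT_TIMEOUT, MAX_TIMEOUT)
# A query with no explicit limit would get defaultResultLimit.
limit = effective_value(None, DEFAULT_RESULT_LIMIT, MAX_RESULT_LIMIT)
```

Under these assumptions, `timeout` is 3600 and `limit` is 10000.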

Query Routing

The Query Engine inspects each SQL query and routes it to the backend best suited to it:

Criteria	Backend	Reason
Simple SELECT, small data	Trino	Low latency, federated SQL
Large table scans (100 GiB+)	Spark Connect	Distributed processing
OLAP aggregations	ClickHouse	Columnar engine
Complex joins with ML features	Spark Connect	DataFrame operations
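
The routing table can be read as a set of rules checked in priority order. The Python sketch below illustrates one possible reading; it is not the engine's actual implementation. The `QueryProfile` fields, the `route` function, the backend name strings, the rule order, and the interpretation of `complexityThreshold` as a minimum complexity score are all assumptions made for illustration. Only the two threshold values come from the chart configuration.

```python
from dataclasses import dataclass

# Values copied from the chart's config.router block.
LARGE_SCAN_THRESHOLD_BYTES = 107374182400  # 100 GiB
COMPLEXITY_THRESHOLD = 5


@dataclass
class QueryProfile:
    """Hypothetical summary of a parsed query (not part of the chart)."""
    estimated_scan_bytes: int
    complexity_score: int            # assumed: grows with joins and subqueries
    is_olap_aggregation: bool = False
    uses_ml_features: bool = False


def route(profile: QueryProfile) -> str:
    """Pick a backend by checking the table's rules in an assumed order."""
    # Large table scans go to Spark Connect for distributed processing.
    if profile.estimated_scan_bytes >= LARGE_SCAN_THRESHOLD_BYTES:
        return "spark-connect"
    # Complex joins with ML features also go to Spark Connect.
    if profile.uses_ml_features and profile.complexity_score >= COMPLEXITY_THRESHOLD:
        return "spark-connect"
    # OLAP aggregations go to ClickHouse's columnar engine.
    if profile.is_olap_aggregation:
        return "clickhouse"
    # Everything else is treated as a simple, small query for Trino.
    return "trino"


# A small lookup query falls through to Trino.
backend = route(QueryProfile(estimated_scan_bytes=10**6, complexity_score=1))
```

Checking the size rule first means that a large OLAP aggregation would go to Spark Connect rather than ClickHouse. That ordering is a design choice of this sketch, not something the table specifies.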

Autoscaling Profile

The Query Engine uses the data HPA profile to absorb query bursts, with a raised replica ceiling:

autoscaling:
  profile: balanced
  maxReplicas: 20  # Higher than default for query bursts
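
To give a sense of how these limits play out, the Python sketch below applies the core scaling rule documented for the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric), clamped to the configured bounds. It uses the chart values shown earlier (`targetCPUUtilizationPercentage: 70`, `minReplicas: 2`, `maxReplicas: 20`) and assumes the profile does not override them. The function name is hypothetical, and the sketch ignores the HPA's tolerance band and stabilization windows.

```python
import math


def desired_replicas(current_replicas: int, current_cpu_pct: float,
                     target_cpu_pct: float = 70.0,
                     min_replicas: int = 2, max_replicas: int = 20) -> int:
    """Simplified HPA rule: scale proportionally, then clamp to the bounds."""
    # Proportional scaling on the ratio of observed to target utilization.
    raw = math.ceil(current_replicas * current_cpu_pct / target_cpu_pct)
    # Never go below minReplicas or above maxReplicas.
    return max(min_replicas, min(max_replicas, raw))


# 3 replicas at 140% average CPU double to 6.
burst = desired_replicas(3, 140.0)
# 10 replicas at 210% would want 30, but are capped at maxReplicas.
capped = desired_replicas(10, 210.0)
# 3 mostly idle replicas shrink only as far as minReplicas.
idle = desired_replicas(3, 10.0)
```

Here `burst` is 6, `capped` is 20, and `idle` is 2, which shows why a higher `maxReplicas` matters for query bursts: the cap, not the scaling rule, is what limits the response to a sustained spike.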

AI Service Catalog Service