MATIH Platform is in active MVP development. Documentation reflects current implementation status.

Query Execution

The Query Engine supports two primary execution modes, synchronous and asynchronous, each designed for a different workload pattern. This section covers the complete query execution subsystem, from initial submission through validation, optimization, engine routing, execution, and result delivery.


Execution Modes

| Mode | Endpoint | Best For | Default Timeout |
| --- | --- | --- | --- |
| Synchronous | POST /v1/queries/execute | Interactive queries, small result sets, dashboard queries | 300 seconds |
| Asynchronous | POST /v1/queries/execute/async | Large scans, batch processing, long-running analytics | No hard limit |

Both modes share the same core pipeline: validation, cache lookup, engine routing, row-level security (RLS) injection, execution, and result caching. They differ only in how the client receives results.
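As a rough illustration of how a client might target the two endpoints, the sketch below builds (but does not send) a request with the JDK's `java.net.http` API. Only the two endpoint paths come from this page; the base URL, the JSON content type, the client-side timeout value, and the `QueryClientSketch` class itself are assumptions made for the example.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

// Hypothetical helper for illustration; not part of the MATIH codebase.
final class QueryClientSketch {
    // Endpoint paths as listed under Execution Modes.
    static final String SYNC_PATH = "/v1/queries/execute";
    static final String ASYNC_PATH = "/v1/queries/execute/async";

    /** Builds a POST request for either mode. baseUrl is an assumed deployment URL. */
    static HttpRequest build(String baseUrl, boolean async, String jsonBody) {
        HttpRequest.Builder builder = HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + (async ? ASYNC_PATH : SYNC_PATH)))
                .header("Content-Type", "application/json") // assumed content type
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody));
        if (!async) {
            // Assumed client-side timeout, set slightly above the 300 s synchronous default
            // so the server-side timeout is reported first.
            builder.timeout(Duration.ofSeconds(310));
        }
        return builder.build();
    }
}
```

The resulting request can be passed to `HttpClient.send` as usual. Asynchronous submissions carry no client-side timeout here because that mode has no hard limit.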


Query Request Structure

Every query begins with a QueryRequest payload submitted to the Query Engine. The request DTO is defined in QueryRequest.java:

public class QueryRequest {
    private UUID queryId;
    private UUID tenantId;
 
    @NotBlank(message = "SQL query is required")
    @Size(max = 100000, message = "Query text cannot exceed 100,000 characters")
    private String sql;
 
    private String catalog;
    private String schema;
    private Integer limit;
    private Integer offset;
    private Boolean useCache;
    private Integer timeoutSeconds;
    private Map<String, Object> parameters;
    private Map<String, String> sessionProperties;
    private Map<String, Object> metadata;
}
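To make the two constraints on `sql` concrete, here is a minimal plain-Java sketch of what the `@NotBlank` and `@Size(max = 100000)` annotations enforce. The `SqlConstraintSketch` class is hypothetical; in the service itself the checks would presumably run through a Bean Validation provider rather than hand-written code.

```java
// Hypothetical stand-alone check mirroring the annotations on QueryRequest.sql.
final class SqlConstraintSketch {
    static final int MAX_SQL_LENGTH = 100_000;

    /** Returns the violation message, or null when the SQL text passes both constraints. */
    static String validate(String sql) {
        // @NotBlank rejects null, empty, and whitespace-only strings.
        if (sql == null || sql.trim().isEmpty()) {
            return "SQL query is required";
        }
        // @Size(max = 100000) rejects text longer than 100,000 characters.
        if (sql.length() > MAX_SQL_LENGTH) {
            return "Query text cannot exceed 100,000 characters";
        }
        return null;
    }
}
```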

Field Reference

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| sql | String | Yes | -- | The SQL query to execute (max 100,000 characters) |
| catalog | String | No | delta | Trino catalog to target |
| schema | String | No | default | Schema within the catalog |
| limit | Integer | No | 10,000 | Maximum rows to return |
| useCache | Boolean | No | true | Whether to check and populate the query cache |
| timeoutSeconds | Integer | No | 300 | Timeout in seconds for synchronous execution |
| parameters | Map | No | -- | Parameterized query values |
| sessionProperties | Map | No | -- | Trino session properties to set |
| metadata | Map | No | -- | Billing context, cost center, workload type |
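The defaults in the field reference can be read as a simple null-coalescing step applied to the optional fields. The sketch below shows one way that step could look. The `QueryDefaultsSketch` class and its `Effective` record are hypothetical, and where the service actually applies these defaults is not stated in this section.

```java
// Hypothetical illustration of applying the documented defaults to omitted fields.
final class QueryDefaultsSketch {
    // Default values as listed in the field reference.
    static final String DEFAULT_CATALOG = "delta";
    static final String DEFAULT_SCHEMA = "default";
    static final int DEFAULT_LIMIT = 10_000;
    static final boolean DEFAULT_USE_CACHE = true;
    static final int DEFAULT_TIMEOUT_SECONDS = 300;

    /** Immutable view of the optional fields once defaults are filled in. */
    record Effective(String catalog, String schema, int limit,
                     boolean useCache, int timeoutSeconds) {}

    /** A null argument means the client omitted the field. */
    static Effective resolve(String catalog, String schema, Integer limit,
                             Boolean useCache, Integer timeoutSeconds) {
        return new Effective(
                catalog != null ? catalog : DEFAULT_CATALOG,
                schema != null ? schema : DEFAULT_SCHEMA,
                limit != null ? limit : DEFAULT_LIMIT,
                useCache != null ? useCache : DEFAULT_USE_CACHE,
                timeoutSeconds != null ? timeoutSeconds : DEFAULT_TIMEOUT_SECONDS);
    }
}
```

For example, a request that supplies only `sql` would resolve to the `delta` catalog, the `default` schema, a 10,000-row limit, caching enabled, and a 300-second timeout.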

Query Response Structure

All execution endpoints return a QueryResponse:

{
  "executionId": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "status": "COMPLETED",
  "engineType": "TRINO",
  "columns": [
    { "name": "customer_id", "type": "VARCHAR", "nullable": true },
    { "name": "total_orders", "type": "BIGINT", "nullable": true }
  ],
  "data": [
    { "customer_id": "C-001", "total_orders": 42 },
    { "customer_id": "C-002", "total_orders": 17 }
  ],
  "rowCount": 2,
  "totalRows": 2,
  "bytesScanned": 1048576,
  "executionTimeMs": 245,
  "cacheHit": false,
  "hasMore": false,
  "submittedAt": "2026-02-12T10:00:00Z",
  "completedAt": "2026-02-12T10:00:00.245Z",
  "statistics": {
    "cpuTimeMs": 180,
    "wallTimeMs": 245,
    "queuedTimeMs": 5,
    "analysisTimeMs": 12,
    "planningTimeMs": 28,
    "peakMemoryBytes": 67108864,
    "inputRows": 50000,
    "inputBytes": 1048576,
    "outputRows": 2,
    "outputBytes": 128,
    "completedSplits": 4
  }
}

Query Status Enumeration

The QueryStatus enum defines the lifecycle states of a query execution:

| Status | Description |
| --- | --- |
| PENDING | Query received, awaiting processing |
| QUEUED | Query accepted into the execution queue |
| RUNNING | Query actively executing on the target engine |
| COMPLETED | Query finished successfully with results available |
| FAILED | Query execution failed with an error |
| CANCELLED | Query was cancelled by the user or system |
| TIMEOUT | Query exceeded the configured timeout |
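A client polling an asynchronous query needs to know when to stop. The sketch below mirrors the seven states and adds a terminal-state check. It assumes that COMPLETED, FAILED, CANCELLED, and TIMEOUT are final, which the descriptions suggest but this section does not state outright. The `QueryStatusSketch` name and the `isTerminal` method are illustrative and are not claimed to exist on the real enum.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical mirror of QueryStatus with an assumed terminal-state helper.
enum QueryStatusSketch {
    PENDING, QUEUED, RUNNING, COMPLETED, FAILED, CANCELLED, TIMEOUT;

    // Assumption: no further transitions occur once one of these states is reached.
    private static final Set<QueryStatusSketch> TERMINAL =
            EnumSet.of(COMPLETED, FAILED, CANCELLED, TIMEOUT);

    /** True when a polling client can stop checking the query. */
    boolean isTerminal() {
        return TERMINAL.contains(this);
    }
}
```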

Supported Engine Types

The EngineType enum lists all supported execution backends:

| Engine | Use Case | Routing Trigger |
| --- | --- | --- |
| TRINO | Complex analytics, window functions, multi-join queries | Default for most queries; complex SQL patterns |
| CLICKHOUSE | Real-time tables (events, clicks, metrics), simple aggregations | Tables matching real-time patterns; simple queries under 1M rows |
| DUCKDB | Local analytical queries, embedded processing | Small datasets, local file queries |
| STARROCKS | High-throughput OLAP, materialized view serving | Pre-aggregated data, high-concurrency reads |
| SPARK_ASYNC | Large-scale scans exceeding 100GB | Estimated scan size above 100GB threshold |
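The routing triggers can be pictured as an ordered rule list. The sketch below covers only the three triggers that are stated precisely enough to code: the 100GB scan threshold, the real-time table pattern with the 1M-row bound, and the Trino default. It is not the platform's router. The rule order, the exact table-name match, the reading of the ClickHouse trigger as "real-time table AND simple query AND under 1M rows", and the treatment of 100GB as binary gigabytes are all assumptions. DUCKDB and STARROCKS are omitted because their triggers are described only qualitatively.

```java
import java.util.Set;

// Hypothetical routing sketch; rule order and matching logic are assumptions.
final class EngineRouterSketch {
    enum EngineType { TRINO, CLICKHOUSE, DUCKDB, STARROCKS, SPARK_ASYNC }

    // 100GB threshold from the table, interpreted here as 100 * 2^30 bytes.
    static final long SPARK_SCAN_THRESHOLD_BYTES = 100L * 1024 * 1024 * 1024;
    static final long CLICKHOUSE_ROW_LIMIT = 1_000_000L;
    // Example real-time table names taken from the CLICKHOUSE row.
    static final Set<String> REAL_TIME_TABLES = Set.of("events", "clicks", "metrics");

    static EngineType route(String table, long estimatedScanBytes,
                            long estimatedRows, boolean simpleQuery) {
        // Very large scans go to the asynchronous Spark path first.
        if (estimatedScanBytes > SPARK_SCAN_THRESHOLD_BYTES) {
            return EngineType.SPARK_ASYNC;
        }
        // Simple queries on real-time tables with fewer than 1M rows go to ClickHouse.
        if (REAL_TIME_TABLES.contains(table) && simpleQuery
                && estimatedRows < CLICKHOUSE_ROW_LIMIT) {
            return EngineType.CLICKHOUSE;
        }
        // Everything else falls back to Trino, the documented default.
        return EngineType.TRINO;
    }
}
```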

Subsections

| Page | Description |
| --- | --- |
| Synchronous Execution | Sync execution flow, timeout handling, cache integration |
| Asynchronous Execution | Async submission, status polling, result retrieval |
| Query Lifecycle | Full lifecycle from submit through validate, optimize, execute, return |
| Query Cancellation | Cancelling running and queued queries |
| Query History | Querying execution history with search and pagination |
| Query Statistics | Execution statistics, success rates, latency percentiles |