MATIH Platform is in active MVP development. Documentation reflects current implementation status.
9. Query Engine & SQL
Query Execution
Query Lifecycle

Query Lifecycle

Every query that enters the Query Engine passes through a deterministic sequence of phases. This page documents the complete lifecycle from initial HTTP request to final result delivery, including the state machine transitions that govern each phase.


Lifecycle Phases

  Submit          Validate        Route           Execute         Return
  +-----+       +--------+      +-------+       +---------+     +--------+
  | HTTP |------>| Tenant |----->| Smart |------>| Engine  |---->| Format |
  | POST |       | + Auth |      | Router|       | Strategy|     | + Cache|
  +-----+       +--------+      +-------+       +---------+     +--------+
     |               |               |               |               |
  PENDING         PENDING         PENDING         RUNNING       COMPLETED
                                                     |
                                              FAILED/TIMEOUT/CANCELLED

Phase 1: Submission

The query lifecycle begins when an HTTP request reaches the QueryController. The controller extracts the tenant and user identifiers from HTTP headers and delegates to the QueryExecutionService.

@PostMapping("/execute")
public ResponseEntity<QueryResponse> executeSync(
        @RequestHeader("X-Tenant-ID") UUID tenantId,
        @RequestHeader("X-User-ID") UUID userId,
        @Valid @RequestBody QueryRequest request) {
 
    QueryResponse response = executionService.executeSync(tenantId, userId, request);
    return ResponseEntity.ok(response);
}

At this point, the query is in the PENDING state. The execution service receives the validated request and begins processing.


Phase 2: Validation

The execution service performs several validation checks before proceeding:

  1. Concurrent query limit check. The service counts active queries for the tenant and rejects the request if the limit (default: 100) is exceeded.
  2. Request validation. Jakarta Bean Validation checks that sql is not blank and does not exceed 100,000 characters.
  3. Query hash computation. A SHA-256 hash of the SQL text is computed for cache lookups and deduplication.
validateConcurrentQueryLimit(tenantId);
String queryHash = calculateQueryHash(request.getSql());

Phase 3: Cache Lookup

If useCache is true, the service checks the multi-level cache for a previously computed result with the same query hash and tenant ID:

if (Boolean.TRUE.equals(request.getUseCache())) {
    QueryResponse cachedResult = cacheService.get(tenantId, queryHash);
    if (cachedResult != null) {
        recordCacheHit(tenantId);
        return cachedResult.toBuilder().cacheHit(true).build();
    }
}

A cache hit short-circuits the entire execution pipeline, returning results immediately without touching the execution engine.


Phase 4: Engine Routing

The SmartQueryRouter analyzes the SQL statement using JSqlParser and pattern matching to determine the optimal execution engine:

EngineType engineType = queryRouter.route(request);

The router applies a decision tree based on query characteristics:

PriorityConditionSelected Engine
1Real-time tables (events, clicks, metrics, logs)ClickHouse
2Estimated scan exceeds 100GBSpark Async
3Small query (under 1M rows) without joins or windowsClickHouse
4Window functions or more than 2 joinsTrino
5DefaultTrino

Phase 5: Execution Record Creation

Before execution begins, a QueryExecution entity is persisted to PostgreSQL for audit and history:

QueryExecution execution = QueryExecution.builder()
        .id(UUID.randomUUID())
        .tenantId(tenantId)
        .userId(userId)
        .queryText(request.getSql())
        .queryHash(queryHash)
        .engineType(engineType)
        .status(QueryStatus.PENDING)
        .submittedAt(Instant.now())
        .cacheHit(false)
        .build();
execution = executionRepository.save(execution);

Phase 6: Engine Execution

The execution record transitions to RUNNING and the appropriate engine strategy is invoked:

execution.markRunning();
executionRepository.save(execution);
 
QueryEngineStrategy strategy = strategyFactory.getStrategy(engineType);
int timeout = request.getTimeoutSeconds() != null
        ? request.getTimeoutSeconds()
        : syncTimeoutSeconds;
 
QueryResponse result = strategy.execute(request, timeout);

Each engine strategy (Trino, ClickHouse, DuckDB, StarRocks, Spark) implements the QueryEngineStrategy interface:

public interface QueryEngineStrategy {
    QueryResponse execute(QueryRequest request, int timeoutSeconds);
    EngineType getEngineType();
    boolean isAvailable();
    void healthCheck();
}

Phase 7: Result Processing and Caching

After successful execution, the service:

  1. Updates the execution record with row count and bytes scanned
  2. Caches the result if caching is enabled
  3. Attaches the execution ID and engine type to the response
  4. Records Micrometer timing metrics
execution.markCompleted(
    result.getRowCount() != null ? result.getRowCount() : 0,
    result.getBytesScanned() != null ? result.getBytesScanned() : 0
);
executionRepository.save(execution);
 
if (Boolean.TRUE.equals(request.getUseCache()) && result.getStatus() == QueryStatus.COMPLETED) {
    cacheService.put(tenantId, queryHash, result);
}

State Machine Transitions

The query status follows a strict state machine:

PENDING ----> QUEUED ----> RUNNING ----> COMPLETED
   |                          |
   |                          +--------> FAILED
   |                          |
   |                          +--------> TIMEOUT
   |                          |
   +------------------------> CANCELLED
TransitionTrigger
PENDING -> QUEUEDAsync query accepted into queue
PENDING -> RUNNINGSync execution starts
QUEUED -> RUNNINGQueue scheduler picks up query
RUNNING -> COMPLETEDEngine returns results successfully
RUNNING -> FAILEDEngine throws exception
RUNNING -> TIMEOUTExecution exceeds timeout
Any -> CANCELLEDUser cancels the query

Error Handling

The execution service handles three categories of exceptions:

ExceptionStatusAction
QueryTimeoutExceptionFAILEDRecord timeout message, emit status=timeout metric
QueryCancelledExceptionCANCELLEDRecord cancellation, emit cancelled counter
Exception (general)FAILEDRecord error message, emit status=error metric, wrap in QueryExecutionException

All exception handlers ensure the execution record is updated in the database before the exception propagates to the controller layer.