MATIH Platform is in active MVP development. Documentation reflects current implementation status.
11. Pipelines & Data Engineering
Flink Streaming Jobs
Overview

Flink Streaming Jobs Overview

MATIH runs four production Flink SQL jobs that consume events from Kafka topics and aggregate them into Iceberg tables for analytics. All jobs use tumbling windows and are deployed to the Flink cluster managed by the Flink Operator in the matih-data-plane namespace.


Production Jobs

JobSource TopicSink TableWindow
Session Analyticsmatih.ai.state-changesmatih_analytics.session_analytics15 min
Agent Performancematih.ai.agent-tracesmatih_analytics.agent_performance_metrics5 min
LLM Operationsmatih.ai.llm-opsmatih_analytics.llm_operations_metrics15 min
State Transition CDCPostgreSQL CDCmatih_analytics.state_transition_logContinuous

Source Files

JobFile
Session Analyticsinfrastructure/flink/jobs/session-analytics.sql
Agent Performanceinfrastructure/flink/jobs/agent-performance-agg.sql
LLM Operationsinfrastructure/flink/jobs/llm-operations-agg.sql
State Transition CDCinfrastructure/flink/jobs/state-transition-cdc.sql

Common Configuration

All streaming jobs share the following Kafka source configuration:

PropertyValue
Kafka bootstrap serversstrimzi-kafka-kafka-bootstrap.matih-data-plane.svc.cluster.local:9093
Kafka port (dev)9092 (plaintext), override before submitting
Kafka port (prod)9093 (TLS)
Scan startup modelatest-offset
Message formatJSON with ISO-8601 timestamps
Watermark delay30 seconds

Sink Configuration

All jobs write to Iceberg tables via the Polaris catalog:

INSERT INTO polaris.matih_analytics.<table_name>
SELECT ...

The Iceberg catalog is configured in the Flink SQL Gateway session properties.


Deployment

Jobs are submitted to the Flink SQL Gateway:

# Dev: override Kafka port to plaintext
export KAFKA_PORT=9092
envsubst < infrastructure/flink/jobs/session-analytics.sql | \
    curl -X POST http://flink-sql-gateway:8083/v1/statements \
    -H "Content-Type: application/json" \
    -d '{"statement": "..."}'

Monitoring

MetricSourceDescription
Checkpoint durationFlink metricsTime to complete state checkpoints
Records in/outFlink metricsThroughput per operator
Watermark lagFlink metricsDelay between event time and processing time
Kafka consumer lagKafka metricsOffset lag per consumer group