MATIH Platform is in active MVP development. Documentation reflects current implementation status.

Performance Tuning

Each ingestion connection can be tuned individually for batch size, parallelism, checkpointing, retries, and dead letter queue (DLQ) behavior.

Tuning Configuration

PUT /api/v1/connections/{connectionId}/tuning
X-Tenant-Id: {tenantId}

{
  "batchSize": 50000,
  "parallelism": 4,
  "checkpointIntervalSec": 120,
  "maxRetries": 5,
  "retryBackoffMs": 2000,
  "dlqEnabled": true
}
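
As a minimal sketch, the request above could be assembled with Python's standard library as follows. The base URL, the helper name build_tuning_request, the sample IDs, and the Content-Type header are assumptions for illustration; only the path, the X-Tenant-Id header, and the JSON fields come from this page.

```python
import json
import urllib.request

# Hypothetical base URL; this page does not state the real host.
BASE_URL = "https://matih.example.com"


def build_tuning_request(connection_id: str, tenant_id: str, tuning: dict) -> urllib.request.Request:
    """Build (but do not send) the PUT request for a connection's tuning config."""
    url = f"{BASE_URL}/api/v1/connections/{connection_id}/tuning"
    body = json.dumps(tuning).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "X-Tenant-Id": tenant_id,
            # Assumption: the API expects a JSON content type.
            "Content-Type": "application/json",
        },
    )


request = build_tuning_request(
    "conn-123",    # hypothetical connection ID
    "tenant-abc",  # hypothetical tenant ID
    {
        "batchSize": 50000,
        "parallelism": 4,
        "checkpointIntervalSec": 120,
        "maxRetries": 5,
        "retryBackoffMs": 2000,
        "dlqEnabled": True,
    },
)
# Sending is left to the caller, e.g. urllib.request.urlopen(request).
```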

Configuration Parameters

| Parameter | Default | Range | Description |
|---|---|---|---|
| batchSize | 10,000 | 100 - 10,000,000 | Records per batch. Larger = faster throughput, more memory |
| parallelism | 1 | 1 - 32 | Concurrent worker threads for the sync |
| checkpointIntervalSec | 300 | 30 - 3600 | How often to save sync progress for fault tolerance |
| maxRetries | 3 | 0 - 10 | Max retry attempts on transient failures |
| retryBackoffMs | 1000 | 100 - 60000 | Base backoff between retries (exponential) |
| dlqEnabled | true | true/false | Whether to send failed records to the dead letter queue |
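
A client can check a tuning payload against the table above before sending it. The sketch below is illustrative only: the function names are hypothetical, merging overrides onto the defaults is an assumption about how partial updates behave, and the doubling factor in the backoff schedule is assumed (the table only says the backoff is exponential).

```python
# Defaults and (min, max) ranges copied from the parameter table.
DEFAULTS = {
    "batchSize": 10_000,
    "parallelism": 1,
    "checkpointIntervalSec": 300,
    "maxRetries": 3,
    "retryBackoffMs": 1000,
    "dlqEnabled": True,
}
RANGES = {
    "batchSize": (100, 10_000_000),
    "parallelism": (1, 32),
    "checkpointIntervalSec": (30, 3600),
    "maxRetries": (0, 10),
    "retryBackoffMs": (100, 60_000),
}


def validate_tuning(overrides: dict) -> dict:
    """Merge overrides onto the defaults and reject out-of-range values."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    config = {**DEFAULTS, **overrides}
    for name, (low, high) in RANGES.items():
        value = config[name]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"{name} must be an integer")
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside {low}..{high}")
    if not isinstance(config["dlqEnabled"], bool):
        raise ValueError("dlqEnabled must be true or false")
    return config


def backoff_schedule_ms(base_ms: int, max_retries: int, factor: int = 2) -> list[int]:
    """Delay before each retry, assuming the base is multiplied by `factor` per attempt."""
    return [base_ms * factor ** attempt for attempt in range(max_retries)]
```

Under the assumed factor of 2, retryBackoffMs = 2000 with maxRetries = 5 yields delays of 2, 4, 8, 16, and 32 seconds.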

Sync Lifecycle FSM

The connection sync state machine governs the full lifecycle:

CREATED → TESTING → TESTED → SCHEMA_DISCOVERED → CONFIGURED → SCHEDULED

                                              SYNCING → COMPLETED / FAILED
                                                ↑          ↓
                                              RETRYING ←──┘ (if retryable)

Guards enforce data integrity:

  • CONFIGURE requires selectedStreams to be non-empty
  • SCHEDULE requires a valid cron expression
  • SYNCING can be reached from CONFIGURED (manual) or SCHEDULED
  • COMPLETED can return to SCHEDULED for the next run
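
The lifecycle and guards above can be sketched as a transition table plus guard checks. This is not the platform's implementation: the dictionary layout, the field names cron and retryable, and the five-field cron check are assumptions chosen to keep the example short.

```python
# Allowed transitions, read off the lifecycle diagram and the guard list.
TRANSITIONS = {
    "CREATED": {"TESTING"},
    "TESTING": {"TESTED"},
    "TESTED": {"SCHEMA_DISCOVERED"},
    "SCHEMA_DISCOVERED": {"CONFIGURED"},
    "CONFIGURED": {"SCHEDULED", "SYNCING"},  # SYNCING here is a manual run
    "SCHEDULED": {"SYNCING"},
    "SYNCING": {"COMPLETED", "FAILED"},
    "FAILED": {"RETRYING"},
    "RETRYING": {"SYNCING"},
    "COMPLETED": {"SCHEDULED"},
}


def looks_like_cron(expression) -> bool:
    """Simplified stand-in for cron validation: five whitespace-separated fields."""
    return isinstance(expression, str) and len(expression.split()) == 5


def check_guard(target: str, connection: dict):
    """Return a reason string if the guard for `target` fails, else None."""
    if target == "CONFIGURED" and not connection.get("selectedStreams"):
        return "selectedStreams must be non-empty"
    if target == "SCHEDULED" and not looks_like_cron(connection.get("cron")):
        return "a valid cron expression is required"
    if target == "RETRYING" and not connection.get("retryable", False):
        return "the failure is not retryable"
    return None


def transition(connection: dict, target: str) -> dict:
    """Return a copy of `connection` in state `target`, or raise ValueError."""
    current = connection["state"]
    if target not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition {current} -> {target}")
    reason = check_guard(target, connection)
    if reason is not None:
        raise ValueError(f"guard failed for {target}: {reason}")
    return {**connection, "state": target}
```

Keeping the transition table separate from the guards means a rejected move can report whether the path itself is illegal or only a precondition is missing.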

Fault Tolerance

Checkpointing

  • Sync progress is saved at the configured interval
  • If a sync is interrupted, it can resume from the last checkpoint
  • Checkpoint data includes: offset, LSN, cursor position, batch number
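
The checkpoint behavior above can be illustrated with a small loop that saves progress once the configured interval has elapsed and skips already-processed batches on resume. The Checkpoint fields mirror the list above; the run_sync function, the save callback, and the injectable clock are hypothetical, and the LSN is left unset because this sketch has no database log to read.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class Checkpoint:
    offset: int            # records processed so far
    lsn: Optional[str]     # log sequence number, when the source has one
    cursor: Optional[str]  # cursor position within the stream
    batch_number: int      # index of the last completed batch


def run_sync(
    batches: Iterable[list],
    save: Callable[[Checkpoint], None],
    interval_sec: float,
    resume_from: Optional[Checkpoint] = None,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[Checkpoint]:
    """Process batches, saving a checkpoint whenever interval_sec has elapsed."""
    start_batch = resume_from.batch_number + 1 if resume_from else 0
    offset = resume_from.offset if resume_from else 0
    latest = resume_from
    last_saved = clock()
    for batch_number, batch in enumerate(batches):
        if batch_number < start_batch:
            continue  # already covered by the checkpoint we resumed from
        offset += len(batch)  # stand-in for writing the batch to the destination
        latest = Checkpoint(
            offset=offset,
            lsn=None,
            cursor=str(batch[-1]) if batch else None,
            batch_number=batch_number,
        )
        now = clock()
        if now - last_saved >= interval_sec:
            save(latest)
            last_saved = now
    if latest is not None:
        save(latest)  # final checkpoint at the end of the run
    return latest
```

Passing the last saved checkpoint as resume_from after an interruption restarts the loop at the next batch instead of the beginning.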

Dead Letter Queue

  • Failed records are published to the {tenantId}.ingestion.dlq Kafka topic
  • DLQ records include: original key, value, error message, retry count
  • Replay endpoint: POST /api/v1/dlq/{recordId}/replay
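
A rough sketch of how a failed record might be shaped for the DLQ is shown below. The topic pattern and replay path come from the list above; the JSON field names (originalKey, originalValue, errorMessage, retryCount) and the helper functions are assumptions, and no Kafka client is used so the example stays self-contained.

```python
import json
from typing import Any, Optional, Tuple


def dlq_topic(tenant_id: str) -> str:
    """Per-tenant DLQ topic name, following the {tenantId}.ingestion.dlq pattern."""
    return f"{tenant_id}.ingestion.dlq"


def replay_path(record_id: str) -> str:
    """Path of the replay endpoint for a single DLQ record."""
    return f"/api/v1/dlq/{record_id}/replay"


def route_failed_record(
    tenant_id: str,
    key: str,
    value: Any,
    error: Exception,
    retry_count: int,
    dlq_enabled: bool,
) -> Optional[Tuple[str, bytes]]:
    """Return (topic, payload) for a failed record, or None when the DLQ is off.

    What happens to failed records when dlqEnabled is false is not specified
    on this page, so this sketch simply returns None in that case.
    """
    if not dlq_enabled:
        return None
    record = {
        # Field names are hypothetical; the page lists only their meaning.
        "originalKey": key,
        "originalValue": value,
        "errorMessage": str(error),
        "retryCount": retry_count,
    }
    return dlq_topic(tenant_id), json.dumps(record).encode("utf-8")
```

The returned pair is what a producer would publish; replaying a record later is a POST to the path built by replay_path.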

Stuck Sync Detection

  • The polling service checks RUNNING syncs every 30 seconds
  • Syncs without an Airbyte job ID are marked FAILED after 30 minutes
  • Each job runs in its own database transaction (10-second timeout)
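
The stuck-sync rule above can be expressed as a filter over in-flight syncs. In this sketch the record fields (id, status, airbyteJobId, startedAt) are hypothetical, and measuring the 30 minutes from the sync's start time is an assumption; the 30-second polling loop and the per-job transactions are left out.

```python
from datetime import datetime, timedelta, timezone

# Threshold from the rule above; the poller itself would run every 30 seconds.
STUCK_AFTER = timedelta(minutes=30)


def find_stuck_syncs(syncs: list, now: datetime) -> list:
    """RUNNING syncs that still have no Airbyte job ID after STUCK_AFTER."""
    return [
        sync
        for sync in syncs
        if sync["status"] == "RUNNING"
        and sync.get("airbyteJobId") is None
        and now - sync["startedAt"] >= STUCK_AFTER
    ]


def mark_stuck_failed(syncs: list, now: datetime) -> list:
    """Mark stuck syncs FAILED in place and return their IDs."""
    stuck = find_stuck_syncs(syncs, now)
    for sync in stuck:
        sync["status"] = "FAILED"
    return [sync["id"] for sync in stuck]


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
syncs = [
    {"id": "a", "status": "RUNNING", "airbyteJobId": None, "startedAt": now - timedelta(minutes=45)},
    {"id": "b", "status": "RUNNING", "airbyteJobId": None, "startedAt": now - timedelta(minutes=5)},
    {"id": "c", "status": "RUNNING", "airbyteJobId": 991, "startedAt": now - timedelta(minutes=45)},
]
failed_ids = mark_stuck_failed(syncs, now)  # only sync "a" meets all three conditions
```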

RBAC

| Operation | Permission |
|---|---|
| View tuning config | connections:read |
| Update tuning config | ingestion:admin |
| Delete tuning config | ingestion:admin |

Only the DATA_ENGINEER and PLATFORM_ADMIN roles have the ingestion:admin permission.
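
The permission rules above amount to a lookup from operation to required permission. The following sketch assumes that both admin-capable roles also hold connections:read and adds a hypothetical ANALYST role to show a read-only case; neither detail is stated on this page.

```python
# Operation -> required permission, from the RBAC table.
REQUIRED_PERMISSION = {
    "view": "connections:read",
    "update": "ingestion:admin",
    "delete": "ingestion:admin",
}

# Role -> granted permissions. Only the ingestion:admin grants are documented;
# the connections:read grants and the ANALYST role are assumptions.
ROLE_PERMISSIONS = {
    "DATA_ENGINEER": {"connections:read", "ingestion:admin"},
    "PLATFORM_ADMIN": {"connections:read", "ingestion:admin"},
    "ANALYST": {"connections:read"},
}


def is_allowed(role: str, operation: str) -> bool:
    """True if `role` holds the permission that `operation` requires."""
    required = REQUIRED_PERMISSION.get(operation)
    if required is None:
        return False  # unknown operations are denied
    return required in ROLE_PERMISSIONS.get(role, set())
```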