MATIH Platform is in active MVP development. Documentation reflects current implementation status.

CDC & Replication

Change Data Capture (CDC) replicates changes from source databases to the MATIH data platform in real time, using Airbyte's logical replication support.

Supported CDC Sources

| Database   | Replication Method                      | Requirements                                        |
|------------|-----------------------------------------|-----------------------------------------------------|
| PostgreSQL | Logical replication (pgoutput/wal2json) | wal_level = logical, publication + replication slot |
| MySQL      | Binlog replication (GTID)               | binlog_format = ROW, replication user               |
| SQL Server | CT (Change Tracking)                    | Enable CT on database + tables                      |
| MongoDB    | Change Streams                          | Replica set or sharded cluster                      |
| Oracle     | LogMiner                                | Supplemental logging enabled                        |
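
For PostgreSQL, the requirements above translate into a few SQL statements run on the source database. The sketch below only generates those statements; it does not connect to anything. The function name is hypothetical, the default slot and publication names are borrowed from the Step 2 example, and publishing all tables is an assumption (a publication can also be limited to specific tables).

```python
def postgres_cdc_prerequisites(
    slot: str = "matih_slot",
    publication: str = "matih_publication",
    plugin: str = "pgoutput",
) -> list[str]:
    """Return SQL statements that prepare PostgreSQL for logical replication."""
    for name in (slot, publication, plugin):
        # The names are interpolated into SQL, so accept only simple identifiers.
        if not name.replace("_", "").isalnum():
            raise ValueError(f"unsafe identifier: {name!r}")
    return [
        # Changing wal_level takes effect only after a server restart.
        "ALTER SYSTEM SET wal_level = logical;",
        # Assumption: publish every table; narrow this in real deployments.
        f"CREATE PUBLICATION {publication} FOR ALL TABLES;",
        f"SELECT pg_create_logical_replication_slot('{slot}', '{plugin}');",
    ]


if __name__ == "__main__":
    for statement in postgres_cdc_prerequisites():
        print(statement)
```

The replication user also needs the REPLICATION attribute and read access to the published tables; how that user is provisioned is outside this sketch.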

Configuring CDC

Step 1: Create Source with CDC Support

POST /api/v1/sources
{
  "name": "prod-postgres",
  "connectorType": "postgres",
  "connectionConfig": {
    "host": "db.example.com",
    "port": 5432,
    "database": "mydb",
    "username": "replication_user",
    "password": "..."
  }
}
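
A minimal sketch of issuing this request from Python using only the standard library. The base URL and the bearer-token header are assumptions, since this page does not specify the host or the authentication scheme; the function name is hypothetical. The function builds the request without sending it, so the payload can be inspected first.

```python
import json
import urllib.request

BASE_URL = "https://matih.example.com"  # hypothetical host, not from the docs


def build_create_source_request(password: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the POST /api/v1/sources request from Step 1."""
    body = {
        "name": "prod-postgres",
        "connectorType": "postgres",
        "connectionConfig": {
            "host": "db.example.com",
            "port": 5432,
            "database": "mydb",
            "username": "replication_user",
            "password": password,  # supply from a secret store, never hard-code
        },
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/sources",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
    )
```

To send it, pass the result to urllib.request.urlopen(...) and read the response body; the shape of that response is not documented here.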

Step 2: Enable CDC on the Connection

POST /api/v1/connections/{connectionId}/cdc
{
  "replicationSlot": "matih_slot",
  "publicationName": "matih_publication",
  "snapshotMode": "INITIAL",
  "heartbeatEnabled": true,
  "heartbeatInterval": 300000
}
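
Before posting the CDC configuration, it can help to validate it client-side. The sketch below mirrors the field names from the request body above; the class and helper names are hypothetical, and the validation rules are assumptions rather than documented server behavior. The unit of heartbeatInterval is not stated on this page, so the default is simply copied from the example.

```python
from dataclasses import dataclass, asdict

# Modes listed in the Snapshot Modes section of this page.
SNAPSHOT_MODES = {"INITIAL", "INITIAL_ONLY", "WHEN_NEEDED", "NEVER"}


@dataclass(frozen=True)
class CdcConfig:
    replicationSlot: str
    publicationName: str
    snapshotMode: str = "INITIAL"
    heartbeatEnabled: bool = True
    heartbeatInterval: int = 300000  # copied from the example; unit not stated

    def __post_init__(self) -> None:
        if not self.replicationSlot or not self.publicationName:
            raise ValueError("replicationSlot and publicationName are required")
        if self.snapshotMode not in SNAPSHOT_MODES:
            raise ValueError(f"unknown snapshotMode: {self.snapshotMode!r}")
        # Assumed rule: an enabled heartbeat needs a positive interval.
        if self.heartbeatEnabled and self.heartbeatInterval <= 0:
            raise ValueError("heartbeatInterval must be positive")


def cdc_endpoint(connection_id: str) -> str:
    """Path used by both the POST (enable) and GET (status) CDC calls."""
    return f"/api/v1/connections/{connection_id}/cdc"
```

Calling asdict(CdcConfig("matih_slot", "matih_publication")) yields a dictionary with the same keys and values as the example body, ready for json.dumps.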

Step 3: Verify CDC Status

GET /api/v1/connections/{connectionId}/cdc

Snapshot Modes

| Mode         | Behavior                                                            |
|--------------|---------------------------------------------------------------------|
| INITIAL      | Full snapshot on first sync, then CDC for subsequent changes        |
| INITIAL_ONLY | Full snapshot only, no ongoing CDC                                  |
| WHEN_NEEDED  | Snapshot only when no valid offset exists                           |
| NEVER        | Skip snapshot, start from current position (may miss historical data) |
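
The table can be read as a small decision function: given the mode and the state of the connection, should a sync take a snapshot, and should it stream changes afterwards? The sketch below is an interpretation of the table, not the platform's implementation. The names SyncPlan and plan_sync are hypothetical, and treating INITIAL_ONLY as snapshotting only on the first sync is an assumption.

```python
from typing import NamedTuple


class SyncPlan(NamedTuple):
    take_snapshot: bool
    stream_changes: bool


def plan_sync(mode: str, first_sync: bool, has_valid_offset: bool) -> SyncPlan:
    """Map a snapshot mode to the actions a sync would perform."""
    if mode == "INITIAL":
        # Full snapshot on the first sync, CDC from then on.
        return SyncPlan(take_snapshot=first_sync, stream_changes=True)
    if mode == "INITIAL_ONLY":
        # Assumption: the one-off snapshot happens on the first sync only.
        return SyncPlan(take_snapshot=first_sync, stream_changes=False)
    if mode == "WHEN_NEEDED":
        # Snapshot only when there is no valid offset to resume from.
        return SyncPlan(take_snapshot=not has_valid_offset, stream_changes=True)
    if mode == "NEVER":
        # Start from the current log position; earlier rows are not copied.
        return SyncPlan(take_snapshot=False, stream_changes=True)
    raise ValueError(f"unknown snapshot mode: {mode!r}")
```

For example, plan_sync("WHEN_NEEDED", first_sync=False, has_valid_offset=False) asks for a new snapshot, which is the case where a stored offset has been lost or invalidated.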

Sync Lifecycle with CDC

The connection FSM supports CDC-specific transitions:

CREATED → TESTING → TESTED → SCHEMA_DISCOVERED → CONFIGURED → SCHEDULED
                                                                ↓
                                                              SYNCING → COMPLETED / FAILED
                                                                ↑          ↓
                                                              RETRYING ←──┘
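
One way to read the diagram is as a transition table. The sketch below encodes that reading; it is an interpretation, not the platform's source. In particular, treating SCHEDULED → SYNCING as the entry into the sync loop, FAILED as the state that leads to RETRYING, and COMPLETED as terminal are assumptions, and the function name is hypothetical.

```python
# Allowed transitions, as read from the lifecycle diagram.
TRANSITIONS: dict[str, set[str]] = {
    "CREATED": {"TESTING"},
    "TESTING": {"TESTED"},
    "TESTED": {"SCHEMA_DISCOVERED"},
    "SCHEMA_DISCOVERED": {"CONFIGURED"},
    "CONFIGURED": {"SCHEDULED"},
    "SCHEDULED": {"SYNCING"},            # assumed entry into the sync loop
    "SYNCING": {"COMPLETED", "FAILED"},
    "FAILED": {"RETRYING"},              # assumed: only failures are retried
    "RETRYING": {"SYNCING"},
    "COMPLETED": set(),                  # assumed terminal in this sketch
}


def advance(state: str, target: str) -> str:
    """Return target if state -> target is allowed, else raise ValueError."""
    if state not in TRANSITIONS:
        raise ValueError(f"unknown state: {state!r}")
    if target not in TRANSITIONS[state]:
        raise ValueError(f"illegal transition: {state} -> {target}")
    return target


if __name__ == "__main__":
    state = "CREATED"
    for nxt in ["TESTING", "TESTED", "SCHEMA_DISCOVERED", "CONFIGURED",
                "SCHEDULED", "SYNCING", "FAILED", "RETRYING", "SYNCING",
                "COMPLETED"]:
        state = advance(state, nxt)
    print(state)
```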

When CDC is configured, the sync mode is automatically set to CDC and Airbyte uses logical replication instead of full table scans.

RBAC

| Permission        | Roles                                      |
|-------------------|--------------------------------------------|
| connections:write | DATA_ENGINEER, PLATFORM_ADMIN              |
| connections:read  | DATA_ENGINEER, DATA_ANALYST, DATA_SCIENTIST |

DATA_ANALYST and DATA_SCIENTIST can view CDC configuration but cannot modify it.
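
The permission table can be expressed as a simple lookup. The sketch below mirrors the table literally; the function name is hypothetical, and how the platform actually evaluates roles (for example, whether PLATFORM_ADMIN implicitly inherits read access) is not stated on this page.

```python
# Role sets copied from the RBAC table.
PERMISSIONS: dict[str, set[str]] = {
    "connections:write": {"DATA_ENGINEER", "PLATFORM_ADMIN"},
    "connections:read": {"DATA_ENGINEER", "DATA_ANALYST", "DATA_SCIENTIST"},
}


def is_allowed(roles: set[str], permission: str) -> bool:
    """True if any of the caller's roles grants the given permission."""
    granted_to = PERMISSIONS.get(permission, set())
    return not granted_to.isdisjoint(roles)


if __name__ == "__main__":
    # An analyst can view the CDC configuration but cannot change it.
    print(is_allowed({"DATA_ANALYST"}, "connections:read"))
    print(is_allowed({"DATA_ANALYST"}, "connections:write"))
```

Under this reading, viewing the CDC status would require connections:read and enabling or changing CDC would require connections:write.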