MATIH Platform is in active MVP development. Documentation reflects current implementation status.

Connector Catalog

The Matih platform provides access to 600+ data source connectors through its integration with Airbyte. Connectors handle the complexities of authentication, pagination, rate limiting, schema discovery, incremental extraction, and error recovery for each source type.


Connector Categories

| Category | Count | Description |
|---|---|---|
| Databases | 30+ | Relational databases, NoSQL databases, data warehouses |
| SaaS | 200+ | CRM, marketing, finance, support, project management, analytics |
| Cloud Storage | 10+ | Object storage, SFTP, file-based sources |
| APIs | 100+ | REST, GraphQL, and webhook-based integrations |
| Event Streams | 10+ | Kafka, Kinesis, Pub/Sub, EventHub |

Connector Architecture

Every connector follows the same lifecycle within the Matih ingestion subsystem.

+---------------------+     +------------------+     +------------------+
| Connector Config    |---->| Airbyte Source   |---->| Schema Discovery |
| (credentials,       |     | (connector       |     | (streams,        |
|  connection params) |     |  container)      |     |  columns, types) |
+---------------------+     +------------------+     +--------+---------+
                                                              |
                                                     +--------v---------+
                                                     | Stream Selection |
                                                     | (user picks      |
                                                     |  tables to sync) |
                                                     +--------+---------+
                                                              |
                                                     +--------v---------+
                                                     | Sync Execution   |
                                                     | (extract, load   |
                                                     |  to Iceberg)     |
                                                     +------------------+
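
The lifecycle in the diagram can be read as a fixed sequence of stages. The following is a minimal Python sketch of that ordering. The `Stage` enum and `next_stage` helper are hypothetical names chosen for illustration, not identifiers from the Matih codebase.

```python
from enum import Enum
from typing import Optional


class Stage(Enum):
    """Lifecycle stages from the diagram; names are illustrative only."""
    CONFIG = "connector config"
    SOURCE = "airbyte source"
    DISCOVERY = "schema discovery"
    SELECTION = "stream selection"
    SYNC = "sync execution"


# Enum members iterate in definition order, which matches the diagram.
ORDER = list(Stage)


def next_stage(current: Stage) -> Optional[Stage]:
    """Return the stage that follows `current`, or None after the last one."""
    idx = ORDER.index(current)
    return ORDER[idx + 1] if idx + 1 < len(ORDER) else None
```

For example, `next_stage(Stage.DISCOVERY)` yields `Stage.SELECTION`, reflecting that streams can only be selected after discovery has listed them.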

Connector Capabilities

Each connector declares its supported capabilities during schema discovery.

| Capability | Description |
|---|---|
| Full Refresh | Re-extracts all data on every sync. Supported by all connectors. |
| Incremental - Append | Extracts only new rows since the last sync using a cursor column. |
| Incremental - Deduped | Extracts new and updated rows, deduplicating by primary key. |
| CDC | Captures inserts, updates, and deletes from the database transaction log. Only available for databases with log-based replication (PostgreSQL WAL, MySQL binlog, MongoDB oplog). |
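
To make the difference between the two incremental modes concrete, here is a minimal Python sketch over in-memory rows. The function names, the `updated_at` cursor column, and the `id` primary key are assumptions for illustration. They are not part of the Matih or Airbyte APIs, and real connectors track cursor state per stream.

```python
from typing import Any

Row = dict[str, Any]


def incremental_append(source: list[Row], cursor_field: str, last_cursor: Any) -> list[Row]:
    """Return only rows whose cursor value is greater than the saved cursor."""
    return [r for r in source if last_cursor is None or r[cursor_field] > last_cursor]


def incremental_deduped(target: list[Row], new_rows: list[Row],
                        primary_key: str, cursor_field: str) -> list[Row]:
    """Merge new rows into target, keeping the latest version per primary key."""
    merged: dict[Any, Row] = {r[primary_key]: r for r in target}
    # Apply rows in cursor order so the newest version wins.
    for row in sorted(new_rows, key=lambda r: r[cursor_field]):
        merged[row[primary_key]] = row
    return list(merged.values())


source = [
    {"id": 1, "status": "new", "updated_at": 10},
    {"id": 2, "status": "new", "updated_at": 20},
    {"id": 1, "status": "paid", "updated_at": 30},
]

# The first sync has no saved cursor, so every row is extracted.
first = incremental_append(source, "updated_at", None)
# A later sync with cursor 20 extracts only the row updated after it.
later = incremental_append(source, "updated_at", 20)
# Deduped mode collapses the two versions of id=1 into the latest one.
deduped = incremental_deduped([], first, "id", "updated_at")
```

In append mode the destination would hold both versions of order 1, while in deduped mode it holds only the `paid` version.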

Connector Configuration Model

All connectors are configured by submitting a CreateSourceRequest. The following request configures a PostgreSQL source.

{
  "name": "my-postgres-source",
  "description": "Production orders database",
  "connectorType": "postgres",
  "connectionConfig": {
    "host": "orders-db.example.com",
    "port": 5432,
    "database": "orders",
    "username": "readonly_user",
    "password": "********",
    "ssl_mode": "require",
    "schemas": ["public"]
  }
}

The connectorType field identifies which Airbyte connector to use. The connectionConfig map contains connector-specific parameters.
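
The request body can also be assembled programmatically so that the password is not hard-coded. This is a minimal Python sketch: the helper name `build_create_source_request` and the `ORDERS_DB_PASSWORD` environment variable are hypothetical, while the field names are taken from the example above.

```python
import json
import os


def build_create_source_request(name, description, connector_type, connection_config):
    """Assemble a request body with the four top-level fields from the example."""
    return {
        "name": name,
        "description": description,
        "connectorType": connector_type,
        "connectionConfig": dict(connection_config),  # copy to avoid aliasing
    }


payload = build_create_source_request(
    name="my-postgres-source",
    description="Production orders database",
    connector_type="postgres",
    connection_config={
        "host": "orders-db.example.com",
        "port": 5432,
        "database": "orders",
        "username": "readonly_user",
        # ORDERS_DB_PASSWORD is a hypothetical variable name.
        "password": os.environ.get("ORDERS_DB_PASSWORD", ""),
        "ssl_mode": "require",
        "schemas": ["public"],
    },
)

# Serialize to the JSON document that would be sent as the request body.
body = json.dumps(payload, indent=2)
```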


Listing Available Connectors

The Ingestion Service provides an API endpoint to list all available connector types.

GET /api/v1/sources/connector-types

Response:

[
  "postgres",
  "mysql",
  "mongodb",
  "mssql",
  "oracle",
  "salesforce",
  "hubspot",
  "stripe",
  "s3",
  "gcs",
  "azure-blob-storage",
  "sftp",
  "github",
  "jira",
  "zendesk",
  "google-analytics",
  "snowflake",
  "bigquery",
  "redshift",
  ...
]
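
A client might call this endpoint as in the following Python sketch, which uses only the standard library. The base URL is a placeholder assumption, and authentication headers are omitted because this page does not describe them.

```python
import json
import urllib.request

# Hypothetical base URL; substitute the address of your Ingestion Service.
BASE_URL = "http://localhost:8080"


def build_request(base_url: str) -> urllib.request.Request:
    """Build the GET request for the connector-types endpoint."""
    url = base_url.rstrip("/") + "/api/v1/sources/connector-types"
    return urllib.request.Request(url, method="GET", headers={"Accept": "application/json"})


def parse_connector_types(body: bytes) -> list[str]:
    """Decode the response, which the example shows as a JSON array of strings."""
    data = json.loads(body)
    if not isinstance(data, list) or not all(isinstance(x, str) for x in data):
        raise ValueError("expected a JSON array of connector type names")
    return data


def list_connector_types(base_url: str = BASE_URL) -> list[str]:
    """Fetch and parse the connector types (performs a network call)."""
    with urllib.request.urlopen(build_request(base_url), timeout=10) as resp:
        return parse_connector_types(resp.read())
```

Keeping request construction and response parsing in separate functions lets both be checked without a running service.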

Connector Support Tiers

| Tier | Definition | Examples |
|---|---|---|
| Generally Available | Fully tested, production-ready, supported by the platform team | PostgreSQL, MySQL, MongoDB, Salesforce, S3 |
| Beta | Functional but may have edge cases. Community-maintained. | Notion, Airtable, Asana |
| Alpha | Experimental. Use with caution. | Custom REST API connectors |

Common Configuration Patterns

Database Connectors

All database connectors share a common configuration structure:

| Field | Type | Description |
|---|---|---|
| host | string | Database server hostname or IP address |
| port | integer | Database server port |
| database | string | Database name |
| username | string | Authentication username |
| password | string | Authentication password |
| ssl_mode | string | SSL configuration (disable, require, verify-ca, verify-full) |
| schemas | string[] | Schemas to include in discovery (optional, defaults to all) |
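
A client could check a database connectionConfig against this table before submitting it. The sketch below is one way to do that in Python. It assumes that the first five fields are required and that ssl_mode may be omitted, since the table marks only schemas as optional. The function name is hypothetical.

```python
# Assumed required fields and their expected Python types.
REQUIRED_FIELDS = {"host": str, "port": int, "database": str, "username": str, "password": str}
# Allowed values as listed in the table above.
SSL_MODES = {"disable", "require", "verify-ca", "verify-full"}


def validate_database_config(config: dict) -> list[str]:
    """Return a list of problems; an empty list means the config looks valid."""
    errors = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in config:
            errors.append(f"missing field: {field}")
        elif not isinstance(config[field], expected):
            errors.append(f"{field} must be {expected.__name__}")
    if "ssl_mode" in config and config["ssl_mode"] not in SSL_MODES:
        errors.append(f"unknown ssl_mode: {config['ssl_mode']}")
    schemas = config.get("schemas")  # optional; omitted means all schemas
    if schemas is not None and not (
        isinstance(schemas, list) and all(isinstance(s, str) for s in schemas)
    ):
        errors.append("schemas must be a list of strings")
    return errors
```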

SaaS Connectors

SaaS connectors typically use OAuth2 or API key authentication:

| Field | Type | Description |
|---|---|---|
| api_key | string | API key for key-based authentication |
| client_id | string | OAuth2 client ID |
| client_secret | string | OAuth2 client secret |
| refresh_token | string | OAuth2 refresh token |
| start_date | string | ISO 8601 date for initial data extraction window |
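
Because a SaaS source uses either an API key or the three OAuth2 fields, a client may want to detect which mode a config represents and check the start date. The Python sketch below assumes that an API key takes precedence when both kinds of credentials are present. That rule and the function names are illustrative, not documented platform behavior.

```python
from datetime import date

OAUTH_FIELDS = ("client_id", "client_secret", "refresh_token")


def detect_auth_mode(config: dict) -> str:
    """Return "api_key" or "oauth2" depending on which credentials are present."""
    if config.get("api_key"):
        return "api_key"
    if all(config.get(f) for f in OAUTH_FIELDS):
        return "oauth2"
    raise ValueError("provide either api_key or all of: " + ", ".join(OAUTH_FIELDS))


def parse_start_date(value: str) -> date:
    """Parse an ISO 8601 calendar date such as 2024-01-01."""
    # date.fromisoformat raises ValueError on malformed input.
    return date.fromisoformat(value)
```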

Cloud Storage Connectors

Cloud storage connectors read files from object stores:

| Field | Type | Description |
|---|---|---|
| bucket | string | Bucket or container name |
| path_prefix | string | Path prefix to filter objects |
| file_format | string | Expected file format (csv, parquet, json, avro) |
| access_key_id | string | Cloud provider access key |
| secret_access_key | string | Cloud provider secret key |
| region | string | Cloud provider region |
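
The effect of path_prefix and file_format can be pictured as a filter over object keys. The Python sketch below assumes that formats are matched by file extension, which may differ from how a given connector identifies files. The function name is hypothetical.

```python
# Formats listed in the table above.
SUPPORTED_FORMATS = {"csv", "parquet", "json", "avro"}


def select_objects(keys: list[str], path_prefix: str, file_format: str) -> list[str]:
    """Keep keys under path_prefix whose extension matches file_format."""
    if file_format not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported file_format: {file_format}")
    suffix = "." + file_format
    # Compare extensions case-insensitively; prefixes are matched exactly.
    return [k for k in keys if k.startswith(path_prefix) and k.lower().endswith(suffix)]
```

With keys `raw/orders/a.csv`, `raw/orders/b.parquet`, and `tmp/c.csv`, a prefix of `raw/` and a format of `csv` selects only the first key.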

Next Steps