Datasource Mapping
Datasource mapping connects logical object types to their physical data sources. The mapping engine analyzes source schemas, suggests property mappings, and maintains the link between the semantic ontology and underlying tables, APIs, or files.
Source: data-plane/ontology-service/src/schema_mapping/
Mapping Architecture

    Physical Schema (table/columns) ──> Schema Analyzer ──> Mapping Suggestions
                                                                    |
    Logical Object Type (properties) <── Schema Mapper <─── Approved Mappings

Datasource Types
| Type | Description | Connection |
|---|---|---|
| jdbc | Relational database tables | PostgreSQL, MySQL, SQL Server |
| iceberg | Iceberg tables via Polaris catalog | Spark, Trino |
| api | REST API endpoints | HTTP connections |
| file | File-based sources | S3, GCS, Azure Blob |
| kafka | Kafka topics | Strimzi Kafka |
Mapping Model
| Field | Type | Description |
|---|---|---|
| id | UUID | Mapping identifier |
| object_type_id | UUID | Target object type |
| datasource_type | DatasourceType | Source type (jdbc, iceberg, api, file, kafka) |
| connection | string | Connection reference |
| source_table | string | Physical table or endpoint |
| column_mappings | list | Column-to-property mappings |
| sync_mode | SyncMode | FULL, INCREMENTAL, CDC |
| update_frequency | UpdateFrequency | REALTIME, HOURLY, DAILY, MANUAL |
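
The fields above can be sketched as Python dataclasses. This is a minimal illustration under stated assumptions, not the service's actual code: the class names, the `KAFKA` enum member (taken from the Datasource Types table), and the `apply_column_mappings` helper are all hypothetical, and the helper only renames columns (it ignores `transformation` and type conversion).

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, List, Optional
from uuid import UUID, uuid4


class DatasourceType(Enum):
    JDBC = "jdbc"
    ICEBERG = "iceberg"
    API = "api"
    FILE = "file"
    KAFKA = "kafka"  # assumed: listed under Datasource Types


class SyncMode(Enum):
    FULL = "FULL"
    INCREMENTAL = "INCREMENTAL"
    CDC = "CDC"


class UpdateFrequency(Enum):
    REALTIME = "REALTIME"
    HOURLY = "HOURLY"
    DAILY = "DAILY"
    MANUAL = "MANUAL"


@dataclass
class ColumnMapping:
    source_column: str
    target_property: str
    transformation: Optional[str] = None
    data_type_conversion: Optional[str] = None


@dataclass
class DatasourceMapping:
    object_type_id: UUID
    datasource_type: DatasourceType
    connection: str
    source_table: str
    sync_mode: SyncMode
    update_frequency: UpdateFrequency
    column_mappings: List[ColumnMapping] = field(default_factory=list)
    id: UUID = field(default_factory=uuid4)


def apply_column_mappings(mapping: DatasourceMapping, row: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: rename mapped source columns to target properties.

    Columns without a mapping are dropped; transformations are not applied.
    """
    return {
        cm.target_property: row[cm.source_column]
        for cm in mapping.column_mappings
        if cm.source_column in row
    }


mapping = DatasourceMapping(
    object_type_id=uuid4(),
    datasource_type=DatasourceType.JDBC,
    connection="crm-postgres",
    source_table="customers",
    sync_mode=SyncMode.INCREMENTAL,
    update_frequency=UpdateFrequency.HOURLY,
    column_mappings=[ColumnMapping("cust_email", "email", None, "VARCHAR -> string")],
)
row = {"cust_email": "a@example.com", "internal_flag": True}
mapped = apply_column_mappings(mapping, row)
```

Here `mapped` holds only the `email` property, because `internal_flag` has no column mapping.
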
Column Mapping

    {
      "sourceColumn": "cust_email",
      "targetProperty": "email",
      "transformation": null,
      "dataTypeConversion": "VARCHAR -> string"
    }

Schema Analysis
The analyzer inspects source schemas and suggests mappings to existing object types:

    POST /v1/ontology/mappings/analyze

Request:

    {
      "datasourceType": "jdbc",
      "connection": "crm-postgres",
      "table": "customers",
      "targetObjectType": "Customer"
    }

Response:

    {
      "suggestions": [
        {
          "sourceColumn": "cust_id",
          "targetProperty": "id",
          "confidence": 0.95,
          "matchType": "name_similarity"
        },
        {
          "sourceColumn": "cust_email",
          "targetProperty": "email",
          "confidence": 0.90,
          "matchType": "name_similarity"
        }
      ],
      "unmappedColumns": ["internal_flag", "legacy_code"],
      "unmappedProperties": ["phone_number"]
    }

Sync Modes
| Mode | Description | Implementation |
|---|---|---|
| FULL | Complete data refresh on each sync | Truncate and reload |
| INCREMENTAL | Sync only changed records | Watermark-based extraction |
| CDC | Real-time change capture | Flink CDC connector |
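
To make the INCREMENTAL row concrete, here is a minimal sketch of watermark-based extraction over in-memory rows. The function name `extract_incremental`, the `updated_at` watermark column, and the sample rows are assumptions for illustration; the page does not say which column or storage the service uses for its watermark.

```python
from datetime import datetime
from typing import Any, Dict, List, Optional, Tuple

Row = Dict[str, Any]


def extract_incremental(
    rows: List[Row],
    watermark: Optional[datetime],
    watermark_column: str = "updated_at",  # assumed column name
) -> Tuple[List[Row], Optional[datetime]]:
    """Return the rows changed since `watermark` and the advanced watermark.

    A watermark of None means no sync has run yet, so every row is returned.
    """
    if watermark is None:
        changed = list(rows)
    else:
        changed = [r for r in rows if r[watermark_column] > watermark]
    # Advance to the newest timestamp seen; keep the old value if nothing changed.
    new_watermark = max((r[watermark_column] for r in changed), default=watermark)
    return changed, new_watermark


source = [
    {"cust_id": 1, "updated_at": datetime(2024, 1, 1)},
    {"cust_id": 2, "updated_at": datetime(2024, 1, 3)},
]
first_batch, wm = extract_incremental(source, None)

source.append({"cust_id": 3, "updated_at": datetime(2024, 1, 5)})
second_batch, wm = extract_incremental(source, wm)
print([r["cust_id"] for r in second_batch])  # prints [3]
```

The strict `>` comparison avoids re-reading the row that set the watermark, but it can miss rows that share that exact timestamp and arrive later. A real extractor would need a tie-breaking rule or an overlap window.
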
API Endpoints
| Method | Endpoint | Description |
|---|---|---|
| POST | /v1/ontology/mappings | Create a datasource mapping |
| GET | /v1/ontology/mappings | List mappings for an object type |
| PUT | /v1/ontology/mappings/:id | Update mapping configuration |
| DELETE | /v1/ontology/mappings/:id | Remove a mapping |
| POST | /v1/ontology/mappings/analyze | Analyze source and suggest mappings |
| POST | /v1/ontology/mappings/:id/sync | Trigger manual data sync |
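
One way the analyze and create endpoints might be chained is sketched below: filter the analyzer's suggestions by confidence, then assemble a create request. Everything here is an assumption apart from the endpoint path and the suggestion fields shown under Schema Analysis. The helper names, the `0.92` threshold, the base URL, and the create-request body fields (`objectTypeId`, `sourceTable`, `columnMappings`) are hypothetical, and the sketch only builds the request rather than sending it.

```python
import json
from typing import Any, Dict, List, Tuple


def approve_suggestions(analysis: Dict[str, Any], min_confidence: float) -> List[Dict[str, Any]]:
    """Keep suggestions at or above `min_confidence` as column mappings."""
    return [
        {
            "sourceColumn": s["sourceColumn"],
            "targetProperty": s["targetProperty"],
            "transformation": None,
        }
        for s in analysis["suggestions"]
        if s["confidence"] >= min_confidence
    ]


def build_create_request(
    base_url: str,
    object_type_id: str,
    analyze_request: Dict[str, Any],
    column_mappings: List[Dict[str, Any]],
) -> Tuple[str, str, str]:
    """Assemble (method, url, json_body) for POST /v1/ontology/mappings.

    The body field names are assumed; the page does not document them.
    """
    body = {
        "objectTypeId": object_type_id,
        "datasourceType": analyze_request["datasourceType"],
        "connection": analyze_request["connection"],
        "sourceTable": analyze_request["table"],
        "columnMappings": column_mappings,
    }
    return "POST", f"{base_url}/v1/ontology/mappings", json.dumps(body)


analyze_request = {
    "datasourceType": "jdbc",
    "connection": "crm-postgres",
    "table": "customers",
    "targetObjectType": "Customer",
}
analysis = {
    "suggestions": [
        {"sourceColumn": "cust_id", "targetProperty": "id",
         "confidence": 0.95, "matchType": "name_similarity"},
        {"sourceColumn": "cust_email", "targetProperty": "email",
         "confidence": 0.90, "matchType": "name_similarity"},
    ],
    "unmappedColumns": ["internal_flag", "legacy_code"],
    "unmappedProperties": ["phone_number"],
}

approved = approve_suggestions(analysis, min_confidence=0.92)
method, url, body = build_create_request(
    "https://ontology.example.internal",  # hypothetical host
    "00000000-0000-0000-0000-000000000001",  # placeholder object type ID
    analyze_request,
    approved,
)
```

With the sample response, only the `cust_id` suggestion clears the threshold. The `cust_email` suggestion would be left for manual review along with the unmapped columns.
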
Related Pages
- Object Types -- Object type definitions
- Schema Validation -- Validate mappings
- Pipeline Service -- Pipeline execution