Upstream Lineage
Upstream lineage traces the origins of a data entity -- revealing which tables, views, and pipelines contribute data to a given target. This is essential for understanding data provenance and debugging data quality issues.
Get Upstream Lineage
Retrieve all direct upstream edges for an entity via the LineageController:
GET /v1/lineage/entity/{entityId}/upstream?tenantId={tenantId}
curl "http://localhost:8086/v1/lineage/entity/550e8400-e29b-41d4-a716-446655440001/upstream?tenantId=550e8400-e29b-41d4-a716-446655440000" \
  -H "Authorization: Bearer $TOKEN"
Response
[
{
"id": "edge-001",
"tenantId": "550e8400-...",
"sourceEntityId": "tbl-raw-orders",
"sourceEntityFqn": "warehouse.raw.orders",
"sourceEntityType": "table",
"targetEntityId": "tbl-dim-orders",
"targetEntityFqn": "warehouse.analytics.dim_orders",
"targetEntityType": "table",
"lineageType": "DIRECT",
"lineageSource": "PIPELINE",
"pipelineId": "pipeline-etl-001",
"confidence": 1.0,
"description": "ETL pipeline: raw orders to dimension table",
"createdBy": "airflow-integration"
},
{
"id": "edge-002",
"sourceEntityId": "tbl-raw-customers",
"sourceEntityFqn": "warehouse.raw.customers",
"sourceEntityType": "table",
"targetEntityId": "tbl-dim-orders",
"targetEntityFqn": "warehouse.analytics.dim_orders",
"targetEntityType": "table",
"lineageType": "DIRECT",
"lineageSource": "QUERY_ANALYSIS",
"sqlQuery": "INSERT INTO dim_orders SELECT o.*, c.name FROM raw.orders o JOIN raw.customers c ON o.customer_id = c.id",
"confidence": 0.95
}
]
Upstream Visualization Graph
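To retrieve the multi-hop upstream graph (nodes and edges, up to maxDepth levels) rather than a flat list of direct edges, use the visualization controller. In the example, the tenant is passed in the X-Tenant-ID header rather than as a tenantId query parameter: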
GET /api/v1/lineage/visualization/graph/{entityId}/upstream?maxDepth=5
curl "http://localhost:8086/api/v1/lineage/visualization/graph/550e8400-e29b-41d4-a716-446655440001/upstream?maxDepth=5" \
  -H "X-Tenant-ID: 550e8400-e29b-41d4-a716-446655440000" \
  -H "Authorization: Bearer $TOKEN"
Response
{
"graph": {
"nodes": [
{
"id": "tbl-raw-orders",
"name": "raw.orders",
"type": "TABLE",
"depth": 1
},
{
"id": "tbl-raw-customers",
"name": "raw.customers",
"type": "TABLE",
"depth": 1
},
{
"id": "tbl-source-crm",
"name": "crm.contacts",
"type": "TABLE",
"depth": 2
}
],
"edges": [
{
"sourceId": "tbl-raw-orders",
"targetId": "tbl-dim-orders",
"lineageType": "DIRECT"
},
{
"sourceId": "tbl-raw-customers",
"targetId": "tbl-dim-orders",
"lineageType": "DIRECT"
},
{
"sourceId": "tbl-source-crm",
"targetId": "tbl-raw-customers",
"lineageType": "DIRECT"
}
]
},
"metadata": {
"entityId": "tbl-dim-orders",
"direction": "UPSTREAM",
"maxDepth": 5,
"actualDepth": 2,
"totalNodes": 3,
"totalEdges": 3
}
}
Upstream Column Lineage
Trace the upstream origin of a specific column:
GET /v1/catalog/lineage/column/upstream?tableFqn={fqn}&columnName={col}&depth=5
curl "http://localhost:8086/v1/catalog/lineage/column/upstream?tableFqn=warehouse.analytics.dim_orders&columnName=total_revenue&depth=5" \
  -H "X-Tenant-ID: 550e8400-e29b-41d4-a716-446655440000"
Response
{
"rootTable": "warehouse.analytics.dim_orders",
"rootColumn": "total_revenue",
"nodes": [
{
"tableFqn": "warehouse.raw.orders",
"columnName": "amount",
"depth": 1,
"transformation": "SUM(amount)"
},
{
"tableFqn": "warehouse.raw.orders",
"columnName": "discount",
"depth": 1,
"transformation": "SUM(amount) - SUM(discount)"
}
],
"edges": [
{
"sourceTable": "warehouse.raw.orders",
"sourceColumn": "amount",
"targetTable": "warehouse.analytics.dim_orders",
"targetColumn": "total_revenue",
"transformation": "SUM"
}
]
}
Use Cases
- Data quality root cause analysis -- When a metric looks wrong, trace upstream to find the source tables and transformations that produced the incorrect value
- Regulatory compliance -- Prove the provenance of data used in financial reports or regulatory filings
- Impact assessment -- Before modifying a source table schema, check which assets list that table among their upstream sources, since those are the downstream assets that depend on it
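For root cause analysis, it often helps to turn the visualization graph into explicit paths from each upstream source to the target. The following is a minimal Python sketch, assuming the response has the `graph.edges` and `metadata.entityId` shape shown in the Upstream Visualization Graph example. The helper name `upstream_paths` is hypothetical and not part of the API.

```python
from collections import deque


def upstream_paths(graph, target_id):
    """Map each upstream node id to the shortest edge path leading to target_id."""
    # Index edges by target so we can walk from the target back to its sources.
    sources_of = {}
    for edge in graph["edges"]:
        sources_of.setdefault(edge["targetId"], []).append(edge["sourceId"])

    paths = {}
    queue = deque([(target_id, [target_id])])
    while queue:
        node, path = queue.popleft()
        for source in sources_of.get(node, []):
            if source in paths or source == target_id:
                continue  # already reached by a shorter path, or a cycle back to the target
            paths[source] = [source] + path
            queue.append((source, paths[source]))
    return paths


# Sample trimmed from the visualization response shown earlier in this page.
response = {
    "graph": {
        "edges": [
            {"sourceId": "tbl-raw-orders", "targetId": "tbl-dim-orders"},
            {"sourceId": "tbl-raw-customers", "targetId": "tbl-dim-orders"},
            {"sourceId": "tbl-source-crm", "targetId": "tbl-raw-customers"},
        ]
    },
    "metadata": {"entityId": "tbl-dim-orders"},
}

paths = upstream_paths(response["graph"], response["metadata"]["entityId"])
for node_id, path in paths.items():
    print(" -> ".join(path))
```

Breadth-first traversal visits sources in order of distance from the target, so the nearest candidates for a bad value are listed first.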
Source Reference
| Component | File |
|---|---|
| Upstream lineage endpoint | LineageController.java -- getUpstreamLineage() |
| Upstream visualization | LineageVisualizationController.java -- getUpstreamLineage() |
| Column upstream trace | ColumnLineageController.java -- traceUpstream() |
| Lineage service | LineageService.java |
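To tie the three controllers listed above to concrete requests, the sketch below builds the URL and headers for each endpoint in Python. It is an illustration only: the paths, host, port, and header names are copied from the curl examples on this page, the function names are hypothetical, and no request is actually sent.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:8086"  # host and port copied from the curl examples


def entity_upstream(entity_id, tenant_id, token):
    """Direct upstream edges: the tenant travels as a query parameter."""
    query = urlencode({"tenantId": tenant_id})
    url = f"{BASE_URL}/v1/lineage/entity/{quote(entity_id, safe='')}/upstream?{query}"
    return url, {"Authorization": f"Bearer {token}"}


def graph_upstream(entity_id, tenant_id, token, max_depth=5):
    """Upstream graph: the tenant travels in the X-Tenant-ID header."""
    query = urlencode({"maxDepth": max_depth})
    url = (f"{BASE_URL}/api/v1/lineage/visualization/graph/"
           f"{quote(entity_id, safe='')}/upstream?{query}")
    return url, {"X-Tenant-ID": tenant_id, "Authorization": f"Bearer {token}"}


def column_upstream(table_fqn, column_name, tenant_id, depth=5):
    """Column trace: mirrors the curl example, which sends only X-Tenant-ID."""
    query = urlencode({"tableFqn": table_fqn, "columnName": column_name, "depth": depth})
    url = f"{BASE_URL}/v1/catalog/lineage/column/upstream?{query}"
    return url, {"X-Tenant-ID": tenant_id}


TENANT = "550e8400-e29b-41d4-a716-446655440000"
ENTITY = "550e8400-e29b-41d4-a716-446655440001"

url, headers = column_upstream("warehouse.analytics.dim_orders", "total_revenue", TENANT)
print(url)
print(headers)
```

The returned pair can be passed to any HTTP client. Keeping URL construction separate from the network call makes the difference in tenant handling between endpoints easy to check.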