Catalog Service Architecture
Production - Core catalog APIs, search, lineage, classification
The Catalog Service is a Java 21 / Spring Boot 3.2 application that serves as the metadata backbone of the MATIH platform. It provides APIs for browsing, searching, and managing catalog entities across all tenant data sources.
Service Overview
| Property | Value |
|---|---|
| Language | Java 21 |
| Framework | Spring Boot 3.2 |
| Port | 8086 |
| Namespace | matih-data-plane |
| Base paths | /api/v1/catalog, /api/v1/datasources, /api/v1/lineage/visualization, /v1/lineage, /v1/catalog/lineage/column, /v1/classification |
| Authentication | JWT, with the tenant identified by the X-Tenant-ID header |
| Build tool | Gradle |
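To make the table concrete, here is a minimal sketch of how a client might address the service using only the JDK's `java.net.http` API. The host `localhost`, the `Authorization: Bearer` header for the JWT, and the class and method names are assumptions for illustration; only the port, the `/api/v1/catalog` prefix, the `/databases` endpoint, and the `X-Tenant-ID` header come from this page.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.UUID;

// Hypothetical client-side helper; not part of the Catalog Service itself.
class CatalogRequestSketch {
    // Assumed host. The port (8086) is taken from the Service Overview table.
    static final String BASE_URL = "http://localhost:8086";

    // Builds (but does not send) a GET request for the database listing.
    static HttpRequest listDatabases(UUID tenantId, String jwt) {
        return HttpRequest.newBuilder()
                .uri(URI.create(BASE_URL + "/api/v1/catalog/databases"))
                // Tenant scope, as required by the service on every request.
                .header("X-Tenant-ID", tenantId.toString())
                // Assumption: the JWT travels as a standard bearer token.
                .header("Authorization", "Bearer " + jwt)
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = listDatabases(UUID.randomUUID(), "example-token");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

The request can then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`. Building the request separately from sending it keeps the sketch runnable without a live service.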
High-Level Architecture
+-------------------------------------------------------------------+
|                          Catalog Service                          |
|                                                                   |
|  +------------------+  +------------------+  +-----------------+  |
|  | CatalogController|  | DataSourceCtrl   |  | DiscoveryCtrl   |  |
|  | /api/v1/catalog  |  | /api/v1/         |  | /api/v1/catalog |  |
|  |                  |  | datasources      |  | /discovery      |  |
|  +--------+---------+  +--------+---------+  +--------+--------+  |
|           |                     |                     |           |
|  +--------v---------------------v---------------------v--------+  |
|  |                       CatalogService                        |  |
|  |  - Database browsing           - Table management           |  |
|  |  - Tag operations              - Statistics                 |  |
|  +------------------------------+------------------------------+  |
|                                 |                                 |
|  +------------------------------v------------------------------+  |
|  |                     Supporting Services                     |  |
|  |                                                             |  |
|  |  CatalogSearchService          MetadataIngestionService     |  |
|  |  ClassificationService         LineageService               |  |
|  |  DataGlossaryService           GovernancePolicyService      |  |
|  +------+----------+----------+-----------+--------------------+  |
|         |          |          |           |                       |
|  +------v---+ +----v----+  +--v-------+ +-v-----------+           |
|  |PostgreSQL| |Elastic  |  |OpenMeta  | |Kafka        |           |
|  |          | |Search   |  |data      | |Events       |           |
|  +----------+ +---------+  +----------+ +-------------+           |
+-------------------------------------------------------------------+
Controller Overview
| Controller | Path | Responsibility |
|---|---|---|
| CatalogController | /api/v1/catalog | Search, databases, tables, tags, lineage, statistics |
| DataSourceController | /api/v1/datasources | Data source registration, CRUD, ingestion triggers |
| CatalogDiscoveryController | /api/v1/catalog/discovery | Trending assets, related assets, browse hierarchy, recommendations |
| LineageVisualizationController | /api/v1/lineage/visualization | Graph visualization, impact analysis, path finding, export |
| LineageController | /v1/lineage | Edge management, traversal, column lineage, OpenLineage |
| ColumnLineageController | /v1/catalog/lineage/column | Column-level lineage extraction and queries |
| ClassificationController | /v1/classification | Table classification, PII/PHI/PCI discovery, rules |
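Because some prefixes nest (for example /api/v1/catalog and /api/v1/catalog/discovery), it can help to see which controller owns a given request path. The sketch below is a plain-Java illustration that picks the longest matching prefix from the table. It is not how Spring MVC resolves handlers internally, and the class name, method name, and sample sub-paths such as `/trending` are hypothetical.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.Optional;

// Illustrative lookup only; the real routing is done by Spring's request mappings.
class ControllerRouteSketch {
    // Path prefixes and controller names copied from the Controller Overview table.
    static final Map<String, String> ROUTES = Map.of(
            "/api/v1/catalog", "CatalogController",
            "/api/v1/datasources", "DataSourceController",
            "/api/v1/catalog/discovery", "CatalogDiscoveryController",
            "/api/v1/lineage/visualization", "LineageVisualizationController",
            "/v1/lineage", "LineageController",
            "/v1/catalog/lineage/column", "ColumnLineageController",
            "/v1/classification", "ClassificationController");

    // Returns the controller whose prefix is the longest match for the path.
    static Optional<String> controllerFor(String path) {
        return ROUTES.entrySet().stream()
                // Match the prefix exactly or as a whole leading path segment.
                .filter(e -> path.equals(e.getKey()) || path.startsWith(e.getKey() + "/"))
                .max(Comparator.comparingInt((Map.Entry<String, String> e) -> e.getKey().length()))
                .map(Map.Entry::getValue);
    }

    public static void main(String[] args) {
        System.out.println(controllerFor("/api/v1/catalog/discovery/trending"));
        System.out.println(controllerFor("/api/v1/catalog/databases"));
    }
}
```

Matching on whole path segments (the prefix followed by `/`) avoids treating a path such as `/v1/lineagefoo` as belonging to LineageController.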
Entity Model
The Catalog Service manages five core entity types:
// CatalogDatabase - Represents a database in a data source
@Entity
public class CatalogDatabase {
    private UUID id;
    private UUID tenantId;
    private String name;
    private String fullyQualifiedName;
    private String description;
    private int tableCount;
    private UUID dataSourceId;
}

// CatalogTable - Represents a table with schema information
@Entity
public class CatalogTable {
    private UUID id;
    private UUID tenantId;
    private String name;
    private String fullyQualifiedName;
    private String schemaName;
    private String catalogName;
    private TableType tableType;
    private String description;
    private int columnCount;
    private List<String> tags;
}

// CatalogDataSource - Registered data source
@Entity
public class CatalogDataSource {
    private UUID id;
    private UUID tenantId;
    private String name;
    private String type; // postgresql, mysql, snowflake, etc.
    private String connectionConfig;
    private boolean active;
}

// CatalogTag - Tag for asset organization
@Entity
public class CatalogTag {
    private UUID id;
    private UUID tenantId;
    private String name;
    private TagCategory category; // BUSINESS, TECHNICAL, CLASSIFICATION, CUSTOM
}

// CatalogLineage - Lineage relationship between entities
@Entity
public class CatalogLineage {
    private UUID id;
    private UUID tenantId;
    private UUID sourceEntityId;
    private UUID targetEntityId;
    private String lineageType;
}
Configuration
The Catalog Service is configured through Spring Boot application properties:
# application.yml
server:
  port: 8086
spring:
  datasource:
    url: jdbc:postgresql://${DB_HOST}:5432/${DB_NAME}
    username: ${DB_USER}
    password: ${DB_PASSWORD}
  elasticsearch:
    uris: ${ELASTICSEARCH_URL:http://elasticsearch:9200}
  kafka:
    bootstrap-servers: ${KAFKA_BOOTSTRAP_SERVERS}
    producer:
      key-serializer: org.apache.kafka.common.serialization.StringSerializer
      value-serializer: org.springframework.kafka.support.serializer.JsonSerializer
openmetadata:
  url: ${OPENMETADATA_URL:http://openmetadata:8585}
  auth-token: ${OPENMETADATA_TOKEN}
Multi-Tenancy
All catalog operations are tenant-scoped. Every request must include the X-Tenant-ID header, and the controller passes that tenant ID to the service layer so that each query is restricted to the caller's tenant:
@GetMapping("/databases")
public ResponseEntity<Page<CatalogDatabase>> listDatabases(
        @RequestHeader("X-Tenant-ID") UUID tenantId,
        @RequestParam(defaultValue = "0") int page,
        @RequestParam(defaultValue = "20") int size) {
    Pageable pageable = PageRequest.of(page, size);
    Page<CatalogDatabase> databases = catalogService.listDatabases(tenantId, pageable);
    return ResponseEntity.ok(databases);
}
Next Steps
- Search & Autocomplete -- full-text search across catalog entities
- Data Sources -- registering and managing data connections
- Metadata Ingestion -- synchronizing metadata from external sources