Stage 13: Data Catalogs
Stage 13 deploys data catalog services: OpenMetadata for metadata management and Apache Polaris for Iceberg catalog operations. It sets up prerequisite databases and secrets before deploying the Helm charts.
Source file: scripts/stages/13-data-catalogs.sh
Components Deployed
| Component | Purpose |
|---|---|
| OpenMetadata | Data catalog, lineage visualization, metadata management |
| Apache Polaris | Iceberg REST catalog for table management |
Prerequisites Setup
OpenMetadata Database
The stage creates the openmetadata_db database in the data plane PostgreSQL instance:
kubectl exec postgresql-0 -n matih-data-plane -- sh -c \
'PGPASSWORD=matih psql -U postgres -c "CREATE DATABASE openmetadata_db OWNER matih;"'
Airflow Integration Secrets
OpenMetadata uses Airflow to run its metadata ingestion pipelines. The airflow-secrets Kubernetes secret holds the credential OpenMetadata uses to authenticate to the Airflow API:
| Key | Description |
|---|---|
| openmetadata-airflow-password | Airflow API authentication |
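Before deploying, it can help to confirm that the secret actually carries the key listed above. The following is a minimal sketch, not part of the stage script: the function names are hypothetical, and the secret is assumed to live in the matih-data-plane namespace, which this page does not state explicitly.

```shell
#!/usr/bin/env bash
# Sketch only. Assumption: airflow-secrets lives in matih-data-plane.
NAMESPACE="${NAMESPACE:-matih-data-plane}"

# Succeeds if the named key exists in the secret and holds a non-empty value.
# Note: keys containing dots would need escaping in the JSONPath expression.
secret_has_key() {
  local secret="$1" key="$2" encoded
  encoded="$(kubectl get secret "$secret" \
    --namespace "$NAMESPACE" \
    -o "jsonpath={.data.${key}}" 2>/dev/null)" || return 1
  [ -n "$encoded" ]
}

# Check the key from the table above and report the result.
check_airflow_secret() {
  if secret_has_key airflow-secrets openmetadata-airflow-password; then
    echo "airflow-secrets: openmetadata-airflow-password present"
  else
    echo "airflow-secrets: openmetadata-airflow-password missing or empty" >&2
    return 1
  fi
}
```

Calling `check_airflow_secret` returns a non-zero status when the secret or the key is absent, so it can gate the Helm deployment in a wrapper script. The value itself is never decoded or printed.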
Deployment
# OpenMetadata
helm upgrade --install openmetadata open-metadata/openmetadata \
--namespace matih-data-plane \
--values infrastructure/helm/openmetadata/values-dev.yaml \
--wait --timeout 10m
# Polaris
helm upgrade --install polaris \
infrastructure/helm/polaris \
--namespace matih-data-plane \
--wait --timeout 5m
Libraries Used
| Library | Purpose |
|---|---|
| core/config.sh | Configuration access |
| k8s/namespace.sh | Namespace management |
| helm/deploy.sh | Helm deployment |
| k8s/dev-secrets.sh | Dev secrets for database credentials |
Dependencies
- Requires: 05b-data-plane-infrastructure, 12-workflow-orchestration
- Required by: 16-data-plane-services
Dependency Verification
kubectl get pods -n matih-data-plane -l app=openmetadata
kubectl get pods -n matih-data-plane -l app=polaris
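The commands above only list the pods. A script that gates stage 16 on this stage may prefer to block until the pods are Ready. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the 120s default timeout is arbitrary, and the app labels are taken from the commands above rather than verified against the charts.

```shell
#!/usr/bin/env bash
# Sketch only. Namespace and label values come from the commands above;
# the default timeout is an assumption.
NAMESPACE="${NAMESPACE:-matih-data-plane}"

# Block until every pod carrying the given app label reports Ready.
wait_for_app() {
  local app="$1" timeout="${2:-120s}"
  kubectl wait pod \
    --namespace "$NAMESPACE" \
    --selector "app=${app}" \
    --for=condition=Ready \
    --timeout="$timeout"
}

# Check both catalogs; return non-zero if either one is not ready.
verify_catalogs() {
  local failed=0 app
  for app in openmetadata polaris; do
    if wait_for_app "$app" "${1:-120s}"; then
      echo "ok: ${app}"
    else
      echo "not ready: ${app}" >&2
      failed=1
    fi
  done
  return "$failed"
}
```

Running `verify_catalogs 300s` waits up to five minutes per component. `kubectl wait` typically also fails when the selector matches no pods, so a wrong label shows up as "not ready" rather than passing silently.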