# Stage 12: Workflow Orchestration

Stage 12 deploys Apache Airflow for DAG-based pipeline orchestration. It sets up the Airflow secrets (Fernet key, API server key) and the PostgreSQL metadata database, then deploys the Airflow Helm chart with web server, scheduler, and worker components.
Source file: `scripts/stages/12-workflow-orchestration.sh`
## Components Deployed
| Component | Purpose |
|---|---|
| Airflow Web Server | DAG management UI and REST API |
| Airflow Scheduler | DAG parsing and task scheduling |
| Airflow Worker | Task execution (CeleryExecutor or KubernetesExecutor) |
| Airflow Triggerer | Deferrable operator support |
| Airflow Database | PostgreSQL (from Stage 05b) |
## Secret Setup
The stage creates the following secrets if they do not already exist:
| Secret | Contents | Purpose |
|---|---|---|
| `airflow-fernet-key` | Fernet encryption key | Encrypts connection passwords in the Airflow metadata DB |
| `airflow-api-server-secret` | Random hex string | API server authentication |
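The create-if-not-exists behavior can be sketched as a small helper (the function name `ensure_secret` is illustrative, not taken from the source script, which uses the `k8s/dev-secrets.sh` library for this):

```shell
# Create a Kubernetes secret only if it does not already exist.
# Illustrative sketch of the create-if-not-exists pattern.
ensure_secret() {
  local namespace="$1" name="$2" key="$3" value="$4"
  if kubectl get secret "$name" -n "$namespace" >/dev/null 2>&1; then
    echo "secret ${name} already exists, skipping"
  else
    kubectl create secret generic "$name" -n "$namespace" \
      --from-literal="${key}=${value}"
  fi
}
```

Making creation idempotent lets the stage be re-run safely without rotating keys that live pods already depend on.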
### Fernet Key Generation
```shell
python3 -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"
```

## Deployment
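The `apache-airflow/airflow` chart reference assumes the official Airflow Helm repository has already been registered. A minimal sketch (the wrapper function name is hypothetical; the repository URL is the chart's published location):

```shell
# Register the official Apache Airflow chart repository so that
# "apache-airflow/airflow" resolves in helm upgrade/install commands.
register_airflow_repo() {
  helm repo add apache-airflow https://airflow.apache.org || return 1
  helm repo update
}
```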
```shell
helm upgrade --install airflow apache-airflow/airflow \
  --namespace matih-data-plane \
  --values infrastructure/helm/airflow/values.yaml \
  --values infrastructure/helm/airflow/values-dev.yaml \
  --wait --timeout 15m
```

## Database Setup
Airflow uses the PostgreSQL instance deployed in Stage 05b. The database connection is configured via the `airflow-database-secret` Kubernetes secret, never hardcoded in values files.
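For illustration, the relevant chart values might look like the following hypothetical excerpt (key names follow the official apache-airflow chart, where `data.metadataSecretName` points the chart at an existing connection secret):

```yaml
# Hypothetical values excerpt: reference the existing secret rather than
# inlining credentials.
data:
  metadataSecretName: airflow-database-secret  # connection string from Stage 05b
postgresql:
  enabled: false  # use the external Stage 05b PostgreSQL, not the bundled one
```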
## Libraries Used
| Library | Purpose |
|---|---|
| `core/config.sh` | Terraform output access |
| `k8s/namespace.sh` | Namespace management |
| `helm/repo.sh` | Helm repository management |
| `helm/deploy.sh` | Deployment functions |
| `k8s/dev-secrets.sh` | Dev environment secret creation |
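The stage script presumably sources these libraries before running; a sketch of that pattern (the `load_stage_libs` helper and its `lib_dir` argument are illustrative, not from the source):

```shell
# Source each shared library listed above, failing fast if one is missing.
load_stage_libs() {
  local lib_dir="$1" lib
  for lib in core/config.sh k8s/namespace.sh helm/repo.sh helm/deploy.sh k8s/dev-secrets.sh; do
    # shellcheck source=/dev/null
    source "${lib_dir}/${lib}" || return 1
  done
}
```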
## Dependencies

- Requires: 05b-data-plane-infrastructure, 06-ingress-controller
- Required by: 13-data-catalogs
## Dependency Verification
```shell
kubectl get pods -n matih-data-plane -l component=webserver
kubectl get pods -n matih-data-plane -l component=scheduler
```
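Beyond a one-off pod listing, readiness can be gated with `kubectl wait`. A sketch (the wrapper function name is illustrative, and the `component` label values are assumed to match the chart's pod labels):

```shell
# Block until the webserver and scheduler pods report Ready, or time out.
wait_for_airflow() {
  local ns="$1" component
  for component in webserver scheduler; do
    kubectl wait --for=condition=Ready pod \
      -l "component=${component}" -n "$ns" --timeout=300s || return 1
  done
}
```

Running this after the Helm release lets a follow-on stage (such as 13-data-catalogs) fail fast instead of hitting an unready Airflow API.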