MinIO
MinIO provides S3-compatible object storage for development and staging environments. In production, it is replaced by Azure Blob Storage, AWS S3, or Google Cloud Storage.
Configuration
    # From matih-data-plane/values.yaml
    global:
      storage:
        type: "minio"
        s3:
          endpoint: "http://minio.matih-data-plane.svc.cluster.local:9000"
          region: "us-east-1"

Storage Buckets
| Bucket | Purpose | Consumers |
|---|---|---|
| curated-data | Iceberg tables, curated datasets | Trino, Spark, Pipeline Service |
| raw-data | Raw ingested data | Pipeline Service |
| ml-artifacts | MLflow model artifacts | ML Service, MLflow |
| spark-history | Spark event logs | Spark History Server |
| airflow-logs | Airflow task logs | Airflow |
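
The source does not say whether the chart provisions these buckets itself. If it does not, a bootstrap step along the following lines could create them. This is a minimal sketch: `ensure_buckets` is a hypothetical helper, and `client` is assumed to behave like a boto3 S3 client (it only needs `list_buckets()` and `create_bucket(Bucket=...)`).

```python
# Bucket names taken from the table above.
BUCKETS = (
    "curated-data",
    "raw-data",
    "ml-artifacts",
    "spark-history",
    "airflow-logs",
)


def ensure_buckets(client, wanted=BUCKETS):
    """Create every bucket in `wanted` that the client does not list yet.

    Hypothetical helper, not part of the chart. `client` is assumed to
    expose the boto3 S3 client methods list_buckets() and create_bucket().
    Returns the names that were created, in the order given by `wanted`.
    """
    existing = {bucket["Name"] for bucket in client.list_buckets()["Buckets"]}
    created = []
    for name in wanted:
        if name not in existing:
            client.create_bucket(Bucket=name)
            created.append(name)
    return created
```

Because the function only creates what is missing, running it a second time against the same store should be a no-op, which makes it reasonably safe to call from an init job.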
Secret Management
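Even for the dev MinIO instance, credentials are referenced from a Kubernetes Secret (the secretKeyRef pattern) instead of being written inline in values files: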
    # Correct: Reference from secret
    artifactStore:
      s3:
        existingSecret: mlflow-s3-credentials
        accessKeyIdKey: aws-access-key-id
        secretAccessKeyKey: aws-secret-access-key

Dev vs Production
| Aspect | Dev (MinIO) | Production |
|---|---|---|
| Provider | MinIO StatefulSet | Azure Blob / AWS S3 / GCS |
| Endpoint | minio.matih-data-plane.svc:9000 | Cloud-native endpoint |
| Authentication | K8s Secret | Workload Identity / IRSA |
| Replication | None | Cross-region replication |
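
The Endpoint and Authentication rows translate fairly directly into how a client is constructed. The sketch below builds keyword arguments for `boto3.client("s3", **kwargs)` under several assumptions: `s3_client_kwargs` is a hypothetical helper, `"s3"` is a made-up production value for the storage type (only `"minio"` appears in the source), and the dev Secret is assumed to be exposed through the standard AWS SDK variables `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`. Azure Blob Storage and GCS would need their own SDKs and are not covered.

```python
import os

# Dev endpoint and region as given in the Configuration section.
MINIO_ENDPOINT = "http://minio.matih-data-plane.svc.cluster.local:9000"
REGION = "us-east-1"


def s3_client_kwargs(storage_type, environ=os.environ):
    """Build keyword arguments for boto3.client("s3", **kwargs).

    "minio" is the value from values.yaml; "s3" is a hypothetical
    production value used only for this illustration.
    """
    if storage_type == "minio":
        # Dev: explicit endpoint plus static keys injected from the K8s Secret.
        # Assumption: the Secret is exposed under the standard AWS SDK
        # variable names. A missing variable raises KeyError.
        return {
            "endpoint_url": MINIO_ENDPOINT,
            "region_name": REGION,
            "aws_access_key_id": environ["AWS_ACCESS_KEY_ID"],
            "aws_secret_access_key": environ["AWS_SECRET_ACCESS_KEY"],
        }
    if storage_type == "s3":
        # Production on AWS: pass no endpoint and no keys, so the SDK's
        # default credential chain (which includes IRSA) resolves identity.
        return {}
    raise ValueError(f"unsupported storage type: {storage_type!r}")
```

Keeping the production branch free of static keys mirrors the table: the pod's identity supplies credentials, so nothing secret has to be read by application code.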