MATIH Platform is in active MVP development. Documentation reflects current implementation status.

Object Storage

MinIO provides S3-compatible object storage for the MATIH Platform. It stores ML model artifacts, pipeline outputs, exported reports, dashboard thumbnails, and other file-based data that does not fit in relational or key-value stores.


Role in the Platform

| Aspect | Details |
| --- | --- |
| Technology | MinIO |
| API compatibility | Amazon S3 API |
| Deployment | Kubernetes StatefulSet via Helm |
| Authentication | Access key / secret key via Kubernetes secrets |
| Multi-tenancy | Per-tenant bucket or prefix-based isolation |
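
Because MinIO exposes the Amazon S3 API, a standard S3 client can reach it once the endpoint is overridden. The following is a minimal sketch, not the platform's actual client code. It assumes `boto3` is installed, and the environment variable names (`MINIO_ENDPOINT`, `MINIO_ACCESS_KEY`, `MINIO_SECRET_KEY`) and the default endpoint `http://minio:9000` are hypothetical placeholders for values that would be injected from the Kubernetes secret.

```python
import os


def minio_client_kwargs(env=None):
    """Build keyword arguments for an S3 client that targets MinIO.

    The variable names and default endpoint are hypothetical; in a cluster
    they would come from the Kubernetes secret holding the access key pair.
    """
    env = os.environ if env is None else env
    return {
        "endpoint_url": env.get("MINIO_ENDPOINT", "http://minio:9000"),
        "aws_access_key_id": env["MINIO_ACCESS_KEY"],
        "aws_secret_access_key": env["MINIO_SECRET_KEY"],
    }


def make_s3_client(env=None):
    """Create a boto3 S3 client pointed at MinIO (boto3 is imported lazily)."""
    import boto3
    from botocore.config import Config

    # Path-style addressing avoids the need for per-bucket DNS entries.
    return boto3.client(
        "s3",
        config=Config(s3={"addressing_style": "path"}),
        **minio_client_kwargs(env),
    )
```

Keeping the configuration in a plain function makes it easy to check without a running MinIO instance; only `make_s3_client` needs `boto3`.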

Use Cases

| Use Case | Bucket Pattern | Written By | Read By |
| --- | --- | --- | --- |
| ML model artifacts | mlflow-artifacts/{tenant_id}/ | ML Service | ML Service, Ray Serve |
| Pipeline outputs | pipeline-outputs/{tenant_id}/ | Pipeline Service | Data Workbench, BI Service |
| Exported reports | exports/{tenant_id}/ | Render Service | BI Workbench |
| Dashboard thumbnails | thumbnails/{tenant_id}/ | Render Service | BI Workbench |
| Data lake (Iceberg) | lakehouse/{tenant_id}/ | Pipeline Service, Spark | Trino (Iceberg connector) |
| Backup archives | backups/{tenant_id}/ | Backup jobs | Recovery procedures |
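
The bucket patterns above all follow the same `bucket/{tenant_id}/...` shape, so key construction can be centralized. The helper below is an illustrative sketch only: the function names and the tenant ID format (lowercase letters, digits, hyphens) are assumptions, not part of the platform's code.

```python
import re

# Buckets listed in the use-case table.
KNOWN_BUCKETS = {
    "mlflow-artifacts", "pipeline-outputs", "exports",
    "thumbnails", "lakehouse", "backups",
}

# Hypothetical tenant ID format: lowercase letters, digits, and hyphens.
_TENANT_RE = re.compile(r"^[a-z0-9][a-z0-9-]*$")


def tenant_object_key(tenant_id, *parts):
    """Join path parts under the tenant prefix, rejecting traversal tricks."""
    if not _TENANT_RE.match(tenant_id):
        raise ValueError(f"invalid tenant id: {tenant_id!r}")
    cleaned = [p.strip("/") for p in parts]
    for part in cleaned:
        # Reject empty, "." and ".." segments so a key cannot leave the prefix.
        if any(seg in ("", ".", "..") for seg in part.split("/")):
            raise ValueError(f"invalid path segment in {part!r}")
    return "/".join([tenant_id, *cleaned])


def s3_uri(bucket, tenant_id, *parts):
    """Return the s3:// URI for an object inside a known bucket."""
    if bucket not in KNOWN_BUCKETS:
        raise ValueError(f"unknown bucket: {bucket!r}")
    return f"s3://{bucket}/{tenant_object_key(tenant_id, *parts)}"
```

For example, `s3_uri("mlflow-artifacts", "acme-corp", "experiment-1", "model.pkl")` yields `s3://mlflow-artifacts/acme-corp/experiment-1/model.pkl`.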

Bucket Organization

MinIO Instance
  +-- mlflow-artifacts/
  |     +-- acme-corp/
  |     |     +-- experiment-1/
  |     |     +-- experiment-2/
  |     +-- globex/
  |
  +-- pipeline-outputs/
  |     +-- acme-corp/
  |     +-- globex/
  |
  +-- lakehouse/
  |     +-- acme-corp/
  |     |     +-- orders/
  |     |     +-- customers/
  |     +-- globex/
  |
  +-- exports/
        +-- acme-corp/
        +-- globex/

Multi-Tenancy

Tenant isolation is enforced through bucket policies and prefix-based access control:

| Strategy | Implementation |
| --- | --- |
| Prefix isolation | Objects scoped to {tenant_id}/ prefix within shared buckets |
| Access control | Service accounts with IAM policies restricting to tenant prefix |
| Encryption | Server-side encryption with per-tenant keys (SSE-KMS) |
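
MinIO accepts policy documents in the AWS IAM JSON format, so a tenant-scoped policy for a service account might look like the output of the sketch below. The function name and the exact set of allowed actions are assumptions chosen for illustration; the platform's real policies may grant more or less.

```python
import json


def tenant_prefix_policy(bucket, tenant_id):
    """Return an IAM-style policy limiting access to one tenant prefix."""
    prefix = f"{tenant_id}/"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is allowed only for keys under the tenant prefix.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}*"]}},
            },
            {
                # Object reads and writes are allowed only under the prefix.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{prefix}*"],
            },
        ],
    }


if __name__ == "__main__":
    policy = tenant_prefix_policy("pipeline-outputs", "acme-corp")
    print(json.dumps(policy, indent=2))
```

Splitting the list permission from the object permissions matters: `s3:ListBucket` applies to the bucket resource and needs the `s3:prefix` condition, while the object actions are scoped by the resource ARN itself.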

Configuration

| Parameter | Development | Production |
| --- | --- | --- |
| Storage class | Local PV | Cloud block storage |
| Erasure coding | Disabled | Enabled (EC:4) |
| Replication | 1 drive | 4+ drives across nodes |
| Lifecycle rules | 30-day expiry for temp objects | Configurable per bucket |
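
The 30-day expiry for temporary objects can be expressed as a standard S3 lifecycle configuration. This is a sketch under assumptions: the `tmp/` prefix and the rule ID format are hypothetical, since the page does not say how temp objects are identified.

```python
def temp_expiry_lifecycle(prefix="tmp/", days=30):
    """Build an S3 lifecycle configuration that expires objects under a prefix.

    The "tmp/" default is a hypothetical convention for temp objects.
    """
    if days < 1:
        raise ValueError("days must be a positive integer")
    return {
        "Rules": [
            {
                "ID": f"expire-{prefix.strip('/')}-after-{days}d",
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Expiration": {"Days": days},
            }
        ]
    }
```

With a boto3 S3 client, the returned dictionary could be passed as the `LifecycleConfiguration` argument of `put_bucket_lifecycle_configuration`, alongside the target `Bucket` name.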

Integration with Trino

Trino's Iceberg connector reads table data files from MinIO over the S3 API:

Trino --> Iceberg Connector --> MinIO (S3 API)
  |
  +-- Catalog: iceberg
  +-- Schema: sales
  +-- Table: orders
  +-- Data files: s3://lakehouse/acme-corp/orders/data/*.parquet
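
The mapping in the diagram, from a Trino table name to its storage location, can be sketched as a small value type. This is illustrative only: the class name is hypothetical, and the layout `s3://lakehouse/{tenant_id}/{table}/data/` is inferred from the example path above. In practice the authoritative location is whatever the Iceberg table metadata records.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IcebergTable:
    """Pairs a Trino table name with its assumed MinIO data location."""

    tenant_id: str
    schema: str
    table: str
    catalog: str = "iceberg"
    bucket: str = "lakehouse"

    @property
    def qualified_name(self):
        # Name used in Trino SQL, e.g. SELECT ... FROM iceberg.sales.orders
        return f"{self.catalog}.{self.schema}.{self.table}"

    @property
    def data_location(self):
        # Prefix under which the Parquet data files are assumed to live.
        return f"s3://{self.bucket}/{self.tenant_id}/{self.table}/data/"


orders = IcebergTable(tenant_id="acme-corp", schema="sales", table="orders")
print(orders.qualified_name)  # iceberg.sales.orders
print(orders.data_location)   # s3://lakehouse/acme-corp/orders/data/
```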

Related Pages

  • PostgreSQL -- Primary relational store
  • Trino -- Query federation using MinIO data
  • MLflow -- ML artifact storage