MATIH Platform is in active MVP development. Documentation reflects current implementation status.

Platform Registry

The Platform Registry provides centralized management of all platform components, their versions, upgrade paths, and service discovery. It tracks the lifecycle of every component in the MATIH platform -- from Trino and Spark to Kafka and MLflow -- enabling controlled upgrades, dependency resolution, and service health aggregation across multi-tenant deployments.


Architecture

The Platform Registry runs as a Spring Boot 3.2 service on port 8084 with two primary controllers:

  • RegistryController -- Manages component registration, version lifecycle, upgrade path calculation, and upgrade execution orchestration
  • ServiceDiscoveryController -- Handles runtime service instance registration, discovery, health aggregation, and dependency graph management

The service persists data in PostgreSQL and publishes events to Kafka topics (registry.version-events and registry.upgrade-orchestration-events) for downstream consumers.


Core Concepts

Concept | Description
PlatformComponent | A registered platform technology (e.g., Trino, Kafka, MLflow) with category and status
ComponentVersion | A specific version of a component with semantic versioning, LTS/recommended flags, and dependency tracking
UpgradePath | A defined route from one version to another with risk assessment, duration estimates, and rollback procedures
UpgradeExecution | A tracked execution of an upgrade for a specific tenant, supporting rolling, canary, blue-green, and staged strategies
ServiceInstance | A runtime instance of a service registered for discovery with health monitoring
ServiceDependency | A declared dependency between services with type classification and health check configuration
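
The source does not describe how upgrade routes are calculated. As a minimal sketch, assuming each UpgradePath is a directed edge between two version strings and that a multi-step route may chain several of them, a breadth-first search finds the route with the fewest hops. The class fields, the `risk` values, and the function names here are hypothetical, not the registry's actual model.

```python
from collections import deque
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class UpgradePath:
    """Hypothetical stand-in for one registered upgrade path (one hop)."""
    from_version: str
    to_version: str
    risk: str = "LOW"  # illustrative value; the real risk model is not documented here


def parse_semver(version: str) -> Tuple[int, int, int]:
    """Turn 'MAJOR.MINOR.PATCH' into a tuple so versions compare numerically."""
    major, minor, patch = version.split(".")
    return int(major), int(minor), int(patch)


def find_route(paths: List[UpgradePath], start: str, target: str) -> Optional[List[UpgradePath]]:
    """Return the shortest chain of paths from start to target, or None if unreachable."""
    adjacency = {}
    for path in paths:
        adjacency.setdefault(path.from_version, []).append(path)

    queue = deque([(start, [])])
    visited = {start}
    while queue:
        version, route = queue.popleft()
        if version == target:
            return route
        for path in adjacency.get(version, []):
            if path.to_version not in visited:
                visited.add(path.to_version)
                queue.append((path.to_version, route + [path]))
    return None
```

Comparing parsed tuples rather than raw strings matters because "3.10.0" sorts before "3.9.2" lexicographically but is the newer release.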

Component Categories

Components are organized into categories that reflect platform architecture layers:

Category | Examples | Description
QUERY_ENGINE | Trino, Spark SQL | SQL query execution engines
STREAMING | Kafka, Flink | Real-time data streaming
STORAGE | MinIO, Delta Lake, Iceberg | Data storage and table formats
COMPUTE | Spark, Ray | Distributed compute engines
ML | MLflow, Ray Serve | Machine learning platforms
ORCHESTRATION | Temporal, Airflow | Workflow orchestration
CATALOG | OpenMetadata, Polaris | Data catalog and metadata
OBSERVABILITY | Prometheus, Grafana | Monitoring and observability
SECURITY | Vault, Keycloak | Security and identity
INFRASTRUCTURE | PostgreSQL, Redis | Core infrastructure services
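
A client that submits or filters components may want to validate category names before calling the registry. The sketch below mirrors the table above as a Python enum; the enum and the `parse_category` helper are illustrative assumptions, not part of the registry's API.

```python
from enum import Enum


class ComponentCategory(Enum):
    """Category names and descriptions copied from the table above."""
    QUERY_ENGINE = "SQL query execution engines"
    STREAMING = "Real-time data streaming"
    STORAGE = "Data storage and table formats"
    COMPUTE = "Distributed compute engines"
    ML = "Machine learning platforms"
    ORCHESTRATION = "Workflow orchestration"
    CATALOG = "Data catalog and metadata"
    OBSERVABILITY = "Monitoring and observability"
    SECURITY = "Security and identity"
    INFRASTRUCTURE = "Core infrastructure services"


def parse_category(name: str) -> ComponentCategory:
    """Normalise user input such as 'query engine' to a known category."""
    key = name.strip().upper().replace(" ", "_").replace("-", "_")
    try:
        return ComponentCategory[key]
    except KeyError:
        raise ValueError(f"unknown component category: {name!r}") from None
```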

API Structure

The registry exposes two base paths:

Base Path | Controller | Purpose
/api/v1/registry | RegistryController | Component and version management, upgrade paths, upgrade execution
/api/v1/services | ServiceDiscoveryController | Service instance registration, discovery, health, dependencies
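
As a small sketch of how a client might assemble request URLs from these base paths: the two base paths and the default port 8084 come from this page, while the host name, the plain-HTTP scheme, and any path segments passed in are assumptions, since the individual endpoints are not listed here.

```python
from urllib.parse import quote

REGISTRY_BASE = "/api/v1/registry"   # RegistryController
SERVICES_BASE = "/api/v1/services"   # ServiceDiscoveryController


def build_url(host: str, base_path: str, *segments: str, port: int = 8084) -> str:
    """Join a base path with percent-encoded segments (segments are hypothetical)."""
    encoded = "/".join(quote(segment, safe="") for segment in segments)
    suffix = f"/{encoded}" if encoded else ""
    return f"http://{host}:{port}{base_path}{suffix}"
```

For example, `build_url("localhost", REGISTRY_BASE)` yields `http://localhost:8084/api/v1/registry`.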

Authentication

All endpoints require JWT authentication: clients send the token in the Authorization header using the Bearer scheme. Component management operations (registration, version updates, upgrade initiation) additionally require the platform administrator or tenant administrator role.
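
The client side of this can be sketched as follows. The header format is standard Bearer authentication; the role identifiers `PLATFORM_ADMIN` and `TENANT_ADMIN` are hypothetical names chosen for illustration, because the exact role strings are not given here.

```python
from typing import Dict, Iterable

# Hypothetical role identifiers; the registry's real role names may differ.
ADMIN_ROLES = frozenset({"PLATFORM_ADMIN", "TENANT_ADMIN"})


def auth_headers(token: str) -> Dict[str, str]:
    """Build the Authorization header expected by every registry endpoint."""
    if not token:
        raise ValueError("a JWT is required for all registry endpoints")
    return {"Authorization": f"Bearer {token}"}


def can_manage_components(roles: Iterable[str]) -> bool:
    """True if the caller holds at least one of the administrator roles."""
    return not ADMIN_ROLES.isdisjoint(roles)
```

A check like `can_manage_components` is only a client-side convenience for failing early; the service remains the authority on who may register components or start upgrades.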


Event-Driven Integration

The registry publishes events to Kafka for integration with other platform services:

Topic | Events
registry.version-events | VERSION_REGISTERED, VERSION_STATUS_CHANGED, VERSION_DEPRECATED, VERSION_END_OF_LIFE, VERSION_END_OF_SUPPORT, RECOMMENDED_VERSION_CHANGED
registry.upgrade-orchestration-events | UPGRADE_INITIATED, UPGRADE_STARTED, UPGRADE_VALIDATION_FAILED, CANARY_DEPLOYMENT_STARTED, CANARY_PROMOTED, BATCH_COMPLETED, UPGRADE_COMPLETED, UPGRADE_FAILED, UPGRADE_PAUSED, UPGRADE_RESUMED, UPGRADE_CANCELLED, ROLLBACK_INITIATED, ROLLBACK_COMPLETED, ROLLBACK_FAILED, UPGRADE_TIMEOUT
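
A downstream consumer often needs to know which topic carries a given event type, for example when deciding what to subscribe to. The lookup below is a sketch built only from the table above; the function name is hypothetical, and the Kafka message payload format is not documented here, so no parsing is attempted.

```python
VERSION_TOPIC = "registry.version-events"
UPGRADE_TOPIC = "registry.upgrade-orchestration-events"

# Event types per topic, copied from the table above.
TOPIC_EVENTS = {
    VERSION_TOPIC: (
        "VERSION_REGISTERED", "VERSION_STATUS_CHANGED", "VERSION_DEPRECATED",
        "VERSION_END_OF_LIFE", "VERSION_END_OF_SUPPORT", "RECOMMENDED_VERSION_CHANGED",
    ),
    UPGRADE_TOPIC: (
        "UPGRADE_INITIATED", "UPGRADE_STARTED", "UPGRADE_VALIDATION_FAILED",
        "CANARY_DEPLOYMENT_STARTED", "CANARY_PROMOTED", "BATCH_COMPLETED",
        "UPGRADE_COMPLETED", "UPGRADE_FAILED", "UPGRADE_PAUSED", "UPGRADE_RESUMED",
        "UPGRADE_CANCELLED", "ROLLBACK_INITIATED", "ROLLBACK_COMPLETED",
        "ROLLBACK_FAILED", "UPGRADE_TIMEOUT",
    ),
}

# Inverted index: event type -> topic.
EVENT_TO_TOPIC = {
    event: topic for topic, events in TOPIC_EVENTS.items() for event in events
}


def topic_for(event_type: str) -> str:
    """Return the topic that carries the given event type."""
    try:
        return EVENT_TO_TOPIC[event_type]
    except KeyError:
        raise ValueError(f"unknown registry event type: {event_type!r}") from None
```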

Caching

The registry uses Spring Cache for frequently accessed data:

Cache Name | Key | Content
componentVersions | componentId | All versions for a component
recommendedVersions | componentId | The recommended version for a component
upgradePaths | fromVersionId-toVersionId | Calculated upgrade routes between versions

Cache eviction occurs automatically when versions are registered, statuses are updated, or upgrade paths are modified.
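
The key scheme and evict-on-write behaviour can be illustrated with plain dictionaries. This is a language-neutral sketch, not the Spring Cache configuration itself: the class and method names are hypothetical, and which caches each write evicts is an assumption inferred from the sentence above.

```python
from typing import Any, Dict


class RegistryCacheSketch:
    """In-memory stand-in for the three named caches."""

    def __init__(self) -> None:
        self.component_versions: Dict[str, Any] = {}    # key: componentId
        self.recommended_versions: Dict[str, Any] = {}  # key: componentId
        self.upgrade_paths: Dict[str, Any] = {}         # key: fromVersionId-toVersionId

    @staticmethod
    def upgrade_path_key(from_version_id: str, to_version_id: str) -> str:
        """Compose the upgradePaths key in the documented 'from-to' form."""
        return f"{from_version_id}-{to_version_id}"

    def on_version_changed(self, component_id: str) -> None:
        """Assumed eviction when a version is registered or its status is updated."""
        self.component_versions.pop(component_id, None)
        self.recommended_versions.pop(component_id, None)

    def on_upgrade_path_modified(self, from_version_id: str, to_version_id: str) -> None:
        """Assumed eviction when an upgrade path between two versions changes."""
        key = self.upgrade_path_key(from_version_id, to_version_id)
        self.upgrade_paths.pop(key, None)
```

Evicting on write keeps the next read consistent with the database at the cost of one cache miss per change.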


Scheduled Tasks

Task | Schedule | Description
Version lifecycle processing | Daily at 6:00 AM | Marks versions reaching end-of-life, publishes end-of-support notifications
Canary monitoring | Every 30 seconds | Checks health of active canary deployments, triggers auto-rollback on failure
Stale execution cleanup | Every 5 minutes | Marks upgrade executions older than 24 hours as failed
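
The stale execution cleanup rule can be sketched as a pure function. The 24-hour threshold comes from the table above; the field names, the `IN_PROGRESS` and `FAILED` status strings, and the choice to skip executions that already finished are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List

STALE_AFTER = timedelta(hours=24)  # threshold taken from the table above


@dataclass
class UpgradeExecution:
    """Hypothetical minimal view of a tracked upgrade execution."""
    execution_id: str
    status: str
    started_at: datetime


def mark_stale(executions: List[UpgradeExecution], now: datetime) -> List[str]:
    """Fail in-progress executions started more than 24 hours before `now`.

    Returns the ids that were marked as failed.
    """
    cutoff = now - STALE_AFTER
    marked = []
    for execution in executions:
        if execution.status == "IN_PROGRESS" and execution.started_at < cutoff:
            execution.status = "FAILED"
            marked.append(execution.execution_id)
    return marked
```

Passing `now` in explicitly, rather than reading the clock inside the function, makes the rule easy to test against fixed timestamps.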