MATIH Platform is in active MVP development. Documentation reflects current implementation status.

Feature Store Architecture

The MATIH Unified Feature Store manages features end to end. It combines feature declaration, lifecycle management, offline and online storage, materialization, streaming features, embedding features, and serving in a single system.


Architecture Overview

  +-------------------+     +--------------------+     +-------------------+
  | Feature           |     | Streaming Pipeline |     | Embedding Pipeline|
  | Declaration       |     | (Flink/Kafka)      |     | (OpenAI/Custom)   |
  +--------+----------+     +---------+----------+     +--------+----------+
           |                          |                         |
  +--------v----------+     +---------v----------+     +--------v----------+
  | Feature Registry  |     | Online Store       |     | Vector Index      |
  | (Lifecycle FSM)   +---->| (Redis/Aerospike)  |<----+ (Pinecone-style)  |
  +--------+----------+     +---------+----------+     +-------------------+
           |                          ^
  +--------v----------+               |
  | Offline Store     | Materialization
  | (Iceberg tables)  +---------------+
  +--------+----------+
           |
  +--------v----------+
  | Point-in-Time     |
  | Join Engine       |
  +-------------------+
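
The Materialization arrow in the diagram, together with the serving path that reads from the online store, can be pictured with a small in-memory sketch. This is an illustration only: `materialize`, `serve`, and the tuple-based row format are hypothetical names chosen here, and plain Python containers stand in for the Iceberg and Redis/Aerospike backends.

```python
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical in-memory stand-ins: a list of rows plays the offline store,
# a dict plays the online store. Not the MATIH API.
OfflineRow = Tuple[str, int, Dict[str, Any]]  # (entity_id, event_ts, features)


def materialize(offline_rows: List[OfflineRow]) -> Dict[str, Dict[str, Any]]:
    """Copy the most recent feature row per entity into an online key-value map."""
    latest_ts: Dict[str, int] = {}
    online: Dict[str, Dict[str, Any]] = {}
    for entity_id, event_ts, features in offline_rows:
        # Keep only the newest row seen so far for each entity.
        if entity_id not in latest_ts or event_ts > latest_ts[entity_id]:
            latest_ts[entity_id] = event_ts
            online[entity_id] = dict(features)
    return online


def serve(
    entity_id: str,
    online: Dict[str, Dict[str, Any]],
    offline_rows: List[OfflineRow],
) -> Optional[Dict[str, Any]]:
    """Look up the online map first; fall back to the offline rows on a miss."""
    if entity_id in online:
        return online[entity_id]
    candidates = [(ts, feats) for eid, ts, feats in offline_rows if eid == entity_id]
    if not candidates:
        return None
    # Return the features attached to the newest offline row.
    return max(candidates, key=lambda c: c[0])[1]


offline = [
    ("u1", 100, {"clicks": 3}),
    ("u1", 200, {"clicks": 5}),
    ("u2", 150, {"clicks": 1}),
]
online = materialize(offline)
```

Here `online["u1"]` holds the row written at timestamp 200, and a lookup for an entity missing from both stores returns `None`. A real materialization job would also handle TTLs, incremental windows, and write batching, none of which are shown.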

Core Components

| Component             | Class            | Purpose                                      |
|-----------------------|------------------|----------------------------------------------|
| UnifiedFeatureStore   | Main entry point | Connects all feature store subsystems        |
| RedisOnlineStore      | Online backend   | Low-latency feature serving via Redis        |
| AerospikeOnlineStore  | Online backend   | High-throughput serving via Aerospike        |
| IcebergOfflineStore   | Offline backend  | Historical features with time travel         |
| MaterializationEngine | ETL              | Moves data from offline to online store      |
| StreamingPipeline     | Real-time        | Flink SQL-based streaming aggregations       |
| EmbeddingPipeline     | Vectors          | Feature embedding and similarity search      |
| PointInTimeJoinEngine | Training         | Point-in-time correct feature retrieval      |
| FeatureServingLayer   | Serving          | Unified lookup with online/offline fallback  |
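
To make "point-in-time correct" concrete: for each training label, the join must pick the latest feature value observed at or before the label's timestamp, never a later one, so that no future information leaks into training data. The following is a minimal sketch of that rule using only the standard library. The function name `point_in_time_join` and the data layout are assumptions for illustration, not the `PointInTimeJoinEngine` interface.

```python
from bisect import bisect_right
from typing import Any, Dict, List, Optional, Tuple


def point_in_time_join(
    labels: List[Tuple[str, int]],
    feature_history: Dict[str, List[Tuple[int, Any]]],
) -> List[Tuple[str, int, Optional[Any]]]:
    """For each (entity_id, label_ts), pick the latest value with feature_ts <= label_ts.

    feature_history maps entity_id to (feature_ts, value) pairs sorted by feature_ts.
    """
    joined = []
    for entity_id, label_ts in labels:
        history = feature_history.get(entity_id, [])
        timestamps = [ts for ts, _ in history]
        # bisect_right returns how many timestamps are <= label_ts.
        idx = bisect_right(timestamps, label_ts)
        # idx == 0 means no feature value existed yet at label time.
        value = history[idx - 1][1] if idx > 0 else None
        joined.append((entity_id, label_ts, value))
    return joined


history = {"u1": [(100, 3), (200, 5), (300, 9)]}
rows = point_in_time_join([("u1", 250), ("u1", 50), ("u1", 300)], history)
```

The label at timestamp 250 receives the value written at 200, the label at 50 receives `None` because no feature existed yet, and the label at 300 receives the value written at exactly 300. A production engine would typically perform this as a distributed as-of join over the offline tables rather than in Python loops.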

Feature Lifecycle States

Features follow a state machine from declaration through production to eventual decommissioning:

| State            | Description                            |
|------------------|----------------------------------------|
| draft            | Initial declaration, not yet validated |
| validating       | Validation in progress                 |
| pending_approval | Awaiting human approval                |
| approved         | Approved, ready for materialization    |
| materializing    | Initial data load in progress          |
| active           | Serving production traffic             |
| suspended        | Temporarily disabled                   |
| deprecated       | Marked for removal                     |
| archived         | Fully decommissioned                   |
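
A lifecycle like this is commonly enforced with an explicit transition table. The sketch below uses the state names listed above, but the allowed transitions are an assumption inferred from the state descriptions; the actual rules in the Feature Registry may differ. `FeatureState`, `ALLOWED_TRANSITIONS`, and `transition` are hypothetical names.

```python
from enum import Enum
from typing import Dict, Set


class FeatureState(str, Enum):
    DRAFT = "draft"
    VALIDATING = "validating"
    PENDING_APPROVAL = "pending_approval"
    APPROVED = "approved"
    MATERIALIZING = "materializing"
    ACTIVE = "active"
    SUSPENDED = "suspended"
    DEPRECATED = "deprecated"
    ARCHIVED = "archived"


S = FeatureState

# ASSUMED transition rules, for illustration only.
ALLOWED_TRANSITIONS: Dict[FeatureState, Set[FeatureState]] = {
    S.DRAFT: {S.VALIDATING},
    S.VALIDATING: {S.PENDING_APPROVAL, S.DRAFT},  # failed validation returns to draft
    S.PENDING_APPROVAL: {S.APPROVED, S.DRAFT},    # rejection returns to draft
    S.APPROVED: {S.MATERIALIZING},
    S.MATERIALIZING: {S.ACTIVE},
    S.ACTIVE: {S.SUSPENDED, S.DEPRECATED},
    S.SUSPENDED: {S.ACTIVE, S.DEPRECATED},
    S.DEPRECATED: {S.ARCHIVED},
    S.ARCHIVED: set(),                            # terminal state
}


def transition(current: FeatureState, target: FeatureState) -> FeatureState:
    """Return the target state if the move is allowed, otherwise raise ValueError."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target


state = transition(FeatureState.DRAFT, FeatureState.VALIDATING)
```

Keeping the rules in one table makes illegal moves, such as jumping from `draft` directly to `active`, fail loudly instead of silently corrupting registry state.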

Factory Function

import asyncio

from src.features.unified_feature_store import create_unified_feature_store


async def main():
    # Build a store with Aerospike as the online backend and Iceberg as the offline backend.
    store = create_unified_feature_store(
        online_store_type="aerospike",
        aerospike_config={"hosts": [("aerospike", 3000)], "namespace": "features"},
        iceberg_config={"catalog_name": "hive", "warehouse_location": "s3://warehouse"},
        streaming_config={"kafka_bootstrap_servers": "kafka:9092"},
        embedding_config={"model_type": "openai", "dimension": 1536},
    )
    # initialize() is awaited, so it must run inside an event loop.
    await store.initialize()
    return store


asyncio.run(main())

Section Contents

| Page               | Description                                     |
|--------------------|-------------------------------------------------|
| Feast Integration  | Feast registry, feature groups, materialization |
| Online Store       | Aerospike/Redis online serving                  |
| Offline Store      | Iceberg offline store with time travel          |
| Streaming Features | Real-time feature computation                   |
| Embedding Features | Vector embedding feature management             |
| Feature Versioning | Version management and compatibility            |
| Agentic Interface  | Agent-driven feature discovery                  |

Source Files

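| File                | Path                                                          |
|---------------------|---------------------------------------------------------------|
| UnifiedFeatureStore | data-plane/ml-service/src/features/unified_feature_store.py   |
| Feature Store API   | data-plane/ml-service/src/api/feature_api.py                  |
| Feast Registry      | data-plane/ml-service/src/features/feast_registry_service.py  |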