Offline Store (Iceberg)
The offline store uses Apache Iceberg tables for historical feature storage, providing time travel queries, schema evolution, and point-in-time correct feature retrieval for training datasets.
Iceberg Configuration

    class IcebergOfflineStore(OfflineStoreBackend):
        def __init__(
            self,
            catalog_name: str = "default",
            catalog_type: str = "hive",
            warehouse_location: str = "s3://bucket/warehouse",
            hive_metastore_uri: Optional[str] = None,
        ): ...

Table Creation
When a feature view is registered with offline_enabled=True, an Iceberg table is automatically created:

    table_id = await offline_store.create_table(
        tenant_id="acme-corp",
        feature_view="customer_features",
        schema=[
            FeatureField(name="total_purchases", dtype="float64"),
            FeatureField(name="avg_order_value", dtype="float64"),
        ],
        entity_keys=[EntityKey(name="customer_id", dtype="string")],
        partition_by=["event_timestamp"],
    )

Each table lives in a tenant-specific database and is addressed as features_{tenant_id}.{feature_view}.
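
The naming scheme can be sketched as a small helper. This is only an illustration: `table_identifier` is a hypothetical name, and replacing characters outside letters, digits, and underscores with `_` is an assumption, since the source does not say how tenant IDs such as `acme-corp` are normalized.

```python
import re


def table_identifier(tenant_id: str, feature_view: str) -> str:
    """Build a features_{tenant_id}.{feature_view} identifier (hypothetical helper)."""
    # Assumption: any character that is not a letter, digit, or underscore
    # is replaced with "_" so the database name is a plain identifier.
    safe_tenant = re.sub(r"[^A-Za-z0-9_]", "_", tenant_id)
    return f"features_{safe_tenant}.{feature_view}"


print(table_identifier("acme-corp", "customer_features"))
# prints: features_acme_corp.customer_features
```
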
Type Mapping
| MATIH Type | Iceberg Type |
|---|---|
| int64 | long |
| int32 | int |
| float64 | double |
| float32 | float |
| string | string |
| bool | boolean |
| timestamp | timestamp |
| bytes | binary |
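
For illustration, the mapping in the table can be expressed as a lookup. The names `MATIH_TO_ICEBERG` and `to_iceberg_type` are hypothetical, and raising `ValueError` for an unknown dtype is an assumption rather than documented behavior.

```python
# Mapping copied from the type table; the names here are hypothetical.
MATIH_TO_ICEBERG = {
    "int64": "long",
    "int32": "int",
    "float64": "double",
    "float32": "float",
    "string": "string",
    "bool": "boolean",
    "timestamp": "timestamp",
    "bytes": "binary",
}


def to_iceberg_type(dtype: str) -> str:
    """Translate a MATIH dtype name to its Iceberg type name."""
    try:
        return MATIH_TO_ICEBERG[dtype]
    except KeyError:
        # Assumption: unknown dtypes are rejected rather than passed through.
        raise ValueError(f"Unsupported MATIH dtype: {dtype!r}") from None


# Translate the schema used in the table creation example.
schema = [("total_purchases", "float64"), ("avg_order_value", "float64")]
iceberg_columns = [(name, to_iceberg_type(dtype)) for name, dtype in schema]
print(iceberg_columns)
# prints: [('total_purchases', 'double'), ('avg_order_value', 'double')]
```
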
Historical Feature Retrieval
Point-in-time correct retrieval finds the most recent feature value before a given timestamp:

    results = await offline_store.read_historical(
        table_id="...",
        entity_keys=[{"customer_id": "cust-123"}],
        timestamps=[datetime(2026, 1, 15)],
        feature_names=["total_purchases", "avg_order_value"],
    )

Time Travel Queries
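
Every Iceberg commit produces an immutable snapshot, and an as-of query resolves to the latest snapshot committed at or before the requested time. The sketch below shows that resolution step in plain Python. It is not the store's implementation: `Snapshot`, `resolve_snapshot`, and the sample commit times are hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Sequence


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical stand-in for a table snapshot entry."""
    snapshot_id: int
    committed_at: datetime


def resolve_snapshot(snapshots: Sequence[Snapshot], as_of: datetime) -> Optional[Snapshot]:
    """Return the latest snapshot committed at or before as_of, or None."""
    ordered = sorted(snapshots, key=lambda s: s.committed_at)
    commit_times = [s.committed_at for s in ordered]
    # bisect_right counts the snapshots with committed_at <= as_of.
    idx = bisect_right(commit_times, as_of)
    return ordered[idx - 1] if idx else None


# Made-up commit history for illustration only.
history = [
    Snapshot(1, datetime(2025, 12, 20)),
    Snapshot(2, datetime(2025, 12, 31)),
    Snapshot(3, datetime(2026, 1, 10)),
]
chosen = resolve_snapshot(history, datetime(2026, 1, 1))
print(chosen.snapshot_id if chosen else None)
# prints: 2
```

With this sample history, an as-of time of 2026-01-01 resolves to snapshot 2, the last commit before that date; a time earlier than every commit resolves to no snapshot at all.
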
Iceberg's snapshot-based architecture enables querying the table as of any historical point:

    historical_data = await offline_store.time_travel_query(
        table_id="...",
        as_of=datetime(2026, 1, 1),
        filters={"is_premium": True},
    )

Point-in-Time Joins for Training
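
Temporal correctness means each training row sees only feature values recorded no later than its own event timestamp, so later values cannot leak into training data. A minimal in-memory sketch of that rule follows. It is not the PointInTimeJoinEngine implementation: the helper names and feature values are made up, and treating the boundary as inclusive (at or before) is an assumption.

```python
from bisect import bisect_right
from datetime import datetime
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical feature history: entity id -> [(event time, value), ...],
# with each list sorted by event time.
FeatureHistory = Dict[str, List[Tuple[datetime, Any]]]


def point_in_time_lookup(history: FeatureHistory, entity_id: str, ts: datetime) -> Optional[Any]:
    """Return the latest value recorded at or before ts, or None if there is none."""
    rows = history.get(entity_id, [])
    times = [t for t, _ in rows]
    idx = bisect_right(times, ts)  # number of rows with event time <= ts
    return rows[idx - 1][1] if idx else None


def join_training_rows(
    entity_rows: List[Dict[str, str]], history: FeatureHistory, feature_ref: str
) -> List[Dict[str, Any]]:
    """Attach one feature column to each entity row without looking into the future."""
    joined = []
    for row in entity_rows:
        ts = datetime.fromisoformat(row["event_timestamp"])
        value = point_in_time_lookup(history, row["customer_id"], ts)
        joined.append({**row, feature_ref: value})
    return joined


# Made-up feature values for illustration only.
history = {
    "c1": [(datetime(2026, 1, 10), 3.0), (datetime(2026, 1, 20), 5.0)],
    "c2": [(datetime(2026, 1, 16), 7.0)],
}
entity_rows = [
    {"customer_id": "c1", "event_timestamp": "2026-01-15T10:00:00"},
    {"customer_id": "c2", "event_timestamp": "2026-01-15T11:00:00"},
]
feature_ref = "customer_features:total_purchases"
rows = join_training_rows(entity_rows, history, feature_ref)
print([r[feature_ref] for r in rows])
# prints: [3.0, None]
```

Here `c1` receives the value from 2026-01-10 rather than the later one from 2026-01-20, and `c2` receives nothing because its only value was recorded after the event timestamp.
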
The PointInTimeJoinEngine retrieves training features with temporal correctness:

    result = await store.get_training_features(
        tenant_id="acme-corp",
        entity_df=[
            {"customer_id": "c1", "event_timestamp": "2026-01-15T10:00:00"},
            {"customer_id": "c2", "event_timestamp": "2026-01-15T11:00:00"},
        ],
        feature_refs=["customer_features:total_purchases", "customer_features:avg_order_value"],
        event_timestamp_column="event_timestamp",
    )
    # result.records: [{customer_id, event_timestamp, customer_features:total_purchases, ...}]
    # result.features_found: 4
    # result.features_missing: 0

Source Files
| File | Path |
|---|---|
| Iceberg Offline Store | data-plane/ml-service/src/features/iceberg_offline_store.py |
| Feast Offline Store | data-plane/ml-service/src/features/feast_offline_store.py |
| UnifiedFeatureStore | data-plane/ml-service/src/features/unified_feature_store.py |