Manufacturing & Supply Chain
End-to-end walkthroughs showing how a precision parts manufacturer uses the MATIH Platform to unify IoT sensor data, predict equipment failures, automate quality inspection, and optimize plant operations with AI-driven analytics.
Industry Context
Manufacturing generates more data per facility than almost any other industry -- yet most of it goes unused. A single CNC machine produces thousands of sensor readings per minute: vibration, temperature, spindle speed, coolant pressure, power draw. Multiply that by hundreds of machines across multiple plants, and you have a firehose of time-series data that rarely makes it past the historian database.
Meanwhile, production planning lives in SAP, quality records sit in a separate MES, supplier data arrives as CSV attachments, and energy consumption is tracked by yet another system. The result: maintenance teams react to breakdowns instead of preventing them, quality engineers catch defects after they reach customers, and plant managers piece together OEE numbers from spreadsheets.
The MATIH Platform bridges these silos. IoT sensor streams, ERP transactions, quality records, and supply chain data flow into a single governed environment. Data scientists build predictive maintenance models. ML engineers deploy real-time inspection systems. BI leads create plant floor dashboards. And executives ask natural language questions about cross-plant performance.
| Challenge | Impact | Platform Response |
|---|---|---|
| IoT data volume | 100M+ sensor readings/day across 200 machines, mostly unused | Streaming ingestion via Kafka, time-series aggregation, S3/DuckDB for efficient querying |
| Predictive maintenance | Unplanned downtime costs $10K-50K per hour per production line | ML Workbench for survival models, Ray Serve for real-time health scoring |
| Quality control | Manual inspection catches only 85% of defects, costs $2M/year in scrap | Visual inspection models, automated defect classification, quality gate pipelines |
| Supply chain visibility | Late supplier deliveries cause production schedule disruptions 3x/month | Federated queries across ERP, supplier portal, and logistics data |
| Energy optimization | Energy is 15-25% of production cost, with significant waste during idle periods | Real-time energy dashboards, ML-driven consumption optimization |
| Cross-plant standardization | Each plant uses different codes, naming conventions, and processes | Ontology Service for taxonomy standardization, semantic layer for unified metrics |
Company Profile: Apex Manufacturing
All walkthroughs in this section follow employees at Apex Manufacturing, a fictional precision parts manufacturer with the following profile:
| Attribute | Value |
|---|---|
| Annual Revenue | $320M |
| Products | Precision-machined aerospace and automotive components |
| Plants | 4 facilities across 2 states |
| Equipment | 200 CNC machines (lathes, mills, grinders, EDM) |
| Workforce | 1,200 production employees, 45 engineers |
| Shifts | 2 shifts (6am-2pm, 2pm-10pm), 5 days/week |
| Data Team | 3 data scientists, 2 ML engineers, 2 BI analysts, 1 COO |
| Key Customers | Tier 1 aerospace suppliers, automotive OEMs |
| Certifications | AS9100D (aerospace), IATF 16949 (automotive), ISO 14001 (environmental) |
Sample Datasets
These are the core datasets used across all four walkthroughs. In a production deployment, these tables live in their respective source systems and are ingested into the platform via Airbyte connectors, Kafka streaming, or file imports.
Sensor and Machine Data
| Dataset | Source | Rows | Key Columns |
|---|---|---|---|
| sensor_readings | IoT Gateway (Kafka) | 100M | machine_id, sensor_type, value, unit, timestamp, quality_flag |
| equipment_registry | SAP PostgreSQL | 5K | machine_id, machine_type, manufacturer, install_date, plant_id, line_id |
| maintenance_logs | CMMS PostgreSQL | 200K | log_id, machine_id, maintenance_type, description, technician_id, duration_hours, parts_replaced, timestamp |
Production and Quality Data
| Dataset | Source | Rows | Key Columns |
|---|---|---|---|
| production_orders | SAP PostgreSQL | 500K | order_id, product_id, machine_id, planned_qty, actual_qty, start_time, end_time, status |
| quality_inspections | QMS PostgreSQL | 1M | inspection_id, order_id, machine_id, dimension_measured, spec_min, spec_max, actual_value, pass_fail, inspector_id |
| defect_images | Image Metadata DB | 250K | image_id, inspection_id, defect_type, confidence_score, bounding_box, camera_id, timestamp |
Supply Chain and Energy Data
| Dataset | Source | Rows | Key Columns |
|---|---|---|---|
| supplier_deliveries | Supplier Portal CSV | 300K | delivery_id, supplier_id, material_id, ordered_qty, delivered_qty, promised_date, actual_date, quality_grade |
| energy_consumption | Smart Meters (Kafka) | 50M | meter_id, machine_id, plant_id, kwh, power_factor, demand_kw, timestamp |
| bill_of_materials | SAP PostgreSQL | 85K | bom_id, parent_part, child_part, quantity_per, unit, lead_time_days |
Data Sources
Apex Manufacturing's data lives in six systems. The platform connects to all of them through the Ingestion Service (Airbyte connectors), Kafka streaming, and file imports.
┌──────────────────────────────────────────────────────────────────────┐
│ MATIH Ingestion Layer │
│ (Airbyte Connectors + Kafka + File Import) │
└──────┬────────┬───────────┬──────────┬──────────┬──────────┬────────┘
│ │ │ │ │ │
┌─────▼────┐ ┌─▼────────┐ ┌▼────────┐ ┌▼───────┐ ┌▼──────┐ ┌▼───────┐
│ SCADA / │ │ SAP ERP │ │ IoT │ │ QMS │ │ CSV │ │ Smart │
│Historian │ │PostgreSQL│ │ Gateway │ │Postgres│ │Supplier│ │ Meters │
│(OPC-UA) │ │ │ │(MQTT→ │ │ │ │ Portal│ │(Kafka) │
│ │ │equipment │ │ Kafka) │ │quality │ │ │ │ │
│ machine │ │production│ │ sensor │ │inspect │ │deliver│ │ energy │
│ status │ │orders │ │readings │ │defects │ │ies │ │ usage │
│ alarms │ │materials │ │ │ │ │ │ │ │ │
└──────────┘ └──────────┘ └─────────┘ └────────┘ └───────┘ └────────┘
| Source | Connector Type | Sync Mode | Frequency |
|---|---|---|---|
| SCADA / Historian (OPC-UA) | Airbyte custom connector | Incremental (timestamp) | Every 5 minutes |
| SAP ERP PostgreSQL | Airbyte PostgreSQL connector | CDC (incremental) | Every 15 minutes |
| IoT Gateway (MQTT to Kafka) | Kafka streaming ingestion | Streaming (real-time) | Continuous |
| Quality Management System | Airbyte PostgreSQL connector | CDC (incremental) | Every 15 minutes |
| Supplier Portal exports | File Import (Data Workbench) | Full refresh | Weekly |
| Smart Energy Meters | Kafka streaming ingestion | Streaming (real-time) | Continuous (1-min intervals) |
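The two streaming sources are aggregated into fixed time windows before they land in storage; the data flow in the next section shows 5-minute windows for sensor readings. The following is a minimal sketch of that kind of tumbling-window aggregation in plain Python, not the platform's actual implementation. The `SensorReading` class, the `aggregate` function, the `"good"` value for `quality_flag`, and the choice of summary statistics are all assumptions made for illustration; only the field names come from the `sensor_readings` dataset.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone
from statistics import mean

WINDOW_SECONDS = 300  # 5-minute tumbling window


@dataclass(frozen=True)
class SensorReading:
    # Field names follow the sensor_readings key columns.
    machine_id: str
    sensor_type: str
    value: float
    unit: str
    timestamp: datetime  # expected to be timezone-aware (UTC)
    quality_flag: str    # assumed values: "good" or anything else


def window_start(ts: datetime) -> datetime:
    """Floor a timestamp to the start of its 5-minute window."""
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch - epoch % WINDOW_SECONDS, tz=timezone.utc)


def aggregate(readings):
    """Summarize readings per (machine_id, sensor_type, window start)."""
    buckets = defaultdict(list)
    for r in readings:
        if r.quality_flag != "good":
            continue  # assumption: flagged readings are dropped before aggregation
        key = (r.machine_id, r.sensor_type, window_start(r.timestamp))
        buckets[key].append(r.value)
    return {
        key: {"count": len(v), "mean": mean(v), "min": min(v), "max": max(v)}
        for key, v in buckets.items()
    }
```

In a streaming deployment the same grouping would typically run inside the stream processor rather than over an in-memory list; the sketch only illustrates how readings map to windows.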
Data Flow Architecture
Apex Manufacturing Data Flow
┌─────────────────────────────────────────────────────────────────────┐
│ STREAMING LAYER │
│ │
│ IoT Sensors ──▶ MQTT ──▶ Kafka ──▶ 5-min Aggregation ──▶ S3 │
│ Energy Meters ──▶ Kafka ──▶ 1-min Aggregation ──▶ S3 │
└──────────────────────────────┬──────────────────────────────────────┘
│
┌──────────────────────────────▼──────────────────────────────────────┐
│ PLATFORM DATA LAYER │
│ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────────────┐ │
│ │ Catalog │ │ Query │ │ Quality │ │ Governance │ │
│ │ Service │ │ Engine │ │ Service │ │ Service │ │
│ │ │ │ (Trino + │ │ (GX) │ │ (access control, │ │
│ │ 47 sensor│ │ DuckDB) │ │ │ │ audit trails) │ │
│ │ types │ │ │ │ sensor │ │ │ │
│ │ profiled │ │ time- │ │ quality │ │ │ │
│ │ │ │ series │ │ gates │ │ │ │
│ └──────────┘ └──────────┘ └──────────┘ └──────────────────┘ │
└──────────────────────────────┬──────────────────────────────────────┘
│
┌──────────────────────┼──────────────────────┐
▼ ▼ ▼
┌──────────────┐ ┌──────────────┐ ┌───────────────────┐
│ ML Workbench │ │ BI Workbench │ │ Agentic Workbench │
│ │ │ │ │ │
│ Predictive │ │ OEE │ │ "Why did Plant 2 │
│ maintenance, │ │ dashboards, │ │ OEE drop last │
│ quality │ │ energy │ │ week?" │
│ models │ │ analytics │ │ │
└──────────────┘ └──────────────┘ └───────────────────┘
Business KPIs
Apex Manufacturing tracks these key performance indicators across all workbenches and dashboards. Each walkthrough shows how the platform computes, monitors, and acts on these metrics.
Equipment Performance
| KPI | Definition | Current Value | Target |
|---|---|---|---|
| OEE (Overall Equipment Effectiveness) | Availability x Performance x Quality | 72.4% | > 85% (world-class) |
| Availability Rate | Run time / planned production time | 88.1% | > 95% |
| Performance Rate | Actual output / theoretical max output | 91.2% | > 95% |
| Quality Rate | Good parts / total parts produced | 90.1% | > 99% |
| MTBF (Mean Time Between Failures) | Average operating hours between unplanned stops | 340 hours | > 500 hours |
| MTTR (Mean Time To Repair) | Average hours to restore machine to production | 4.2 hours | < 2 hours |
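The OEE figure in the table is the product of the three component rates listed beneath it. As a quick check, the sketch below multiplies the current Availability, Performance, and Quality values and reproduces the reported 72.4%. The `oee` function name is illustrative and not part of the platform.

```python
def oee(availability: float, performance: float, quality: float) -> float:
    """Overall Equipment Effectiveness as the product of its three rates (0..1)."""
    for name, rate in (("availability", availability),
                       ("performance", performance),
                       ("quality", quality)):
        if not 0.0 <= rate <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {rate}")
    return availability * performance * quality


# Current values from the Equipment Performance table.
current = oee(0.881, 0.912, 0.901)
print(f"Current OEE: {current:.1%}")    # Current OEE: 72.4%

# OEE if each component just reached its stated target threshold.
at_target = oee(0.95, 0.95, 0.99)
print(f"OEE at targets: {at_target:.1%}")  # OEE at targets: 89.3%
```

Because the rates multiply, a shortfall in any one component caps the overall figure; here Quality Rate (90.1% against a 99% target) shows the largest gap.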
Production and Quality
| KPI | Definition | Current Value | Target |
|---|---|---|---|
| First-Pass Yield | % of parts meeting spec on first inspection | 94.3% | > 98% |
| Scrap Rate | Scrap cost / total production cost | 3.8% | < 2% |
| On-Time Delivery | % of orders delivered by promised date | 91.7% | > 98% |
| Cycle Time Variance | Actual cycle time vs standard (std dev) | +/- 12% | +/- 5% |
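First-Pass Yield can be derived from the `quality_inspections` columns by checking each measurement against its specification limits. The sketch below assumes, for simplicity, that every inspection row counts as one unit and that a measurement passes when it lies within `spec_min` and `spec_max` inclusive; a real definition might instead roll up all dimensions of a part before counting it. The `Inspection` class and `first_pass_yield` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inspection:
    # Subset of the quality_inspections key columns.
    inspection_id: str
    order_id: str
    spec_min: float
    spec_max: float
    actual_value: float

    @property
    def in_spec(self) -> bool:
        # Assumption: specification limits are inclusive.
        return self.spec_min <= self.actual_value <= self.spec_max


def first_pass_yield(inspections) -> float:
    """Share of inspection records whose measurement is within spec."""
    rows = list(inspections)
    if not rows:
        raise ValueError("no inspections supplied")
    return sum(1 for row in rows if row.in_spec) / len(rows)
```

For example, four inspections of which three fall within their limits give a yield of 0.75.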
Supply Chain and Cost
| KPI | Definition | Current Value | Target |
|---|---|---|---|
| Supplier OTD | % of supplier deliveries on time | 87.3% | > 95% |
| Energy per Unit | kWh consumed per part produced | 2.8 kWh | < 2.2 kWh |
| Maintenance Cost per Unit | Total maintenance spend / parts produced | $1.42 | < $1.00 |
| Inventory Days of Supply | Raw material inventory / daily consumption | 18 days | 10-12 days |
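Two of these KPIs follow directly from their definitions. The sketch below computes Supplier OTD from `promised_date` and `actual_date` pairs and Inventory Days of Supply from an inventory level and a daily consumption rate. It assumes a delivery is on time when it arrives on or before the promised date, with no grace period; the function names are illustrative.

```python
from datetime import date


def supplier_otd(deliveries) -> float:
    """Fraction of deliveries on time.

    deliveries: iterable of (promised_date, actual_date) pairs.
    Assumption: on time means actual_date <= promised_date.
    """
    rows = list(deliveries)
    if not rows:
        raise ValueError("no deliveries supplied")
    on_time = sum(1 for promised, actual in rows if actual <= promised)
    return on_time / len(rows)


def days_of_supply(inventory_units: float, daily_consumption_units: float) -> float:
    """Raw material inventory divided by daily consumption."""
    if daily_consumption_units <= 0:
        raise ValueError("daily consumption must be positive")
    return inventory_units / daily_consumption_units


# Hypothetical quantities: 9,000 units on hand, 500 consumed per day.
print(days_of_supply(9000, 500))  # 18.0
```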
Persona Walkthroughs
Each walkthrough follows one persona through all eight lifecycle stages, using real Apex Manufacturing data and scenarios. Start with the role closest to yours, or read all four to see how the platform enables cross-functional collaboration.
| Walkthrough | Persona | Scenario | Primary Workbenches |
|---|---|---|---|
| Data Scientist Journey | Lin Wei, Senior Data Scientist | Building a predictive maintenance model for 200 CNC machines to reduce unplanned downtime | ML Workbench, Data Workbench |
| ML Engineer Journey | Tomas Rivera, ML Engineer | Deploying an automated visual inspection system for real-time defect detection on the production line | ML Workbench, Pipeline Service |
| BI Lead Journey | Carlos Mendez, BI Lead | Creating the plant performance and OEE analytics platform for 4 facilities | BI Workbench, Semantic Layer |
| Executive Leadership Journey | Karen Singh, COO | Using AI-assisted analytics for cross-plant operational strategy and capital planning | Agentic Workbench, BI Dashboards |
How the Walkthroughs Connect
These four personas work on the same data at Apex Manufacturing. Their work products feed into each other:
Lin Wei (Data Scientist) Tomas Rivera (ML Engineer)
┌──────────────────────┐ ┌──────────────────────┐
│ Predictive maint. │ │ Visual quality │
│ model (C-index 0.84) │─────────────▶│ inspection pipeline │
│ │ model │ │
│ Sensor feature │ registry │ Edge deployment │
│ engineering │ │ P99 < 200ms │
└──────────┬───────────┘ └──────────┬───────────┘
│ health scores │ defect rates
▼ ▼
┌──────────────────────┐ ┌──────────────────────┐
│ Carlos (BI Lead) │ │ Karen (COO) │
│ │◀─────────────│ │
│ OEE dashboards, │ dashboard │ Cross-plant strategy,│
│ maintenance cost │ access │ capital planning, │
│ tracking │ │ scenario analysis │
└──────────────────────┘ └──────────────────────┘
Lin Wei's predictive maintenance scores feed into Carlos's equipment health dashboards. Tomas's visual inspection system provides real-time quality data that drives the Quality Rate component of OEE. Carlos's dashboards give Karen the operational visibility she needs for capital planning decisions. The semantic layer ensures all four personas use the same OEE, MTBF, and quality metric definitions.
Prerequisites
Before following these walkthroughs, ensure you have:
- A running MATIH Platform instance (see Installation)
- The Apex Manufacturing sample dataset loaded (available in the platform's sample data catalog)
- Completed the Quickstart Tutorials for the workbenches you plan to use
Related Chapters
- Data Ingestion -- Configuring Airbyte connectors, Kafka streaming, and file imports
- Query Engine -- SQL federation and DuckDB for time-series analytics
- Data Catalog -- Metadata management, profiling, and lineage
- Pipelines -- Temporal-based orchestration for streaming and batch
- ML Service -- Model training, registry, and Ray Serve deployment
- AI Service -- Text-to-SQL and multi-agent chat