MATIH Platform is in active MVP development. Documentation reflects current implementation status.

Data Classification

Data classification labels data assets by sensitivity level and category, enabling automated policy enforcement. The Governance Service supports manual classification, automatic classification of a single column, and asynchronous batch auto-classification.


Classification Model

Each classification record captures the sensitivity and category of a data resource:

Sensitivity Levels

Level         Description
PUBLIC        Publicly available, no restrictions
INTERNAL      Internal use only, general business data
CONFIDENTIAL  Restricted access, business-sensitive
RESTRICTED    Highly restricted, regulatory or PII/PHI/PCI

Data Categories

Category   Description
PII        Personally Identifiable Information (name, SSN, email)
PHI        Protected Health Information (medical records, diagnoses)
PCI        Payment Card Industry (card numbers, CVV)
FINANCIAL  Financial data (revenues, transactions)
BUSINESS   General business data
PUBLIC     Public information
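
On the client side, the two vocabularies above could be modeled as enums. The following is a minimal sketch, not the service's own model: the class names, the numeric ordering of the levels (assumed to run from least to most restrictive, following the table order), and the `strictest` helper are all hypothetical.

```python
from enum import Enum, IntEnum


class SensitivityLevel(IntEnum):
    """Sensitivity levels; the numeric order is an assumption for comparisons."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


class DataCategory(str, Enum):
    """Data categories, using the same string values as the API payloads."""
    PII = "PII"
    PHI = "PHI"
    PCI = "PCI"
    FINANCIAL = "FINANCIAL"
    BUSINESS = "BUSINESS"
    PUBLIC = "PUBLIC"


def strictest(levels):
    """Return the most restrictive level in an iterable (hypothetical helper)."""
    return max(levels, default=SensitivityLevel.PUBLIC)


# Parse the string values that appear in API payloads.
level = SensitivityLevel["RESTRICTED"]
category = DataCategory("PII")
```

Using an ordered enum makes it easy to derive a table-level sensitivity from its column-level classifications, if that is how a client chooses to aggregate them.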

Create Classification

Manually classify a data resource:

POST /api/v1/governance/classifications
curl -X POST "http://localhost:8080/api/v1/governance/classifications" \
  -H "Content-Type: application/json" \
  -H "X-Tenant-ID: 550e8400-e29b-41d4-a716-446655440000" \
  -H "X-User-ID: 550e8400-e29b-41d4-a716-446655440099" \
  -d '{
    "resourceId": "warehouse.analytics.customer_data",
    "resourceType": "TABLE",
    "columnName": "social_security_number",
    "level": "RESTRICTED",
    "category": "PII",
    "confidence": 1.0,
    "justification": "Column contains SSN values confirmed by data steward"
  }'

Response

{
  "id": "cls-001",
  "tenantId": "550e8400-...",
  "resourceId": "warehouse.analytics.customer_data",
  "columnName": "social_security_number",
  "level": "RESTRICTED",
  "category": "PII",
  "confidence": 1.0,
  "classifiedBy": "550e8400-e29b-41d4-a716-446655440099",
  "verified": false,
  "createdAt": "2026-02-12T10:00:00Z"
}
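
The same request can be assembled from Python with the standard library. This is a sketch under assumptions: `BASE_URL` reuses the host from the curl example, and `build_create_request` is a hypothetical helper name. The request is only built here, not sent.

```python
import json
import urllib.request

# Host and path taken from the curl example; adjust for a real deployment.
BASE_URL = "http://localhost:8080/api/v1/governance"


def build_create_request(tenant_id, user_id, payload):
    """Build (but do not send) the POST request for a manual classification."""
    return urllib.request.Request(
        url=f"{BASE_URL}/classifications",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-Tenant-ID": tenant_id,
            "X-User-ID": user_id,
        },
    )


request = build_create_request(
    tenant_id="550e8400-e29b-41d4-a716-446655440000",
    user_id="550e8400-e29b-41d4-a716-446655440099",
    payload={
        "resourceId": "warehouse.analytics.customer_data",
        "resourceType": "TABLE",
        "columnName": "social_security_number",
        "level": "RESTRICTED",
        "category": "PII",
        "confidence": 1.0,
        "justification": "Column contains SSN values confirmed by data steward",
    },
)
# urllib.request.urlopen(request) would send it to a running service.
```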

Auto-Classify

Automatically classify a column based on its name and sample data patterns:

POST /api/v1/governance/classifications/auto
curl -X POST "http://localhost:8080/api/v1/governance/classifications/auto" \
  -H "Content-Type: application/json" \
  -H "X-Tenant-ID: 550e8400-e29b-41d4-a716-446655440000" \
  -d '{
    "resourceId": "warehouse.raw.customer_profiles",
    "columnName": "email_address",
    "sampleValues": ["john@example.com", "jane@company.org", "admin@test.net"]
  }'

Response

{
  "id": "cls-002",
  "resourceId": "warehouse.raw.customer_profiles",
  "columnName": "email_address",
  "level": "CONFIDENTIAL",
  "category": "PII",
  "confidence": 0.95,
  "classificationMethod": "AUTO",
  "patternMatched": "EMAIL_PATTERN",
  "createdAt": "2026-02-12T10:05:00Z"
}
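
To illustrate the idea of matching sample values against patterns, here is a minimal sketch. It is not the Governance Service's implementation: the rule list, the regular expressions, the `SSN_PATTERN` name, and the choice to report the fraction of matching samples as the confidence are all assumptions, and the sketch ignores column-name hints.

```python
import re

# Hypothetical rules: (pattern name, value regex, level, category).
RULES = [
    ("EMAIL_PATTERN", re.compile(r"[^@\s]+@[^@\s]+\.[A-Za-z]{2,}"), "CONFIDENTIAL", "PII"),
    ("SSN_PATTERN", re.compile(r"\d{3}-\d{2}-\d{4}"), "RESTRICTED", "PII"),
]


def auto_classify(sample_values):
    """Return the best-matching rule as a dict, or None if nothing matches."""
    if not sample_values:
        return None
    best = None
    for name, regex, level, category in RULES:
        # Count samples that match the rule's pattern in full.
        hits = sum(1 for value in sample_values if regex.fullmatch(value))
        ratio = hits / len(sample_values)
        if hits and (best is None or ratio > best["confidence"]):
            best = {
                "level": level,
                "category": category,
                "confidence": ratio,
                "classificationMethod": "AUTO",
                "patternMatched": name,
            }
    return best


result = auto_classify(["john@example.com", "jane@company.org", "admin@test.net"])
```

In this sketch every sample matches the email rule, so the result selects `EMAIL_PATTERN`. A real classifier would likely combine several signals, which may be why the service response above reports a confidence below 1.0.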

Batch Auto-Classify

Classify multiple columns in a single request:

POST /api/v1/governance/classifications/auto/batch
curl -X POST "http://localhost:8080/api/v1/governance/classifications/auto/batch" \
  -H "Content-Type: application/json" \
  -H "X-Tenant-ID: 550e8400-e29b-41d4-a716-446655440000" \
  -d '[
    {
      "resourceId": "warehouse.raw.orders",
      "columnName": "credit_card_number",
      "sampleValues": ["4111111111111111", "5500000000000004"]
    },
    {
      "resourceId": "warehouse.raw.orders",
      "columnName": "billing_address",
      "sampleValues": ["123 Main St, City, ST 12345"]
    }
  ]'

Returns 202 Accepted. Batch classification runs asynchronously, so the response does not contain the classification results.
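
When several columns of one table need classifying, the batch body can be generated from a mapping of column names to sample values. A minimal sketch, assuming a hypothetical helper `build_batch_body`; the item fields match the request shown above.

```python
import json


def build_batch_body(resource_id, samples_by_column):
    """Return the JSON array body for the batch auto-classify endpoint."""
    items = [
        {
            "resourceId": resource_id,
            "columnName": column,
            "sampleValues": list(values),
        }
        for column, values in samples_by_column.items()
    ]
    return json.dumps(items)


body = build_batch_body(
    "warehouse.raw.orders",
    {
        "credit_card_number": ["4111111111111111", "5500000000000004"],
        "billing_address": ["123 Main St, City, ST 12345"],
    },
)
```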


Query Classifications

Get Classifications for a Resource

GET /api/v1/governance/classifications?resourceId={resourceId}
curl "http://localhost:8080/api/v1/governance/classifications?resourceId=warehouse.raw.customer_profiles" \
  -H "X-Tenant-ID: 550e8400-e29b-41d4-a716-446655440000"

Get by Sensitivity Level

GET /api/v1/governance/classifications/by-level?level=RESTRICTED&page=0&size=20

Get by Data Category

GET /api/v1/governance/classifications/by-category?category=PII
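
The three query URLs can be built with proper parameter encoding. This is a sketch: the function names are hypothetical, and the host is the one used in the curl examples. Paging parameters are added only to the by-level query, since that is the only endpoint shown with them.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8080/api/v1/governance/classifications"


def by_resource_url(resource_id):
    """URL listing the classifications of one resource."""
    return f"{BASE_URL}?{urlencode({'resourceId': resource_id})}"


def by_level_url(level, page=0, size=20):
    """URL listing classifications at one sensitivity level, paged."""
    query = urlencode({"level": level, "page": page, "size": size})
    return f"{BASE_URL}/by-level?{query}"


def by_category_url(category):
    """URL listing classifications in one data category."""
    return f"{BASE_URL}/by-category?{urlencode({'category': category})}"
```

Each request also needs the `X-Tenant-ID` header shown in the curl example.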

Verify Classification

A data steward can confirm that a classification is accurate:

POST /api/v1/governance/classifications/{classificationId}/verify
curl -X POST "http://localhost:8080/api/v1/governance/classifications/cls-002/verify" \
  -H "X-Tenant-ID: 550e8400-e29b-41d4-a716-446655440000" \
  -H "X-User-ID: 550e8400-e29b-41d4-a716-446655440099"

Update Classification

PUT /api/v1/governance/classifications/{classificationId}

Delete Classification

DELETE /api/v1/governance/classifications/{classificationId}
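
The verify, update, and delete calls differ only in HTTP method and path, so a client could route them through one helper. This is a sketch with a hypothetical function name. The PUT body is not documented in this section; the example passes no body by default and leaves its shape to the caller.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080/api/v1/governance/classifications"


def lifecycle_request(action, classification_id, tenant_id, user_id=None, body=None):
    """Build (but do not send) a verify, update, or delete request."""
    routes = {
        "verify": ("POST", f"{BASE_URL}/{classification_id}/verify"),
        "update": ("PUT", f"{BASE_URL}/{classification_id}"),
        "delete": ("DELETE", f"{BASE_URL}/{classification_id}"),
    }
    method, url = routes[action]
    headers = {"X-Tenant-ID": tenant_id}
    if user_id is not None:
        headers["X-User-ID"] = user_id
    data = None
    if body is not None:
        # Body shape is an assumption; this section does not document it.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


verify = lifecycle_request(
    "verify",
    "cls-002",
    tenant_id="550e8400-e29b-41d4-a716-446655440000",
    user_id="550e8400-e29b-41d4-a716-446655440099",
)
delete = lifecycle_request(
    "delete", "cls-002", tenant_id="550e8400-e29b-41d4-a716-446655440000"
)
```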

Classification Summary

Get an overview of classification coverage across the tenant:

GET /api/v1/governance/classifications/summary
curl "http://localhost:8080/api/v1/governance/classifications/summary" \
  -H "X-Tenant-ID: 550e8400-e29b-41d4-a716-446655440000"

Response

{
  "totalClassifications": 1247,
  "byLevel": {
    "PUBLIC": 320,
    "INTERNAL": 567,
    "CONFIDENTIAL": 289,
    "RESTRICTED": 71
  },
  "byCategory": {
    "PII": 145,
    "PHI": 23,
    "PCI": 18,
    "FINANCIAL": 89,
    "BUSINESS": 652,
    "PUBLIC": 320
  },
  "verifiedCount": 890,
  "unverifiedCount": 357,
  "autoClassifiedCount": 1023,
  "manualClassifiedCount": 224,
  "coveragePercent": 78.5
}
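
In the sample response, each breakdown adds up to `totalClassifications`. A client could use that as a sanity check. The sketch below assumes every breakdown partitions the total, which holds for the sample above but is not stated as a guarantee; the function names are hypothetical.

```python
summary = {
    "totalClassifications": 1247,
    "byLevel": {"PUBLIC": 320, "INTERNAL": 567, "CONFIDENTIAL": 289, "RESTRICTED": 71},
    "byCategory": {
        "PII": 145, "PHI": 23, "PCI": 18,
        "FINANCIAL": 89, "BUSINESS": 652, "PUBLIC": 320,
    },
    "verifiedCount": 890,
    "unverifiedCount": 357,
    "autoClassifiedCount": 1023,
    "manualClassifiedCount": 224,
    "coveragePercent": 78.5,
}


def check_summary(data):
    """Report whether each breakdown sums to the total (assumed invariant)."""
    total = data["totalClassifications"]
    return {
        "byLevel": sum(data["byLevel"].values()) == total,
        "byCategory": sum(data["byCategory"].values()) == total,
        "verification": data["verifiedCount"] + data["unverifiedCount"] == total,
        "method": data["autoClassifiedCount"] + data["manualClassifiedCount"] == total,
    }


def verified_percent(data):
    """Share of classifications that a steward has verified, in percent."""
    return round(100 * data["verifiedCount"] / data["totalClassifications"], 1)


checks = check_summary(summary)
```

Note that `coveragePercent` cannot be derived from the other fields in this response, so the sketch does not try to recompute it.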

Source Reference

Component               File
Classification CRUD     GovernanceController.java (classification endpoints)
Auto-classification     ClassificationService.java (autoClassify(), batchAutoClassify())
Classification model    DataClassification.java
Classification service  ClassificationService.java