Data Masking
Data masking protects sensitive information by transforming column values in query results before they are returned to the user. The QueryResultMaskingService applies masking rules based on data classifications from the governance service, ensuring that unauthorized users see obfuscated data while authorized users see the original values.
Masking Process
    Query Results       QueryResultMaskingService       GovernanceServiceClient
          |                         |                              |
          |--- Raw results -------->|                              |
          |                         |--- Get masking rules ------->|
          |                         |<-- Classification rules -----|
          |                         |                              |
          |                         |--- Apply masking per column  |
          |                         |--- Check user exemptions     |
          |                         |                              |
          |<-- Masked results ------|                              |

Masking Types
| Type | Description | Example |
|---|---|---|
| FULL | Replace entire value with mask characters | *** |
| PARTIAL | Mask portion of the value | john.***@example.com |
| HASH | Replace with deterministic hash | a1b2c3d4 |
| REDACT | Remove value entirely | [REDACTED] |
| TOKENIZE | Replace with reversible token | tok_abc123 |
| NULLIFY | Replace with null | null |
| CUSTOM | Apply custom masking function | Defined per classification |
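The masking types in the table can be illustrated with a minimal Python sketch. It is not the QueryResultMaskingService implementation: the names `MaskingType`, `InMemoryTokenVault`, and `mask_value` are hypothetical, as are the partial-masking rule (keep the e-mail local part up to its first dot), the fixed-length `***` mask, the truncated unsalted SHA-256 hash, and the in-memory token store.

```python
import hashlib
import secrets
from enum import Enum


class MaskingType(Enum):
    FULL = "FULL"
    PARTIAL = "PARTIAL"
    HASH = "HASH"
    REDACT = "REDACT"
    TOKENIZE = "TOKENIZE"
    NULLIFY = "NULLIFY"
    CUSTOM = "CUSTOM"


class InMemoryTokenVault:
    """Hypothetical token store; a real vault would be persistent and access-controlled."""

    def __init__(self):
        self._token_by_value = {}
        self._value_by_token = {}

    def tokenize(self, value):
        # Reuse the existing token so the same value always maps to the same token.
        if value not in self._token_by_value:
            token = "tok_" + secrets.token_hex(8)
            self._token_by_value[value] = token
            self._value_by_token[token] = value
        return self._token_by_value[value]

    def detokenize(self, token):
        return self._value_by_token[token]


def _mask_partial(text):
    # Assumed rule: for e-mail addresses keep the local part up to its first dot
    # (or its first character) plus the domain; otherwise keep two characters.
    if "@" in text:
        local, domain = text.split("@", 1)
        dot = local.find(".")
        prefix = local[: dot + 1] if dot != -1 else local[:1]
        return f"{prefix}***@{domain}"
    return text[:2] + "***"


def mask_value(value, masking_type, *, vault=None, custom_fn=None):
    """Return the masked form of a single column value."""
    if value is None:
        return None  # nothing to mask
    text = str(value)
    if masking_type is MaskingType.FULL:
        return "***"  # fixed length, so the original length is not revealed
    if masking_type is MaskingType.PARTIAL:
        return _mask_partial(text)
    if masking_type is MaskingType.HASH:
        # Deterministic: equal inputs always produce equal outputs.
        return hashlib.sha256(text.encode("utf-8")).hexdigest()[:8]
    if masking_type is MaskingType.REDACT:
        return "[REDACTED]"
    if masking_type is MaskingType.TOKENIZE:
        if vault is None:
            raise ValueError("TOKENIZE requires a token vault")
        return vault.tokenize(text)
    if masking_type is MaskingType.NULLIFY:
        return None
    if masking_type is MaskingType.CUSTOM:
        if custom_fn is None:
            raise ValueError("CUSTOM requires a masking function")
        return custom_fn(value)
    raise ValueError(f"unsupported masking type: {masking_type!r}")
```

Under these assumptions, `mask_value("john.doe@example.com", MaskingType.PARTIAL)` returns `john.***@example.com`, matching the example in the table. HASH and TOKENIZE both preserve equality between rows, but only TOKENIZE can be reversed, and only by a caller with access to the vault.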
Masking Rules
Masking rules are derived from data classifications in the governance service:
| Classification Property | Effect |
|---|---|
| requireMasking = true | Column is masked for non-exempt users |
| maskingType | Determines the masking algorithm |
| sensitivityLevel | Higher levels use stronger masking |
| allowedRoles | Users with these roles see unmasked data |
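As a rough illustration of how these properties might translate into a per-column rule, the Python sketch below derives a rule from a classification and checks whether it applies to a user. The `Classification` and `MaskingRule` types and both function names are hypothetical; only the four property names come from the table. The sketch carries `sensitivity_level` through unchanged, because the mapping from level to masking strength is not specified here.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class Classification:
    """Hypothetical shape of a classification returned by the governance service."""
    require_masking: bool
    masking_type: str = "FULL"
    sensitivity_level: int = 1
    allowed_roles: FrozenSet[str] = frozenset()


@dataclass(frozen=True)
class MaskingRule:
    column: str
    masking_type: str
    sensitivity_level: int
    allowed_roles: FrozenSet[str]


def derive_rule(column: str, classification: Classification) -> Optional[MaskingRule]:
    """Return a rule for the column, or None when the classification needs no masking."""
    if not classification.require_masking:
        return None
    return MaskingRule(
        column=column,
        masking_type=classification.masking_type,
        sensitivity_level=classification.sensitivity_level,
        allowed_roles=classification.allowed_roles,
    )


def rule_applies(rule: MaskingRule, user_roles: Iterable[str]) -> bool:
    """A rule applies unless the user holds at least one of the allowed roles."""
    return rule.allowed_roles.isdisjoint(user_roles)
```

For example, a rule derived with `allowed_roles=frozenset({"DATA_ADMIN"})` applies to a user whose only role is `ANALYST` (a placeholder role name) and does not apply to a user holding `DATA_ADMIN`.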
Exemptions
Users may be exempt from masking based on:
| Criteria | Description |
|---|---|
| Role | Users with DATA_ADMIN or governance roles see unmasked data |
| Purpose | Specific access purposes may grant temporary exemption |
| Justification | Time-limited access with documented justification |
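The three exemption criteria could be checked along the following lines. This is a sketch under stated assumptions, not the service's actual logic: `UserContext`, `AccessGrant`, and `is_exempt` are hypothetical names, `GOVERNANCE_ADMIN` stands in for the unnamed governance roles, and the set of exempting purposes is passed in by the caller because it is not listed here.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import FrozenSet, Optional

# DATA_ADMIN comes from the table; GOVERNANCE_ADMIN is a placeholder role name.
EXEMPT_ROLES = frozenset({"DATA_ADMIN", "GOVERNANCE_ADMIN"})


@dataclass(frozen=True)
class AccessGrant:
    """Time-limited access backed by a documented justification."""
    justification: str
    expires_at: datetime


@dataclass(frozen=True)
class UserContext:
    roles: FrozenSet[str]
    purpose: Optional[str] = None
    grant: Optional[AccessGrant] = None


def is_exempt(user: UserContext, exempt_purposes: FrozenSet[str],
              now: Optional[datetime] = None) -> bool:
    """Return True if any one of the role, purpose, or justification criteria holds."""
    now = now or datetime.now(timezone.utc)
    # Role: the user holds at least one exempt role.
    if user.roles & EXEMPT_ROLES:
        return True
    # Purpose: the declared access purpose is one that grants exemption.
    if user.purpose is not None and user.purpose in exempt_purposes:
        return True
    # Justification: a non-empty justification whose grant has not yet expired.
    grant = user.grant
    if grant is not None and grant.justification.strip() and now < grant.expires_at:
        return True
    return False
```

Passing `now` explicitly keeps the expiry check deterministic in tests. Timestamps are assumed to be timezone-aware UTC values.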
Performance Impact
- Masking is applied post-query, so it does not affect query execution time
- The masking service processes results in a streaming fashion to minimize memory overhead
- Classification lookups are cached with a configurable TTL
- Masking adds processing time in proportion to the size of the result set, so large result sets take longer to return
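The streaming and caching points above can be sketched in Python as a generator that masks one row at a time, backed by a small TTL cache for rule lookups. `TTLCache`, `mask_rows`, and the loader callback are hypothetical names. The real service's cache and its interaction with the GovernanceServiceClient may differ.

```python
import time
from typing import Any, Callable, Dict, Iterable, Iterator, Tuple


class TTLCache:
    """Caches loader results per key for ttl_seconds (including None results)."""

    def __init__(self, loader: Callable[[str], Any], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]  # still fresh: no call to the loader
        value = self._loader(key)
        self._entries[key] = (now + self._ttl, value)
        return value


def mask_rows(rows: Iterable[Dict[str, Any]], rule_cache: TTLCache,
              apply_mask: Callable[[Any, Any], Any]) -> Iterator[Dict[str, Any]]:
    """Yield masked rows one at a time, so only one row is held in memory."""
    for row in rows:
        masked = {}
        for column, value in row.items():
            rule = rule_cache.get(column)  # None means the column is not masked
            masked[column] = value if rule is None else apply_mask(value, rule)
        yield masked
```

Because `mask_rows` is a generator, no lookup or masking happens until the caller starts iterating. Total work still grows linearly with the number of rows and columns, while the cache keeps the number of rule lookups bounded by the number of distinct columns per TTL window.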
Configuration
Masking behavior is configured at the classification level through the governance service. The Query Engine does not manage masking rules directly; it consumes them from the governance service via the GovernanceServiceClient.