Quality Scoring
The quality scoring system computes a unified quality score for each dataset by aggregating results from six quality dimensions. Scores are tracked over time, compared against SLA thresholds, and used to gate pipeline execution.
Source: data-plane/data-quality-service/src/scoring/calculator.py
Scoring Dimensions
| Dimension | Default Weight | Calculator |
|---|---|---|
| Completeness | 1.0 | CompletenessCalculator |
| Accuracy | 1.0 | AccuracyCalculator |
| Consistency | 0.8 | ConsistencyCalculator |
| Timeliness | 1.0 | TimelinessCalculator |
| Uniqueness | 0.8 | UniquenessCalculator |
| Validity | 0.9 | ValidityCalculator |
All calculators are defined in data-plane/data-quality-service/src/scoring/dimensions.py.
Score Calculation
The overall score is a weighted average of dimension scores:
overall_score = SUM(dimension_score * weight) / SUM(weight)
Each dimension score ranges from 0.0 (worst) to 1.0 (perfect).
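As an illustration only, the weighted average above and the three dimension formulas given in the subsections that follow can be sketched in Python. This is not the contents of calculator.py: the function names, the handling of empty inputs, and the rejection of a non-positive SLA are assumptions made for the sketch. The weights come from the Scoring Dimensions table and the sample scores from the Score API response shown later on this page.

```python
from typing import Mapping

# Default weights copied from the Scoring Dimensions table.
DEFAULT_WEIGHTS = {
    "completeness": 1.0,
    "accuracy": 1.0,
    "consistency": 0.8,
    "timeliness": 1.0,
    "uniqueness": 0.8,
    "validity": 0.9,
}


def completeness_score(total_nulls: int, row_count: int, critical_column_count: int) -> float:
    """1 - (total_nulls / (row_count * critical_column_count))."""
    cells = row_count * critical_column_count
    if cells == 0:
        return 1.0  # assumption: nothing to check counts as complete
    return 1.0 - total_nulls / cells


def accuracy_score(passing_values: int, total_values: int) -> float:
    """passing_values / total_values."""
    if total_values == 0:
        return 1.0  # assumption: no values means no failures
    return passing_values / total_values


def timeliness_score(age_hours: float, sla_hours: float) -> float:
    """1.0 within the SLA, then a linear decay that reaches 0 at twice the SLA."""
    if sla_hours <= 0:
        raise ValueError("sla_hours must be positive")  # assumption: guard against division by zero
    if age_hours <= sla_hours:
        return 1.0
    return max(0.0, 1.0 - (age_hours - sla_hours) / sla_hours)


def overall_score(
    dimension_scores: Mapping[str, float],
    weights: Mapping[str, float] = DEFAULT_WEIGHTS,
) -> float:
    """SUM(dimension_score * weight) / SUM(weight) over the supplied dimensions."""
    total_weight = sum(weights[name] for name in dimension_scores)
    if total_weight == 0:
        raise ValueError("total weight must be non-zero")
    weighted_sum = sum(score * weights[name] for name, score in dimension_scores.items())
    return weighted_sum / total_weight


# Dimension scores taken from the sample Score API response on this page.
SAMPLE_SCORES = {
    "completeness": 0.99,
    "accuracy": 0.97,
    "consistency": 0.85,
    "timeliness": 1.0,
    "uniqueness": 0.92,
    "validity": 0.88,
}

print(round(overall_score(SAMPLE_SCORES), 2))  # 0.94
```

With the default weights the weighted sum is 5.168 and the total weight is 5.5, which gives roughly 0.94, the overallScore reported in the sample response.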
Completeness Score
Measures the proportion of non-null values across the required (critical) columns:
completeness = 1 - (total_nulls / (row_count * critical_column_count))
Accuracy Score
Measures the proportion of values passing range, pattern, and enum validation rules:
accuracy = passing_values / total_values
Timeliness Score
Measures data freshness against the SLA threshold:
timeliness = 1.0 if (age_hours <= sla_hours) else max(0, 1 - (age_hours - sla_hours) / sla_hours)
Score API
GET /v1/quality/scores?dataset=analytics.sales.transactions
Response:
{
  "dataset": "analytics.sales.transactions",
  "overallScore": 0.94,
  "dimensions": {
    "completeness": {"score": 0.99, "weight": 1.0},
    "accuracy": {"score": 0.97, "weight": 1.0},
    "consistency": {"score": 0.85, "weight": 0.8},
    "timeliness": {"score": 1.0, "weight": 1.0},
    "uniqueness": {"score": 0.92, "weight": 0.8},
    "validity": {"score": 0.88, "weight": 0.9}
  },
  "slaStatus": "COMPLIANT",
  "computedAt": "2026-02-12T06:30:00Z"
}
SLA Compliance
SLA thresholds define minimum acceptable quality scores:
| SLA Level | Minimum Score | Behavior on Breach |
|---|---|---|
| Critical | 0.95 | Block pipeline, alert on-call |
| Standard | 0.80 | Alert dataset owner |
| Relaxed | 0.60 | Log warning only |
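The table above can be read as a lookup from SLA level to a minimum score and a breach behavior. A minimal sketch of that lookup follows. The `SlaLevel` type, the `evaluate_sla` function, and the `BREACHED` status string are hypothetical names chosen for illustration (only `COMPLIANT` appears in the sample response), and the sketch assumes a score exactly equal to the minimum still counts as compliant.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SlaLevel:
    min_score: float
    on_breach: str


# Thresholds and behaviors copied from the SLA table.
SLA_LEVELS = {
    "critical": SlaLevel(0.95, "block pipeline, alert on-call"),
    "standard": SlaLevel(0.80, "alert dataset owner"),
    "relaxed": SlaLevel(0.60, "log warning only"),
}


def evaluate_sla(overall_score: float, level: str) -> Tuple[str, Optional[str]]:
    """Return (status, breach behavior); the behavior is None when compliant."""
    sla = SLA_LEVELS[level.lower()]
    # Assumption: the minimum score is inclusive.
    if overall_score >= sla.min_score:
        return ("COMPLIANT", None)
    return ("BREACHED", sla.on_breach)  # "BREACHED" is a hypothetical status name


print(evaluate_sla(0.94, "critical"))
print(evaluate_sla(0.94, "standard"))
```

Under these assumptions a dataset scoring 0.94 breaches the Critical level but satisfies Standard and Relaxed.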
SLA Configuration
POST /v1/quality/sla
Request:
{
  "dataset": "analytics.sales.transactions",
  "overallMinScore": 0.90,
  "dimensionMinScores": {
    "completeness": 0.95,
    "accuracy": 0.90,
    "timeliness": 0.99
  }
}
Score Trends
Historical scores are stored for trend analysis:
GET /v1/quality/scores/trends?dataset=analytics.sales.transactions&days=30
The trend API returns daily scores and identifies improving or degrading dimensions.
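This page does not say how the service decides that a dimension is improving or degrading. One plausible approach, shown here purely as an assumption, is to fit a least-squares line through the daily scores and look at the sign of its slope. The function names, the `tolerance` parameter, and the `stable` label are hypothetical, and the input is a plain list of daily scores because the trend response format is not documented here.

```python
from statistics import mean
from typing import Sequence


def slope(scores: Sequence[float]) -> float:
    """Least-squares slope of scores against day index 0, 1, 2, ..."""
    n = len(scores)
    if n < 2:
        return 0.0  # a single point has no direction
    x_mean = (n - 1) / 2
    y_mean = mean(scores)
    numerator = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(scores))
    denominator = sum((i - x_mean) ** 2 for i in range(n))
    return numerator / denominator


def classify_trend(scores: Sequence[float], tolerance: float = 0.001) -> str:
    """Label a series of daily scores; tolerance is the per-day change treated as noise."""
    s = slope(scores)
    if s > tolerance:
        return "improving"
    if s < -tolerance:
        return "degrading"
    return "stable"  # hypothetical label for a flat series


print(classify_trend([0.80, 0.85, 0.90, 0.95]))  # improving
print(classify_trend([0.95, 0.90, 0.85]))        # degrading
```

A slope-based rule is less sensitive to a single bad day than comparing only the first and last scores, at the cost of needing a tolerance to be chosen.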
Pipeline Quality Gates
Pipelines integrate quality scores as execution gates:
quality_checks:
  - name: pre_load_quality
    type: quality_gate
    dataset: analytics.sales.transactions
    min_score: 0.90
    dimensions:
      completeness: 0.95
    severity: critical
Related Pages
- Validation Rules -- Rules feeding dimension scores
- Data Profiling -- Profile data for scoring
- Data Observability -- Score dashboards and alerts