Why this matters
Certification and quality badges make trust visible. As a Data Platform Engineer, you enable teams to find reliable datasets quickly, reduce risk, and speed up delivery. Badges encode clear, testable criteria (freshness, test pass rates, documentation, lineage, access controls) so consumers know if a dataset is safe for critical use.
- Support governed self-serve: consumers pick Certified/Gold data with confidence.
- Lower risk: surface contract breaks, late data, or missing ownership before they hit dashboards.
- Operational clarity: standardize what "good" means across domains, with auditable decisions.
Concept explained simply
Think of a dataset certification as a signed stamp: someone accountable verified it meets agreed standards. Quality badges are specific labels that reflect properties (e.g., Freshness: Good, Documentation: Complete, PII: Present, Tests: 98% pass).
Mental model
Use a "driver license + dashboard lights" model: certification is the license to drive in production; badges are the dashboard lights signaling health (green/amber/red) for aspects like tests and freshness.
Typical trust levels
- Bronze: Raw/landing. Minimal guarantees, exploratory only.
- Silver: Cleaned/conformed. Basic tests, documented schema, domain owner assigned.
- Gold (Certified): Business-ready. SLO-backed freshness, strong tests, lineage verified, runbook and approvals in place.
What goes into a badge (objective, testable criteria)
- Ownership: primary owner and on-call contact defined.
- Documentation: business description, field-level docs for top fields, SLA/SLO statement.
- Freshness: data arrives within its SLO (e.g., less than 30 minutes late on 95% of days).
- Quality tests: minimum coverage and pass rate (e.g., required tests for not_null, unique keys, referential integrity; 7-day pass rate ≥ 98%).
- Schema stability: backward-compatible changes only, change log present.
- Lineage: upstream and downstream mapped and visible.
- Access & privacy: PII flagged, access policy applied, approvals enforced.
- Reliability history: no critical incidents in last N days (e.g., 14) or documented mitigations.
Example numeric thresholds
- Bronze: docs present, owner set.
- Silver: freshness ≤ 6h p95; tests ≥ 90% pass; lineage present.
- Gold: freshness ≤ 1h p95; tests ≥ 98% pass; incident-free 14 days; PII policy enforced; runbook.
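The thresholds above can be encoded as a small decision function so two reviewers reach the same answer from the same evidence. A minimal sketch: the `Evidence` record and `trust_level` function are illustrative, using the example cutoffs above, not the API of any particular catalog tool.

```python
from dataclasses import dataclass

@dataclass
class Evidence:
    """Illustrative evidence record gathered by automated checks."""
    owner_set: bool
    docs_complete: bool
    freshness_p95_hours: float
    test_pass_rate_7d: float      # 0.0-1.0
    lineage_mapped: bool
    pii_policy_enforced: bool
    incident_free_days: int
    runbook_present: bool

def trust_level(e: Evidence) -> str:
    """Map evidence to a trust level using the example numeric thresholds."""
    if not (e.owner_set and e.docs_complete):
        return "Uncertified"          # Bronze minimum: docs present, owner set
    if (e.freshness_p95_hours <= 1 and e.test_pass_rate_7d >= 0.98
            and e.lineage_mapped and e.incident_free_days >= 14
            and e.pii_policy_enforced and e.runbook_present):
        return "Gold"
    if (e.freshness_p95_hours <= 6 and e.test_pass_rate_7d >= 0.90
            and e.lineage_mapped):
        return "Silver"
    return "Bronze"
```

Because every branch compares a number or boolean, the same function can run in the approval workflow and in the nightly monitoring job.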
Workflow: from request to badge
- Request: dataset owner submits a certification request with evidence (metrics screenshots, test run links, SLOs).
- Automated checks: platform gathers freshness, test pass rate, schema diff, lineage, doc completeness.
- Human review: data steward/peer reviewer validates business definition, risk, access policy.
- Decision: approve level (Bronze/Silver/Gold) or reject with remediation tasks.
- Publish: badge appears in the catalog with criteria, date, approver, and expiry/review date.
- Monitor: nightly jobs evaluate criteria; auto-downgrade or flag when thresholds fail; notify owners.
- Re-certify: periodic review (e.g., quarterly) or on major changes.
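The workflow above is easiest to audit if each step is an explicit state transition and every published badge carries its approver and review date. A hedged sketch, assuming hypothetical state names and a `publish_badge` helper; the four-eyes rule from the guardrails below is enforced in code:

```python
from datetime import date, timedelta

# Allowed transitions in the certification workflow (illustrative names).
TRANSITIONS = {
    "requested": {"checks_passed", "rejected"},
    "checks_passed": {"approved", "rejected"},
    "approved": {"published"},
    "published": {"downgraded", "expired", "requested"},  # re-certify restarts
}

def publish_badge(dataset: str, level: str, approver: str, owner: str,
                  review_months: int = 3) -> dict:
    """Build the catalog record shown alongside a published badge."""
    if approver == owner and level == "Gold":
        raise ValueError("Four-eyes principle: owner cannot self-approve Gold")
    today = date.today()
    return {
        "dataset": dataset,
        "level": level,
        "approver": approver,
        "certified_on": today.isoformat(),
        # Expiry/review date prevents forgotten certifications.
        "review_by": (today + timedelta(days=30 * review_months)).isoformat(),
    }
```

Storing this record with the dataset metadata gives you the audit trail (request, decision, timestamps) for free.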
Governance guardrails
- Four-eyes principle: owner cannot self-approve Gold.
- Audit trail: store request, evidence, decision, and timestamps.
- Expiry dates: prevent forgotten certifications.
Worked examples
Example 1: Promoting a sales KPI table to Gold
- Dataset: mart_sales.daily_revenue
- Evidence: freshness p95 = 12 min; tests pass 99%; lineage fully mapped; PII: none; incidents: 0 in 30 days; owner and runbook present.
- Decision: Gold (Certified)
- Published badges: Certified, Freshness: Good, Tests: Strong, Lineage: Verified, Docs: Complete.
Example 2: Auto-downgrade after SLO breach
- Dataset: mart_marketing.campaign_costs
- Event: a pipeline failure causes two days of 10+ hour delays; p95 freshness exceeds 6h.
- Action: badge downgraded from Gold to Silver; catalog shows warning and remediation ticket.
- Re-certify: after fix and 14-day stable run, request upgrade back to Gold.
Example 3: Partial badges only
- Dataset: domain.customer360
- Status: tests 95% pass, PII tagged and masked, lineage verified, but field-level docs incomplete.
- Decision: Silver, with badges PII: Governed and Lineage: Verified; the Docs badge shows Incomplete. Not Certified until documentation meets the standard.
Who this is for
- Data Platform Engineers and Analytics Engineers who manage catalogs and pipelines.
- Data Stewards and Product Owners who define trust standards.
Prerequisites
- Basic SQL and data modeling knowledge.
- Familiarity with data pipeline orchestration and testing concepts.
- Understanding of your organization’s data access policies.
Learning path
- Define trust levels and measurable criteria.
- Automate metrics collection (freshness, tests, lineage, docs).
- Design the approval workflow with roles and audit trail.
- Roll out badges incrementally (pilot domain, then scale).
- Monitor, auto-downgrade, and re-certify on schedule.
Checklist: before granting Certified/Gold
- Owner and on-call set
- Business description and field docs complete
- Freshness SLO met for the last 14 days
- Required tests ≥ 98% pass for the last 7 days
- Lineage mapped upstream and downstream
- Schema changes reviewed and logged
- PII flagged and access policies applied
- Runbook with rollback steps attached
- Independent review completed
Exercises
Do these mini tasks to solidify your understanding. They mirror the graded exercises below.
Exercise 1: Draft your certification rubric
Create Bronze/Silver/Gold criteria and the approval workflow. Include numeric thresholds and roles.
- Deliverable: a one-page rubric with thresholds and review steps.
- Tip: keep criteria objective and tool-agnostic.
Exercise 2: Decide the badge from evidence
Given metrics for a dataset, choose the badge and list remediation.
- Evidence: freshness p95=80 min; tests pass=96%; PII=none; docs=complete; incidents: 0 in 10 days.
- Question: Silver or Gold? Why?
Common mistakes and self-check
- Vague criteria: Fix by adding numbers (e.g., "tests pass ≥ 98%", not "good tests").
- One-time certification: Fix by setting expiry and continuous checks.
- Invisible process: Fix by recording decisions and showing badge rationale in the catalog.
- Ignoring PII: Fix by mandating privacy scans before certification.
- No rollback: Fix by defining auto-downgrade rules and notifications.
Self-check prompts
- Can a new analyst understand exactly why a dataset is Certified?
- Would two reviewers make the same decision from your rubric?
- What happens tonight if freshness SLO is missed?
Practical projects
- Project 1: Implement nightly freshness and test-check jobs that update badge statuses in your catalog.
- Project 2: Build a certification request form and an approval checklist (stored with the dataset metadata).
- Project 3: Configure auto-downgrade and owner notifications when criteria fail, plus a weekly summary report.
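For Project 1, the core of a freshness check is computing the p95 arrival delay over a window and comparing it to the SLO. A minimal sketch using only the standard library; the function names and the idea of feeding it per-run delay measurements are illustrative:

```python
import statistics

def p95_delay_minutes(delays_minutes: list[float]) -> float:
    """p95 of observed arrival delays over the evaluation window."""
    # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile.
    return statistics.quantiles(delays_minutes, n=20, method="inclusive")[-1]

def freshness_ok(delays_minutes: list[float], slo_minutes: float) -> bool:
    """True when the window's p95 delay meets the freshness SLO."""
    return p95_delay_minutes(delays_minutes) <= slo_minutes
```

A nightly job would collect the last N runs' delays, call `freshness_ok`, and update the badge (or trigger a downgrade) accordingly.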
Next steps
- Pilot in one domain, gather feedback, and refine thresholds.
- Publish your rubric and examples organization-wide.
- Schedule quarterly re-certification and add it to your on-call/runbook.
Mini challenge
Your Gold dataset shows tests pass=97% for the last 7 days due to two transient nulls on a key column. Do you auto-downgrade or grant a temporary exception? Decide, justify using your rubric, and write the catalog note you would post.
Ready for the Quick Test?
Take the Quick Test below. Everyone can take it for free; log in to save your progress.