Why this matters
BI lives or dies on trust. Business users expect dashboards to be correct, complete, and fresh. As a BI Developer, you set crystal-clear data quality expectations so teams can rely on insights to make decisions, forecast, and take action.
Real tasks you will do
- Define acceptance criteria for key dashboards (e.g., daily sales must be loaded by 8:00 AM).
- Monitor freshness and row-count drift to catch missing data early.
- Set thresholds for nulls, duplicates, and invalid values in core dimensions.
- Align with data producers on data contracts and schema changes.
- Communicate incidents, impacts, and ETAs for fixes.
Concept explained simply
Data quality expectations are the explicit rules that define when data is good enough to use. They set target levels, acceptable ranges, and alert conditions for the most important datasets and dashboards.
Mental model
Think of data quality like a service: you measure it with SLIs (what you observe), commit with SLOs (your target), and promise with SLAs (formal agreements). Your job is to define, measure, and communicate these clearly.
- SLI (Service Level Indicator): The metric you track (e.g., % of orders with valid customer_id).
- SLO (Service Level Objective): The target you commit to (e.g., ≥ 99.5% valid).
- SLA (Service Level Agreement): The formal promise to stakeholders, including what happens when it is missed.
- Alert: The trigger when performance drops (e.g., data still not loaded 3 hours after the 8:00 AM deadline).
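The SLI/SLO/alert relationship can be sketched in a few lines of code. This is illustrative Python, not any specific tool's API; the `Slo` class and metric name are invented for the sketch.

```python
from dataclasses import dataclass

@dataclass
class Slo:
    """A target for one SLI; names and thresholds here are examples only."""
    name: str
    target: float              # e.g. 0.995 means "at least 99.5% must pass"
    higher_is_better: bool = True

def evaluate(slo: Slo, sli_value: float) -> bool:
    """Return True when the observed SLI meets the SLO target."""
    if slo.higher_is_better:
        return sli_value >= slo.target
    return sli_value <= slo.target

valid_customer_id = Slo("pct_valid_customer_id", target=0.995)
print(evaluate(valid_customer_id, 0.997))  # True: meets the >= 99.5% objective
print(evaluate(valid_customer_id, 0.990))  # False: would trigger an alert
```

An alert is then just "evaluate returned False under the agreed trigger conditions", which the Monitoring section below refines.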
Core expectations and metrics
Use simple, measurable dimensions:
- Accuracy: Values match reality (e.g., revenue equals invoiced amounts).
- Completeness: Required fields are filled; no missing rows.
- Consistency: Same definition across tables and time.
- Validity: Values meet business rules and data types.
- Uniqueness: No duplicate keys where uniqueness is expected (e.g., one row per order_id).
- Freshness: Data arrives on time; low lag vs. source.
Practical SLI/SLO templates you can reuse
- Freshness SLI: Data lag (minutes) for fact_sales. SLO: ≤ 60 min by 8:00 AM local, 7 days/week.
- Completeness SLI: % non-null customer_id in fact_sales. SLO: ≥ 99.8%.
- Validity SLI: % orders with order_date ≤ delivered_date. SLO: ≥ 99.9%.
- Uniqueness SLI: Duplicate rate on order_id. SLO: ≤ 0.05%.
- Consistency SLI: Revenue reconciliation delta vs. finance ledger. SLO: ≤ 0.3% daily.
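These templates translate directly into SQL checks. Below is a minimal sketch using Python's built-in sqlite3 to compute the completeness and uniqueness SLIs against a toy fact_sales sample; the table and column names follow the templates above, and the data is invented.

```python
import sqlite3

# In-memory sample of a hypothetical fact_sales table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE fact_sales (order_id TEXT, customer_id TEXT)")
con.executemany("INSERT INTO fact_sales VALUES (?, ?)", [
    ("o1", "c1"), ("o2", "c2"), ("o3", None), ("o3", "c3"),  # one null, one duplicate key
])

# Completeness SLI: % non-null customer_id.
pct_non_null, = con.execute(
    "SELECT 100.0 * SUM(customer_id IS NOT NULL) / COUNT(*) FROM fact_sales"
).fetchone()

# Uniqueness SLI: duplicate rate on order_id.
dup_rate, = con.execute("""
    SELECT 100.0 * (COUNT(*) - COUNT(DISTINCT order_id)) / COUNT(*)
    FROM fact_sales
""").fetchone()

print(pct_non_null)  # 75.0 -> breaches a >= 99.8% SLO on this tiny sample
print(dup_rate)      # 25.0 -> breaches a <= 0.05% SLO
```

The same two queries run unchanged on most warehouses; only the connection layer differs.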
Worked examples
Example 1: Daily Sales Dashboard
- Expectation: Dashboard updated by 8:00 AM; sales figures within ±0.5% of point-of-sale totals.
- Checks: Row count vs. last 7 days (±20% tolerance), % null customer_id ≤ 0.2%, data lag ≤ 60 min.
- Alert: If lag > 60 min at 8:30 AM or nulls > 0.2%, page banner shows "Data delayed" with ETA.
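The row-count drift check in Example 1 reduces to a comparison against a 7-day baseline. A minimal sketch, with invented sample counts:

```python
def within_drift(todays_rows: int, last_7_days: list, tolerance: float = 0.20) -> bool:
    """Compare today's row count to the 7-day average with a +/- tolerance band."""
    baseline = sum(last_7_days) / len(last_7_days)
    return abs(todays_rows - baseline) <= tolerance * baseline

history = [1000, 980, 1020, 990, 1010, 1005, 995]  # baseline average = 1000
print(within_drift(1150, history))  # True: within the +/-20% band
print(within_drift(700, history))   # False: 30% below baseline, show "Data delayed"
```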
Example 2: Monthly Recurring Revenue (MRR)
- Expectation: MRR matches billing ledger within 0.3% at month-end close.
- Checks: Reconciliation query on MRR; uniqueness on subscription_id; active status validity window.
- Alert: If delta > 0.3%, pause the report distribution and open an incident with finance.
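The reconciliation behind this example is a relative delta against the ledger. A sketch with invented totals:

```python
def reconciliation_delta(bi_total: float, ledger_total: float) -> float:
    """Relative delta between a BI metric and its finance-ledger counterpart."""
    return abs(bi_total - ledger_total) / ledger_total

delta = reconciliation_delta(bi_total=100_250.0, ledger_total=100_000.0)
print(round(delta, 4))    # 0.0025, i.e. 0.25%
print(delta <= 0.003)     # True: within the 0.3% SLO, no incident needed
```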
Example 3: Inventory Stockouts
- Expectation: SKU-level inventory freshness ≤ 15 minutes; negative inventory count = 0.
- Checks: Freshness SLI by warehouse; validity rule inventory_count ≥ 0; drift check on restock events.
- Alert: If freshness > 15 minutes for any warehouse, send notification to ops channel.
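A per-warehouse freshness check like this one only needs the latest load timestamp per warehouse. The warehouse names and times below are made up for the sketch:

```python
from datetime import datetime, timedelta

def stale_warehouses(last_loaded: dict, now: datetime,
                     max_lag: timedelta = timedelta(minutes=15)) -> list:
    """Return warehouses whose latest inventory load exceeds the freshness SLO."""
    return [wh for wh, ts in last_loaded.items() if now - ts > max_lag]

now = datetime(2024, 1, 1, 12, 0)
loads = {
    "east": datetime(2024, 1, 1, 11, 50),  # 10 min lag: fresh
    "west": datetime(2024, 1, 1, 11, 30),  # 30 min lag: stale, notify ops
}
print(stale_warehouses(loads, now))  # ['west']
```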
How to set expectations (step-by-step)
- List critical dashboards and the questions they answer.
- Identify the few metrics that must be right (SLIs).
- Set SLO targets and tolerances with business owners.
- Define checks at each layer: source, pipeline, transform, BI.
- Choose alert triggers and who gets notified.
- Document expectations in a one-pager everyone can read.
- Review quarterly; tighten or adjust thresholds.
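The steps above produce a one-pager that can also live as structured data, so checks and documentation stay in sync. A hypothetical example; every field name and value here is illustrative:

```python
# A minimal, machine-readable expectation "one-pager" (all values invented).
expectation = {
    "dataset": "fact_sales",
    "owner": "bi-team@example.com",
    "slis": [
        {"name": "freshness_lag_minutes",     "slo": "<= 60",   "layer": "pipeline"},
        {"name": "pct_non_null_customer_id",  "slo": ">= 99.8", "layer": "transform"},
        {"name": "duplicate_rate_order_id",   "slo": "<= 0.05", "layer": "BI"},
    ],
    "alert": {"channel": "#data-quality", "trigger": "2 consecutive failures"},
    "review": "quarterly",
}

# Sanity check: the doc names an owner, targets, alerting, and a review cadence.
required = {"dataset", "owner", "slis", "alert", "review"}
print(required.issubset(expectation))  # True
```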
Checklist: Is your expectation ready?
- SLI is measurable and unambiguous.
- SLO has a numeric target and timeframe.
- Owner is named for the dataset and the alert.
- Incident path is clear: detect → triage → communicate → fix → learn.
- Reconciliation query exists for money-critical metrics.
Monitoring and alerts
Balance sensitivity and noise. Too many alerts cause alert fatigue; too few cause surprises.
- Static thresholds for hard rules (e.g., duplicates must be 0 for primary keys).
- Adaptive thresholds for volume-based checks (e.g., row counts ±30% vs. 14-day median).
- Aggregate alerts: require 2 consecutive failures before paging.
- Visualization: add small status badges near dashboards (Fresh, Delayed, Under Review).
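The adaptive-threshold and consecutive-failure rules above can be sketched together; the sample counts are invented:

```python
from statistics import median

def row_count_ok(today: int, history_14d: list, band: float = 0.30) -> bool:
    """Adaptive threshold: today's volume must sit within +/-band of the 14-day median."""
    base = median(history_14d)
    return abs(today - base) <= band * base

def should_page(recent_results: list) -> bool:
    """Aggregate alerts: page only after 2 consecutive failed checks."""
    return len(recent_results) >= 2 and not recent_results[-1] and not recent_results[-2]

history = [100, 102, 98, 101, 99, 103, 97, 100, 104, 96, 101, 99, 102, 98]
print(row_count_ok(60, history))          # False: 40% below the median of 100
print(should_page([True, False]))         # False: one failure, wait for confirmation
print(should_page([True, False, False]))  # True: two consecutive failures, page the owner
```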
Incident template
- Summary: What’s wrong and who’s impacted.
- Scope: Affected dashboards and dates.
- Cause: Source, pipeline, transform, or BI layer.
- ETA: Next update time.
- Action: Workaround for users, or a hold notice asking them to wait.
- Follow-up: Prevention steps and owner.
Common mistakes and how to self-check
- Too many checks, no priorities → Fix: Protect the top 10 datasets first.
- Vague rules ("should be fresh") → Fix: Define exact times and numeric targets.
- Alerting everyone → Fix: Name specific owners per dataset.
- No business reconciliation → Fix: Compare to a trusted ledger or golden dataset.
- Never revisiting thresholds → Fix: Review quarterly and after incidents.
Self-check mini audit
- Can a new team member find the expectation doc in 1 minute?
- Do 3 critical dashboards show status badges?
- Did the last incident have a clear root cause and prevention item?
Exercises
Exercise 1: Define expectations for a Marketing Leads dashboard
You have a dashboard showing daily leads by channel. The sales team complains that counts vary widely day-to-day and sometimes arrive late.
- Choose 3 SLIs (freshness, completeness, validity/consistency).
- Set numeric SLO targets and acceptable ranges.
- Define alert conditions and ownership.
Hints
- Freshness relates to when marketing automation data lands.
- Completeness looks at nulls in lead_source or email.
- Validity uses basic rules (e.g., email format).
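For the validity hint, a basic email-format rule might look like the sketch below. The regex is deliberately simple and not a full email validator; real rules depend on your marketing-automation tool.

```python
import re

# A minimal format rule: something@something.something, no spaces or extra @.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def pct_valid_emails(emails: list) -> float:
    """Validity SLI: share of leads whose email passes the basic format rule."""
    valid = sum(1 for e in emails if EMAIL_RE.match(e))
    return 100.0 * valid / len(emails)

leads = ["a@b.com", "no-at-sign", "x@y.org", "bad@@z.com"]  # invented sample
print(pct_valid_emails(leads))  # 50.0
```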
Checklist
- Each SLI is measurable.
- Each SLO has a numeric target and time window.
- Alert routes to a named owner.
- At least one reconciliation or sanity check exists.
Practical projects
- Quality One-Pagers: Create a one-page expectation doc for 3 top dashboards (SLIs, SLOs, owners, alerts).
- Freshness Monitor: Build a freshness status tile on your BI homepage using data lag metrics.
- Revenue Reconciliation: Implement a daily reconciliation between BI revenue and finance ledger with a delta badge.
Who this is for
- BI Developers and Analysts who ship dashboards to stakeholders.
- Analytics Engineers aligning pipelines with business usage.
- Team leads who need reliable metrics to make decisions.
Prerequisites
- Basic SQL (select, join, group by).
- Familiarity with your BI tool’s data sources and refresh mechanism.
- Understanding of core business metrics (sales, MRR, inventory).
Learning path
- Start: Data quality dimensions and SLIs/SLOs.
- Next: Layered checks (source, pipeline, transform, BI).
- Then: Alerting, incidents, and communication.
- Finally: Reconciliation and quarterly reviews.
Next steps
- Pick one critical dashboard and define 3 SLIs with targets today.
- Add a visible status badge to the dashboard.
- Schedule a 30-minute review with the business owner to confirm thresholds.
Mini challenge
In 10 minutes, write an expectation statement for a KPI your team uses daily. Include 1 freshness SLI, 1 completeness SLI, 1 validity SLI, and alert rules. Keep it to 5 sentences.