Why this matters
Stakeholders expect the same metric to match across dashboards, finance reports, and ad-hoc analyses, and as a BI Analyst you are the one accountable for making it match. Inconsistency erodes trust and wastes time. You will often be asked to explain why yesterday's revenue in the Executive Dashboard differs from the Finance P&L, or why customer counts do not match between Marketing and Product reports. Mastering consistency means faster decisions and fewer escalations.
- Real task: reconcile KPI mismatches across dashboards before a leadership meeting.
- Real task: define and document a single source-of-truth for key metrics.
- Real task: prevent double counting from joins or pre-aggregations.
Concept explained simply
Data consistency across reports means the same metric yields the same value for the same business question, timeframe, and filters—no matter which report you open.
Mental model: the Metric Contract
Think of every metric as having a contract. If two reports obey the same contract, they match.
- Definition: exact formula and inclusions/exclusions.
- Scope: source tables/views and data freshness.
- Time: timezone, calendar (fiscal vs calendar), time grain.
- Filters: default filters, segments, status flags.
- Dimensions: join keys, slowly changing dimensions behavior.
Contract match = consistent numbers.
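For instance, a contract for a net revenue metric can be written as a comment header on a shared view, so the definition and the logic live together. A sketch, with illustrative table and column names:
-- Metric contract: net_revenue (illustrative; names are hypothetical)
--   Definition: SUM(net_amount_usd), net of refunds, taxes, and fees
--   Scope:      finance.invoices, refreshed daily by 06:00 UTC
--   Time:       invoice posted_at, UTC, calendar months
--   Filters:    invoice_status = 'posted'; marketplace excluded
--   Dimensions: billing_account_id, product_line, region
CREATE VIEW metrics.net_revenue AS
SELECT DATE_TRUNC('month', posted_at) AS month,
       SUM(net_amount_usd) AS net_revenue
FROM finance.invoices
WHERE invoice_status = 'posted'
  AND is_marketplace = false
GROUP BY 1;
Embedding the contract next to the logic keeps definition and implementation from drifting apart.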
Common causes of inconsistencies
- Different formulas (gross vs net, handling of refunds, cancellations, taxes, fees).
- Different sources (orders vs invoices; live vs snapshot; pre-aggregated marts).
- Join duplications (many-to-many joins; non-unique dimension keys).
- Different filters (status IN/OUT, cohorts, geos, product lines).
- Time differences (UTC vs local; fiscal vs calendar; month-end boundary).
- Time grain mismatch (one report aggregates at daily grain, the other at monthly, before comparing).
- Currency conversion timing (rate date mismatch or missing FX table join).
- Late-arriving data or partial refresh windows.
- Slowly changing dimensions (current vs as-of attributes).
Worked examples
Example 1 — Double-counted revenue after joining promotions
Symptom: Dashboard A shows revenue 1.26M; Dashboard B shows 1.20M for the same month.
Investigation:
- Dashboard A joins orders to promotions on order_id; promotions table has multiple rows per order when multiple promos applied.
- SUM(o.amount) after the resulting one-to-many join duplicates order amounts.
Fix options:
- Aggregate promotions to one row per order before join.
- Join using a distinct subquery of order_ids.
- Compute revenue purely from orders, and join promos only for promo reporting.
For instance, option 2 as SQL, joining through a distinct list of order_ids:
WITH promo_one AS (
  -- Collapse promotions to at most one row per order so the join cannot fan out
  SELECT order_id
  FROM promotions
  GROUP BY order_id
)
SELECT SUM(o.amount) AS revenue
FROM orders o
LEFT JOIN promo_one p
  ON p.order_id = o.order_id;
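To confirm the fan-out before fixing it, a quick key-uniqueness check on the join key (same table names as above):
-- Orders with more than one promotion row are the ones causing duplication
SELECT order_id, COUNT(*) AS promo_rows
FROM promotions
GROUP BY order_id
HAVING COUNT(*) > 1
ORDER BY promo_rows DESC
LIMIT 20;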
Example 2 — Timezone and month boundaries
Symptom: Monthly revenue differs by ~2%.
Investigation:
- Report A groups by order_created_at in UTC.
- Report B uses local time (America/Los_Angeles). End-of-month orders differ by timezone cutoff.
Fix:
- Standardize reporting timezone (e.g., UTC) and document.
- If business requires local time, convert consistently before grouping.
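A minimal sketch of converting consistently before grouping, assuming created_at is stored as a UTC timestamp in a Postgres-style warehouse:
-- Convert once, then group; every report uses the same expression
SELECT DATE_TRUNC('month', created_at AT TIME ZONE 'America/Los_Angeles') AS m,
       SUM(amount) AS revenue
FROM orders
GROUP BY 1;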
Example 3 — Filter defaults not aligned
Symptom: Active customers differ by ~8%.
Investigation: One report excludes trial users; another includes them by default.
Fix: Agree on whether trials count as customers in the metric contract. Apply the same filter default everywhere and surface it in the report header.
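One way to enforce the agreed default is a shared view that bakes the filter in; a sketch, with hypothetical flag columns:
-- Single place where "customer" is defined; all reports select from this view
CREATE VIEW analytics.customers_reportable AS
SELECT *
FROM customers
WHERE is_trial = false          -- per metric contract: trials are not customers
  AND is_test_account = false;  -- test accounts excluded everywhere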
Example 4 — Snapshot vs live tables
Symptom: End-of-month subscriptions differ.
Investigation: Finance uses end-of-month snapshots; Product uses live status queried later (some churn already processed).
Fix: For point-in-time metrics, both reports must use the same snapshot date (as-of). Document the as-of rule.
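A sketch of the as-of pattern, assuming a daily subscription_snapshot table like the one used in Exercise 2 below:
-- Point-in-time count: both reports pin the same as-of date
SELECT COUNT(*) AS active_subscriptions
FROM subscription_snapshot
WHERE as_of_ts = DATE '2024-06-30'  -- the agreed month-end as-of
  AND subscription_status = 'active';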
Practical checklist
- [ ] Does every report use the same metric definition and formula?
- [ ] Are source tables and data freshness identical or explicitly aligned?
- [ ] Are timezone and calendar rules consistent?
- [ ] Is time grain (daily/weekly/monthly) aligned before aggregation?
- [ ] Are default filters the same and visible?
- [ ] Do joins preserve row-level uniqueness? Check keys.
- [ ] Are currency conversions using the same rate date and table? (See the FX sketch after this checklist.)
- [ ] For point-in-time metrics, is there an as-of snapshot rule?
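For the currency item, a sketch of converting at the transaction-date rate, assuming a hypothetical daily fx_rates table:
-- Same rate date and same FX table in every report
SELECT SUM(o.amount * fx.rate_to_usd) AS revenue_usd
FROM orders o
JOIN fx_rates fx
  ON fx.currency = o.currency
 AND fx.rate_date = CAST(o.created_at AS date);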
How to self-check
- Run a small-scope reconciliation: pick one week and one segment. Numbers should match there first.
- Trace lineage: list each transformation from source to report for both reports. Compare step-by-step.
- Create a "side-by-side" query that computes both methods in the same SQL to isolate differences fast.
Template: side-by-side reconciliation SQL
WITH a AS (
  -- Method A
  SELECT DATE_TRUNC('day', created_at) AS d, SUM(amount) AS revenue
  FROM orders
  WHERE status IN ('paid', 'shipped')
  GROUP BY 1
),
b AS (
  -- Method B
  SELECT DATE_TRUNC('day', posted_at) AS d, SUM(net_amount_usd) AS revenue
  FROM invoices
  WHERE invoice_status = 'posted'
  GROUP BY 1
)
SELECT COALESCE(a.d, b.d) AS d,
       a.revenue AS rev_a,
       b.revenue AS rev_b,
       COALESCE(a.revenue, 0) - COALESCE(b.revenue, 0) AS diff
FROM a
FULL OUTER JOIN b USING (d)
ORDER BY d;
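Reading the output: a non-zero diff pinpoints the days where the methods diverge, and a NULL on one side means that day exists in only one method, which the FULL OUTER JOIN deliberately surfaces. Start the investigation with the largest absolute diff.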
Exercises
Do these hands-on tasks. Then compare with the provided solutions.
Exercise 1 — Reconciling revenue across Orders vs Invoices
You have last month's numbers:
- Sales Dashboard (Report A): 1,200,000 using orders.amount on order_created_at (UTC), status IN ('paid','shipped'). Refunds not subtracted.
- Finance P&L (Report B): 1,150,000 using invoices.net_amount_usd on invoice_posted_at (local time), invoice_status = 'posted', excludes marketplace orders, uses daily FX.
- Operational facts: 200 refunded orders totaling 50,000; 120 late invoices posted on the 1st of this month totaling 30,000; marketplace orders last month totaled 15,000.
Task: Identify likely causes for the 50,000 difference and propose a reconciliation plan. Write:
- A numbered list of root causes.
- A short plan to align the reports.
Hints
- Think gross vs net.
- Check month boundary by timezone.
- Consider exclusions like marketplace.
- Late-arriving postings move value across months.
Solution
Likely causes:
- Refund handling: Report A is gross; Report B is net. Refunds (50,000) explain a large part.
- Late invoices (30,000) posted on the 1st are in Report B next month, not last month.
- Marketplace exclusion (15,000) only in Report B.
- Timezone boundary could shift a few orders between months.
Plan:
- Define revenue as net of refunds and fees for both reports.
- Use the same time basis, either invoice_posted_at (revenue recognition) or order_created_at, consistently in both reports; document the choice.
- Apply the same inclusion rule for marketplace (include or exclude in both).
- Standardize timezone (UTC) and, if needed, align to fiscal calendar.
- Publish a metric contract and update both models. Provide a migration note for users.
-- Example alignment using invoices (net, posted, exclude marketplace)
SELECT DATE_TRUNC('month', posted_at AT TIME ZONE 'UTC') AS m,
       SUM(net_amount_usd) AS revenue
FROM invoices
WHERE invoice_status = 'posted'
  AND is_marketplace = false
GROUP BY 1;
Exercise 2 — Active customers definition alignment
Two reports:
- Report A (Product): Active customers = users with an order in the last 90 days. Timezone: UTC. Excludes test accounts.
- Report B (Growth): Active customers = users with subscription_status='active' as of month-end snapshot. Timezone: local. Corporate accounts can have multiple child users.
- Numbers for last month-end: A=12,400; B=11,950.
Task: List the top 3 drivers of mismatch and propose a single standard definition for "Active customers" and how to implement it. Include:
- Your chosen definition
- Time basis (snapshot vs rolling)
- Deduplication or hierarchy handling
Hints
- Rolling window vs as-of snapshot are different questions.
- Corporate hierarchies can inflate counts.
- Timezone consistency matters at month-end.
Solution
Drivers:
- Different logic: rolling activity (A) vs end-of-month status (B).
- Hierarchy: corporate accounts cause duplicates (count users vs billing account).
- Timezone cutoff differences.
Standard definition (example): Active customer = unique billing_account_id with subscription_status='active' as-of the last calendar day of the month at 23:59:59 UTC.
Implementation:
-- Use the month-end subscription snapshot and count distinct billing accounts
SELECT month_end,
       COUNT(DISTINCT billing_account_id) AS active_customers
FROM subscription_snapshot
WHERE as_of_ts = month_end
  AND subscription_status = 'active'
GROUP BY 1;
Notes:
- Document as-of rule and timezone (UTC).
- If a rolling activity metric is also needed, define it separately with a different name.
Common mistakes
- Assuming matching names mean matching definitions. Always verify the contract.
- Aggregating after a many-to-many join. Ensure uniqueness first.
- Ignoring calendar and timezone differences when comparing months.
- Mixing gross and net values.
- Using different currency rate dates across reports.
- Comparing live tables to snapshots without an as-of rule.
Mini challenge
Pick one key metric in your org. In a single paragraph, write its Metric Contract: definition, source, time, filters, and dimensions. Then, list two reports that must use it and note what you need to change for alignment. Keep it short but precise.
Who this is for
- BI Analysts ensuring trusted dashboards.
- Analytics Engineers building semantic layers.
- Data Scientists validating experiment metrics.
Prerequisites
- Comfort with SQL joins and aggregations.
- Basic understanding of data models (facts/dimensions).
- Familiarity with reporting tools and filters.
Learning path
- Master metric contracts and documentation.
- Learn reconciliation techniques (side-by-side SQL, sampling).
- Standardize time and currency logic.
- Harden joins and uniqueness guarantees.
- Automate consistency checks in your BI layer.
Practical projects
- Create a metric contract doc for the top 5 KPIs and embed it in dashboard headers.
- Build a reconciliation query that compares two revenue pipelines daily and flags variance > 1% (see the sketch after this list).
- Add a semantic layer view that enforces timezone and calendar consistently.
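For the second project, a hedged sketch of the daily variance flag, reusing the side-by-side template's CTEs (the 1% threshold and the pipeline queries are examples to adapt):
WITH a AS (
  -- Method A: orders pipeline
  SELECT DATE_TRUNC('day', created_at) AS d, SUM(amount) AS revenue
  FROM orders
  WHERE status IN ('paid', 'shipped')
  GROUP BY 1
),
b AS (
  -- Method B: invoices pipeline
  SELECT DATE_TRUNC('day', posted_at) AS d, SUM(net_amount_usd) AS revenue
  FROM invoices
  WHERE invoice_status = 'posted'
  GROUP BY 1
)
SELECT COALESCE(a.d, b.d) AS d,
       a.revenue AS rev_a,
       b.revenue AS rev_b
FROM a
FULL OUTER JOIN b USING (d)
-- Keep only days where the gap exceeds 1% of the larger pipeline's value
WHERE ABS(COALESCE(a.revenue, 0) - COALESCE(b.revenue, 0))
      > 0.01 * NULLIF(GREATEST(COALESCE(a.revenue, 0), COALESCE(b.revenue, 0)), 0)
ORDER BY d;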
Next steps
- Adopt a naming convention for metrics and snapshots.
- Schedule periodic variance checks.
- Educate stakeholders on definitions via short notes in dashboards.