Who this is for
This lesson is for Data Architects, Analytics Engineers, and BI Developers who need consistent, trusted metrics across dashboards, models, and teams.
Prerequisites
- Basics of dimensional modeling (facts, dimensions, grain)
- Comfort with SQL aggregations and joins
- Familiarity with semantic layers or data marts (nice to have)
Why this matters
- Leadership decisions rely on everyone seeing the same number. Inconsistent definitions break trust and waste meeting time on reconciling figures.
- Data Architects must encode metric rules (grain, filters, formulas) into models so every tool computes the same result.
- Consistent metrics enable reliable experimentation, forecasting, and regulatory reporting.
Concept explained simply
Metric consistency by design means you make it hard to be inconsistent. You define metrics once, at a clear grain, with fixed filters and formulas, and you expose them from a single, governed place (a view, a semantic model, or a data mart). Dashboards should reuse the metric, not re-implement it.
- Grain: the level at which the metric is computed (e.g., order item, day, user-session).
- Filters: standardized inclusions/exclusions (e.g., only completed orders, exclude internal users).
- Formula: precise calculation steps (e.g., net revenue = gross - discounts - refunds; exclude tax and shipping).
- Conformed dimensions: shared dimensions (date, product, customer) with consistent keys and attributes.
Quick example
If you define Net Revenue at order-item grain, restricted to Completed/Shipped orders, with currency converted to USD at the order-date rate, then every dashboard that queries that definition will match.
Mental model
Think of a metric as a triangle of Definition, Grain, and Filters. If any corner changes, the number changes. Lock all three in one place.
- Definition: documented formula and business meaning.
- Grain: lowest atomic level needed for robust roll-ups.
- Filters: default constraints applied centrally, not in dashboards.
Three-layer guardrails
- Data layer: atomic facts + conformed dimensions.
- Semantic layer: reusable metric definitions.
- Consumption: dashboards use metrics, not raw tables.
Design patterns that enforce consistency
- Conformed dimensions across marts (date, customer, product, currency).
- Atomic fact tables; derive metrics in governed views/semantic models.
- Snapshots for balances (inventory, ARR); accumulating snapshots for funnel stages.
- Default filters encoded in metric views (e.g., exclude test/internal users).
- Role-playing date dimensions (order date vs ship date) to avoid ambiguity.
- Surrogate keys and SCD rules defined once (SCD2 for history where needed).
- Currency normalization via a currency rate dimension keyed by date.
- Metric versioning (Revenue v1, Revenue v2) with deprecation windows.
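To make the currency-normalization pattern concrete, here is a minimal sketch in Python. The rate table, function name, and rates are invented for illustration; the point is that conversion is keyed by the transaction date, so the same order always converts the same way.

```python
from datetime import date

# Illustrative stand-in for a currency rate dimension keyed by date:
# (rate_date, from_ccy) -> rate_to_usd. Rates here are invented.
rates = {
    (date(2024, 1, 10), "EUR"): 1.10,
    (date(2024, 1, 11), "EUR"): 1.08,
}

def to_usd(amount, ccy, order_date):
    """Convert using the rate at the order date, never 'today's' rate."""
    if ccy == "USD":
        return amount
    return amount * rates[(order_date, ccy)]

# The conversion is deterministic regardless of when the query runs.
print(round(to_usd(100.0, "EUR", date(2024, 1, 10)), 2))  # 110.0
```

Because the rate lookup is part of the metric definition rather than a dashboard-side choice, two tools querying the same order can never disagree on its USD value.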
Advanced guardrails
- Semi-additive handling: sums across entities, last-known across time.
- Attribution rules centralized (e.g., first-touch vs last-touch).
- Late-arriving data reprocessing windows documented and automated.
Worked examples
Example 1: Net Revenue (multi-currency, refunds, tax)
Definition: Net Revenue (USD) = sum over order items of (unit_price * qty - item_discount - item_refund) converted to USD at order date. Exclude tax, shipping; include only Completed/Shipped.
Grain: order_item_id
-- Pseudo-SQL view: v_metric_net_revenue
SELECT oi.order_date, d_product.product_id,
       SUM((oi.unit_price * oi.qty - oi.item_discount - oi.item_refund)
           * fx.rate_to_usd) AS net_revenue_usd
FROM f_order_item oi
JOIN d_order_status s ON oi.status_id = s.status_id AND s.final = true
JOIN d_currency_rate fx ON fx.from_ccy = oi.ccy AND fx.to_ccy = 'USD'
                       AND fx.rate_date = oi.order_date
JOIN d_product ON oi.product_id = d_product.product_id
WHERE oi.is_tax = false AND oi.is_shipping = false
GROUP BY 1, 2;
Because grain, filters, and currency logic live in the view, every dashboard matches.
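The same logic can be exercised end to end with an in-memory database. This sketch uses Python's sqlite3 with a flattened status flag (status_final) in place of the status-dimension join; all table contents are invented.

```python
import sqlite3

# Tiny in-memory demo of the governed net-revenue logic above.
# Schema is simplified (status flattened to a flag); data is invented.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE f_order_item (order_date TEXT, product_id INT, ccy TEXT,
    unit_price REAL, qty INT, item_discount REAL, item_refund REAL,
    status_final INT, is_tax INT, is_shipping INT);
CREATE TABLE d_currency_rate (rate_date TEXT, from_ccy TEXT, rate_to_usd REAL);
INSERT INTO f_order_item VALUES
    ('2024-01-10', 1, 'USD', 20.0, 2, 0.0, 0.0, 1, 0, 0),  -- counts: 40.00 USD
    ('2024-01-10', 1, 'EUR', 10.0, 1, 1.0, 0.0, 1, 0, 0),  -- counts: 9 EUR * 1.10
    ('2024-01-10', 1, 'USD', 99.0, 1, 0.0, 0.0, 0, 0, 0);  -- excluded: not final
INSERT INTO d_currency_rate VALUES
    ('2024-01-10', 'USD', 1.0), ('2024-01-10', 'EUR', 1.10);
""")
row = con.execute("""
    SELECT oi.order_date, oi.product_id,
           SUM((oi.unit_price * oi.qty - oi.item_discount - oi.item_refund)
               * fx.rate_to_usd) AS net_revenue_usd
    FROM f_order_item oi
    JOIN d_currency_rate fx
      ON fx.from_ccy = oi.ccy AND fx.rate_date = oi.order_date
    WHERE oi.status_final = 1 AND oi.is_tax = 0 AND oi.is_shipping = 0
    GROUP BY 1, 2
""").fetchone()
print(row[0], row[1], round(row[2], 2))  # 2024-01-10 1 49.9
```

The non-final order never reaches the metric, and the EUR line converts at the order-date rate, with no dashboard-side decisions involved.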
Example 2: Weekly Active Users (WAU)
Definition: Unique users with at least one qualifying event during the week; activity is recorded daily at user-day grain and rolled up to calendar weeks via the calendar dimension.
Grain: user_id by snapshot_date
-- v_metric_wau_daily
WITH active AS (
  SELECT snapshot_date, global_user_id
  FROM f_event e
  JOIN d_user u ON e.user_id = u.user_id
  WHERE u.is_internal = false AND e.event_qualifies = true
  GROUP BY snapshot_date, global_user_id
)
SELECT dcal.week_start, COUNT(DISTINCT a.global_user_id) AS wau
FROM d_calendar dcal
LEFT JOIN active a
  ON a.snapshot_date BETWEEN dcal.week_start
                         AND dcal.week_start + INTERVAL '6 days'
GROUP BY 1;
Conformance via global_user_id prevents double-counting across platforms.
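The double-counting risk is easy to demonstrate. In this sketch the platform-specific ids, the global-id mapping, and the events are all invented; the contrast is between counting platform ids and counting conformed global ids.

```python
# Sketch: de-duplicating cross-platform activity with a global user key.
# One person is active on both iOS and web in the same week.
events = [
    ("ios_123", "2024-03-04"),
    ("web_999", "2024-03-05"),  # same person as ios_123
    ("web_777", "2024-03-06"),
]
# Stand-in for d_user: platform-specific id -> global_user_id
global_id = {"ios_123": "g1", "web_999": "g1", "web_777": "g2"}

wau_naive = len({uid for uid, _ in events})        # double-counts g1
wau = len({global_id[uid] for uid, _ in events})   # conformed key
print(wau_naive, wau)  # 3 2
```

Encoding the mapping in the user dimension means every consumer gets the deduplicated count; leaving it to dashboards guarantees that someone will report 3.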
Example 3: Conversion Rate (session to purchase)
Definition: number of users who purchased within 7 days of a qualified session divided by number of users with a qualified session in that week. Attribution: last qualified session within the window.
Grain: user-week
-- v_metric_conversion_user_week
WITH sessions AS (
  SELECT user_id, week_start FROM v_qualified_sessions GROUP BY 1, 2
),
purchases AS (
  SELECT user_id, week_start
  FROM v_user_purchases_attributed_last_touch_7d
  GROUP BY 1, 2
)
SELECT s.week_start,
       COUNT(DISTINCT p.user_id) * 1.0
         / NULLIF(COUNT(DISTINCT s.user_id), 0) AS conversion_rate
FROM sessions s
LEFT JOIN purchases p USING (user_id, week_start)
GROUP BY 1;
Denominator and numerator share the same grain and window to avoid inflated rates.
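A minimal sketch of that grain alignment, with invented users and weeks: both sides are deduplicated to user-week before dividing, so repeat sessions or repeat purchases cannot inflate the rate.

```python
# Numerator and denominator computed at the same user-week grain.
# Sets deduplicate repeat sessions/purchases automatically.
sessions = {("u1", "2024-W10"), ("u2", "2024-W10")}
purchases = {("u2", "2024-W10")}

week = "2024-W10"
denom = {u for u, w in sessions if w == week}
num = {u for u, w in purchases if w == week and u in denom}
rate = len(num) / len(denom) if denom else 0.0
print(rate)  # 0.5
```

Restricting the numerator to users present in the denominator also guards against attribution windows pulling in purchasers whose qualifying session fell in a different week.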
Example 4: Inventory balance (semi-additive)
Definition: End-of-day on-hand quantity by product and warehouse; sum across warehouses is valid; across time use last-known.
Grain: product-warehouse-day snapshot
-- v_inventory_eod
SELECT snapshot_date, product_id, warehouse_id, on_hand_qty
FROM f_inventory_snapshot;
-- For month-end balance: keep only the last snapshot on or before
-- month end per product-warehouse, then sum across warehouses.
-- (Using FIRST_VALUE alone would leave one row per snapshot date
-- and overcount when summed.)
SELECT month_end, product_id, SUM(on_hand_qty) AS on_hand_qty
FROM (
  SELECT d.month_end, i.product_id, i.warehouse_id, i.on_hand_qty,
         ROW_NUMBER() OVER (
           PARTITION BY i.product_id, i.warehouse_id, d.month_end
           ORDER BY i.snapshot_date DESC
         ) AS rn
  FROM d_calendar_month d
  JOIN f_inventory_snapshot i ON i.snapshot_date <= d.month_end
) x
WHERE rn = 1
GROUP BY 1, 2;
Using snapshots and appropriate aggregation avoids double-counting and drift.
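The semi-additive rule can be checked with a few invented snapshot rows for a single product: sum across warehouses is valid, but summing across time overstates the balance.

```python
# Semi-additive aggregation: last-known across time, sum across entities.
# (snapshot_date, warehouse, on_hand_qty) rows are invented.
snapshots = [
    ("2024-03-29", "wh1", 100),
    ("2024-03-30", "wh1", 80),  # last known for wh1 in March
    ("2024-03-28", "wh2", 50),  # last known for wh2 in March
]
month_end = "2024-03-31"

latest = {}
for d, wh, qty in sorted(snapshots):  # ISO dates sort chronologically
    if d <= month_end:
        latest[wh] = qty  # later snapshots overwrite earlier ones

eom_balance = sum(latest.values())            # correct: 80 + 50 = 130
naive_sum = sum(q for _, _, q in snapshots)   # wrong: 230, sums across time
print(eom_balance, naive_sum)  # 130 230
```

The naive sum counts wh1 twice (once per snapshot date), which is exactly the drift the last-known rule prevents.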
Step-by-step design process
- Write the business question in one sentence. Example: What is weekly net revenue in USD by product line?
- Choose the atomic grain you need (order item, user-event, snapshot).
- Define inclusions/exclusions as default filters (statuses, internal/test users, currency rules).
- Specify the exact formula and any windows (e.g., 7-day lookback).
- Map required conformed dimensions and keys.
- Implement as a governed view or semantic metric with tests.
- Version the metric if changing meaning; communicate deprecation timeline.
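The outcome of those steps can be captured as a machine-readable spec, so documentation, SQL generation, and tests all read one source. The class and field names below are illustrative, not a standard schema.

```python
from dataclasses import dataclass

# One structured record per metric: the single place where grain, filters,
# formula, owner, and version live. Field names are illustrative.
@dataclass(frozen=True)
class MetricSpec:
    name: str
    version: int
    grain: str
    formula: str
    default_filters: tuple
    owner: str

net_revenue = MetricSpec(
    name="net_revenue_usd",
    version=1,
    grain="order_item_id",
    formula="sum((unit_price * qty - item_discount - item_refund) * rate_to_usd)",
    default_filters=("status in ('Completed','Shipped')",
                     "is_tax = false", "is_shipping = false"),
    owner="data-architecture",
)
print(net_revenue.grain)  # order_item_id
```

Making the spec frozen (immutable) mirrors the governance rule: changing meaning requires a new version, not an in-place edit.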
Design checklist
- Single source of truth view/metric exists
- Grain documented and testable
- Default filters encoded in SQL/semantic layer
- Conformed dimensions joined with surrogate keys
- Semi-additive logic handled where needed
- Version and owner documented
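"Grain documented and testable" can be enforced mechanically: the metric output must have at most one row per declared grain. This sketch materializes an invented result table in sqlite3 and asserts grain uniqueness.

```python
import sqlite3

# Grain test sketch: fail if the metric output has duplicate rows at its
# declared grain (order_date, product_id). Table name and data are invented.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE v_metric_net_revenue
    (order_date TEXT, product_id INT, net_revenue_usd REAL);
INSERT INTO v_metric_net_revenue VALUES
    ('2024-01-10', 1, 49.9),
    ('2024-01-11', 1, 12.0);
""")
dupes = con.execute("""
    SELECT order_date, product_id, COUNT(*) AS n
    FROM v_metric_net_revenue
    GROUP BY 1, 2
    HAVING COUNT(*) > 1
""").fetchall()
assert dupes == [], f"grain violated: {dupes}"
print("grain test passed")
```

The same HAVING pattern works in any warehouse's test framework; a non-empty result means the documented grain and the actual output have drifted apart.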
Governance and change control
- Metric catalog: name, owner, version, grain, filters, formula, tests.
- Versioning: introduce v2 side-by-side; set a deprecation date for v1.
- Backfill policy: how far to recompute after late data corrections.
- Data quality gates: block or flag metric outputs when inputs fail checks.
Common mistakes and self-check
- Different grains in numerator vs denominator (inflated rates). Fix by aligning grain before division.
- Ad-hoc filters in dashboards (exclude internal users) not encoded centrally. Fix by moving to the metric view.
- Mixing currencies without consistent conversion date. Fix by joining currency rate on transaction date.
- Summing semi-additive balances across time. Fix by using last-known or snapshot logic.
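The first mistake above (mismatched grains in a ratio) is worth seeing numerically. With invented users and purchase events, counting events over distinct users inflates the rate:

```python
# Grain-mismatch sketch: purchase *events* over distinct session *users*.
session_users = {"u1", "u2", "u3", "u4"}
purchase_events = [("u1", "e1"), ("u1", "e2"), ("u2", "e3")]  # u1 bought twice

inflated = len(purchase_events) / len(session_users)                 # events/users
correct = len({u for u, _ in purchase_events}) / len(session_users)  # users/users
print(inflated, correct)  # 0.75 0.5
```

Aligning both sides to user grain before dividing turns the reported 75% back into the true 50%.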
Self-check prompts
- Can two teams implement the metric without asking you questions?
- Is there exactly one authoritative place to compute it?
- Do tests fail if grain/filters/formula change?
Exercises
Do these now. They mirror the graded exercises below.
- Define a canonical Net Revenue metric across currencies with refunds and discounts. Specify grain, filters, conversion logic, and output columns.
- Define WAU using cross-platform events. Specify deduplication via a global user key, the 7-day window, and internal user exclusions.
Pre-submit checklist
- Grain clearly stated
- Default filters explicit
- Formula unambiguous
- Conformed dimensions identified
- Edge cases addressed (refunds, late events, internal users)
Practical projects
- Build a small metric mart with three metrics: Net Revenue, WAU, and Conversion Rate. Publish as three governed views with tests.
- Refactor a dashboard that reimplements filters into one that only queries governed metrics. Document the change impacts.
- Introduce metric versioning: create Revenue v2 that changes tax handling; run a side-by-side comparison and a deprecation plan.
Mini challenge
Your marketing team wants a new metric: Qualified Leads Converted (QLC) defined as leads that become paying customers within 30 days of first contact. Design the metric so it stays consistent across tools. Write grain, filters, formula, and any windows.
Suggested approach
- Grain: lead_id
- Filters: exclude test/demo leads; first_contact_channel in approved set
- Formula: count of lead_id where exists a paid_customer event within 30 days of first_contact_date
- Exposure: v_metric_qlc view joining d_lead, f_events, d_calendar; document attribution (first-touch)
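One possible sketch of the QLC formula itself, with invented leads and paid events, showing the 30-day window applied from first contact:

```python
from datetime import date, timedelta

# QLC sketch: a lead counts if a paid event occurs within 30 days of
# first contact. Lead ids, dates, and events are invented.
first_contact = {"l1": date(2024, 1, 1), "l2": date(2024, 1, 5)}
first_paid_event = {"l1": date(2024, 1, 20), "l2": date(2024, 3, 1)}

WINDOW = timedelta(days=30)
qlc = sum(
    1 for lead, contacted in first_contact.items()
    if lead in first_paid_event
    and first_paid_event[lead] - contacted <= WINDOW
)
print(qlc)  # 1  (l2 converted outside the window)
```

Pinning the window to first_contact_date in the definition keeps the metric stable no matter which tool or refresh schedule evaluates it.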
Learning path
- Up next: conformed dimensions deep dive and semantic modeling patterns.
- Then: data tests for metrics (grain, filters, invariants) and DQ gates.
- Finally: change management and metric versioning in production.
Next steps
- Finish the exercises below and compare with the provided solutions.
- Take the Quick Test at the end of this page. Everyone can take it; sign in if you want your progress saved.
- Apply one design change to an existing metric in your environment this week.