Who this is for
This lesson is for Data Architects, Analytics Engineers, and BI Developers who need consistent, trusted metrics across dashboards, models, and teams.
Prerequisites
- Basics of dimensional modeling (facts, dimensions, grain)
- Comfort with SQL aggregations and joins
- Familiarity with semantic layers or data marts (nice to have)
Why this matters
- Leadership decisions rely on everyone seeing the same number. Inconsistent definitions break trust and waste meeting time on reconciling figures.
- Data Architects must encode metric rules (grain, filters, formulas) into models so every tool computes the same result.
- Consistent metrics enable reliable experimentation, forecasting, and regulatory reporting.
Concept explained simply
Metric consistency by design means you make it hard to be inconsistent. You define metrics once, at a clear grain, with fixed filters and formulas, and you expose them from a single, governed place (a view, a semantic model, or a data mart). Dashboards should reuse the metric, not re-implement it.
- Grain: the level at which the metric is computed (e.g., order item, day, user-session).
- Filters: standardized inclusions/exclusions (e.g., only completed orders, exclude internal users).
- Formula: precise calculation steps (e.g., net revenue = gross - discounts - refunds; exclude tax and shipping).
- Conformed dimensions: shared dimensions (date, product, customer) with consistent keys and attributes.
Quick example
If you define Net Revenue at order-item grain, restricted to Completed/Shipped orders, with currency converted to USD at the order-date rate, then every dashboard that queries that definition will match.
Mental model
Think of a metric as a triangle of Definition, Grain, and Filters. If any corner changes, the number changes. Lock all three in one place.
- Definition: documented formula and business meaning.
- Grain: lowest atomic level needed for robust roll-ups.
- Filters: default constraints applied centrally, not in dashboards.
Three-layer guardrails
- Data layer: atomic facts + conformed dimensions.
- Semantic layer: reusable metric definitions.
- Consumption: dashboards use metrics, not raw tables.
Design patterns that enforce consistency
- Conformed dimensions across marts (date, customer, product, currency).
- Atomic fact tables; derive metrics in governed views/semantic models.
- Snapshots for balances (inventory, ARR); accumulating snapshots for funnel stages.
- Default filters encoded in metric views (e.g., exclude test/internal users).
- Role-playing date dimensions (order date vs ship date) to avoid ambiguity.
- Surrogate keys and SCD rules defined once (SCD2 for history where needed).
- Currency normalization via a currency rate dimension keyed by date.
- Metric versioning (Revenue v1, Revenue v2) with deprecation windows.
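To make the currency-normalization pattern concrete, here is a minimal sketch in Python. The rate table, function name, and rates are invented for illustration; the point is that conversion is keyed by the transaction date, so the same order always converts the same way.

```python
from datetime import date

# Illustrative stand-in for a currency rate dimension keyed by date:
# (rate_date, from_ccy) -> rate_to_usd. Rates here are invented.
rates = {
    (date(2024, 1, 10), "EUR"): 1.10,
    (date(2024, 1, 11), "EUR"): 1.08,
}

def to_usd(amount, ccy, order_date):
    """Convert using the rate at the order date, never 'today's' rate."""
    if ccy == "USD":
        return amount
    return amount * rates[(order_date, ccy)]

# The conversion is deterministic regardless of when the query runs.
print(round(to_usd(100.0, "EUR", date(2024, 1, 10)), 2))  # 110.0
```

Because the rate lookup is part of the metric definition rather than a dashboard-side choice, two tools querying the same order can never disagree on its USD value.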
Advanced guardrails
- Semi-additive handling: sums across entities, last-known across time.
- Attribution rules centralized (e.g., first-touch vs last-touch).
- Late-arriving data reprocessing windows documented and automated.
Worked examples
Example 1: Net Revenue (multi-currency, refunds, tax)
Definition: Net Revenue (USD) = sum over order items of (unit_price * qty - item_discount - item_refund) converted to USD at order date. Exclude tax, shipping; include only Completed/Shipped.
Grain: order_item_id
-- Pseudo-SQL view: v_metric_net_revenue
SELECT oi.order_date, d_product.product_id,
       SUM((oi.unit_price * oi.qty - oi.item_discount - oi.item_refund)
           * fx.rate_to_usd) AS net_revenue_usd
FROM f_order_item oi
JOIN d_order_status s ON oi.status_id = s.status_id AND s.final = true
JOIN d_currency_rate fx ON fx.from_ccy = oi.ccy AND fx.to_ccy = 'USD'
                       AND fx.rate_date = oi.order_date
JOIN d_product ON oi.product_id = d_product.product_id
WHERE oi.is_tax = false AND oi.is_shipping = false
GROUP BY 1, 2;
Because grain, filters, and currency logic live in the view, every dashboard matches.
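The same logic can be exercised end to end with an in-memory database. This sketch uses Python's sqlite3 with a flattened status flag (status_final) in place of the status-dimension join; all table contents are invented.

```python
import sqlite3

# Tiny in-memory demo of the governed net-revenue logic above.
# Schema is simplified (status flattened to a flag); data is invented.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE f_order_item (order_date TEXT, product_id INT, ccy TEXT,
    unit_price REAL, qty INT, item_discount REAL, item_refund REAL,
    status_final INT, is_tax INT, is_shipping INT);
CREATE TABLE d_currency_rate (rate_date TEXT, from_ccy TEXT, rate_to_usd REAL);
INSERT INTO f_order_item VALUES
    ('2024-01-10', 1, 'USD', 20.0, 2, 0.0, 0.0, 1, 0, 0),  -- counts: 40.00 USD
    ('2024-01-10', 1, 'EUR', 10.0, 1, 1.0, 0.0, 1, 0, 0),  -- counts: 9 EUR * 1.10
    ('2024-01-10', 1, 'USD', 99.0, 1, 0.0, 0.0, 0, 0, 0);  -- excluded: not final
INSERT INTO d_currency_rate VALUES
    ('2024-01-10', 'USD', 1.0), ('2024-01-10', 'EUR', 1.10);
""")
row = con.execute("""
    SELECT oi.order_date, oi.product_id,
           SUM((oi.unit_price * oi.qty - oi.item_discount - oi.item_refund)
               * fx.rate_to_usd) AS net_revenue_usd
    FROM f_order_item oi
    JOIN d_currency_rate fx
      ON fx.from_ccy = oi.ccy AND fx.rate_date = oi.order_date
    WHERE oi.status_final = 1 AND oi.is_tax = 0 AND oi.is_shipping = 0
    GROUP BY 1, 2
""").fetchone()
print(row[0], row[1], round(row[2], 2))  # 2024-01-10 1 49.9
```

The non-final order never reaches the metric, and the EUR line converts at the order-date rate, with no dashboard-side decisions involved.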
Example 2: Weekly Active Users (WAU)
Definition: Unique users with at least one qualifying event during the week; activity is recorded daily at user-day grain and rolled up to calendar weeks via the calendar dimension.
Grain: user_id by snapshot_date
-- v_metric_wau_daily
WITH active AS (
  SELECT snapshot_date, global_user_id
  FROM f_event e
  JOIN d_user u ON e.user_id = u.user_id
  WHERE u.is_internal = false AND e.event_qualifies = true
  GROUP BY snapshot_date, global_user_id
)
SELECT dcal.week_start, COUNT(DISTINCT a.global_user_id) AS wau
FROM d_calendar dcal
LEFT JOIN active a
  ON a.snapshot_date BETWEEN dcal.week_start
                         AND dcal.week_start + INTERVAL '6 days'
GROUP BY 1;
Conformance via global_user_id prevents double-counting across platforms.
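The double-counting risk is easy to demonstrate. In this sketch the platform-specific ids, the global-id mapping, and the events are all invented; the contrast is between counting platform ids and counting conformed global ids.

```python
# Sketch: de-duplicating cross-platform activity with a global user key.
# One person is active on both iOS and web in the same week.
events = [
    ("ios_123", "2024-03-04"),
    ("web_999", "2024-03-05"),  # same person as ios_123
    ("web_777", "2024-03-06"),
]
# Stand-in for d_user: platform-specific id -> global_user_id
global_id = {"ios_123": "g1", "web_999": "g1", "web_777": "g2"}

wau_naive = len({uid for uid, _ in events})        # double-counts g1
wau = len({global_id[uid] for uid, _ in events})   # conformed key
print(wau_naive, wau)  # 3 2
```

Encoding the mapping in the user dimension means every consumer gets the deduplicated count; leaving it to dashboards guarantees that someone will report 3.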
Example 3: Conversion Rate (session to purchase)
Definition: number of users who purchased within 7 days of a qualified session divided by number of users with a qualified session in that week. Attribution: last qualified session within the window.
Grain: user-week
-- v_metric_conversion_user_week
WITH sessions AS (
  SELECT user_id, week_start FROM v_qualified_sessions GROUP BY 1, 2
),
purchases AS (
  SELECT user_id, week_start
  FROM v_user_purchases_attributed_last_touch_7d
  GROUP BY 1, 2
)
SELECT s.week_start,
       COUNT(DISTINCT p.user_id) * 1.0
         / NULLIF(COUNT(DISTINCT s.user_id), 0) AS conversion_rate
FROM sessions s
LEFT JOIN purchases p USING (user_id, week_start)
GROUP BY 1;
Denominator and numerator share the same grain and window to avoid inflated rates.
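A minimal sketch of that grain alignment, with invented users and weeks: both sides are deduplicated to user-week before dividing, so repeat sessions or repeat purchases cannot inflate the rate.

```python
# Numerator and denominator computed at the same user-week grain.
# Sets deduplicate repeat sessions/purchases automatically.
sessions = {("u1", "2024-W10"), ("u2", "2024-W10")}
purchases = {("u2", "2024-W10")}

week = "2024-W10"
denom = {u for u, w in sessions if w == week}
num = {u for u, w in purchases if w == week and u in denom}
rate = len(num) / len(denom) if denom else 0.0
print(rate)  # 0.5
```

Restricting the numerator to users present in the denominator also guards against attribution windows pulling in purchasers whose qualifying session fell in a different week.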
Example 4: Inventory balance (semi-additive)
Definition: End-of-day on-hand quantity by product and warehouse; sum across warehouses is valid; across time use last-known.
Grain: product-warehouse-day snapshot
-- v_inventory_eod
SELECT snapshot_date, product_id, warehouse_id, on_hand_qty
FROM f_inventory_snapshot;
-- For month-end balance: keep only the last snapshot on or before
-- month end per product-warehouse, then sum across warehouses.
-- (Using FIRST_VALUE alone would leave one row per snapshot date
-- and overcount when summed.)
SELECT month_end, product_id, SUM(on_hand_qty) AS on_hand_qty
FROM (
  SELECT d.month_end, i.product_id, i.warehouse_id, i.on_hand_qty,
         ROW_NUMBER() OVER (
           PARTITION BY i.product_id, i.warehouse_id, d.month_end
           ORDER BY i.snapshot_date DESC
         ) AS rn
  FROM d_calendar_month d
  JOIN f_inventory_snapshot i ON i.snapshot_date <= d.month_end
) x
WHERE rn = 1
GROUP BY 1, 2;
Using snapshots and appropriate aggregation avoids double-counting and drift.
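The semi-additive rule can be checked with a few invented snapshot rows for a single product: sum across warehouses is valid, but summing across time overstates the balance.

```python
# Semi-additive aggregation: last-known across time, sum across entities.
# (snapshot_date, warehouse, on_hand_qty) rows are invented.
snapshots = [
    ("2024-03-29", "wh1", 100),
    ("2024-03-30", "wh1", 80),  # last known for wh1 in March
    ("2024-03-28", "wh2", 50),  # last known for wh2 in March
]
month_end = "2024-03-31"

latest = {}
for d, wh, qty in sorted(snapshots):  # ISO dates sort chronologically
    if d <= month_end:
        latest[wh] = qty  # later snapshots overwrite earlier ones

eom_balance = sum(latest.values())            # correct: 80 + 50 = 130
naive_sum = sum(q for _, _, q in snapshots)   # wrong: 230, sums across time
print(eom_balance, naive_sum)  # 130 230
```

The naive sum counts wh1 twice (once per snapshot date), which is exactly the drift the last-known rule prevents.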
Step-by-step design process
- Write the business question in one sentence. Example: What is weekly net revenue in USD by product line?
- Choose the atomic grain you need (order item, user-event, snapshot).
- Define inclusions/exclusions as default filters (statuses, internal/test users, currency rules).
- Specify the exact formula and any windows (e.g., 7-day lookback).
- Map required conformed dimensions and keys.
- Implement as a governed view or semantic metric with tests.
- Version the metric if changing meaning; communicate deprecation timeline.
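The outcome of those steps can be captured as a machine-readable spec, so documentation, SQL generation, and tests all read one source. The class and field names below are illustrative, not a standard schema.

```python
from dataclasses import dataclass

# One structured record per metric: the single place where grain, filters,
# formula, owner, and version live. Field names are illustrative.
@dataclass(frozen=True)
class MetricSpec:
    name: str
    version: int
    grain: str
    formula: str
    default_filters: tuple
    owner: str

net_revenue = MetricSpec(
    name="net_revenue_usd",
    version=1,
    grain="order_item_id",
    formula="sum((unit_price * qty - item_discount - item_refund) * rate_to_usd)",
    default_filters=("status in ('Completed','Shipped')",
                     "is_tax = false", "is_shipping = false"),
    owner="data-architecture",
)
print(net_revenue.grain)  # order_item_id
```

Making the spec frozen (immutable) mirrors the governance rule: changing meaning requires a new version, not an in-place edit.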
Design checklist
- Single source of truth view/metric exists
- Grain documented and testable
- Default filters encoded in SQL/semantic layer
- Conformed dimensions joined with surrogate keys
- Semi-additive logic handled where needed
- Version and owner documented
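"Grain documented and testable" can be enforced mechanically: the metric output must have at most one row per declared grain. This sketch materializes an invented result table in sqlite3 and asserts grain uniqueness.

```python
import sqlite3

# Grain test sketch: fail if the metric output has duplicate rows at its
# declared grain (order_date, product_id). Table name and data are invented.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE v_metric_net_revenue
    (order_date TEXT, product_id INT, net_revenue_usd REAL);
INSERT INTO v_metric_net_revenue VALUES
    ('2024-01-10', 1, 49.9),
    ('2024-01-11', 1, 12.0);
""")
dupes = con.execute("""
    SELECT order_date, product_id, COUNT(*) AS n
    FROM v_metric_net_revenue
    GROUP BY 1, 2
    HAVING COUNT(*) > 1
""").fetchall()
assert dupes == [], f"grain violated: {dupes}"
print("grain test passed")
```

The same HAVING pattern works in any warehouse's test framework; a non-empty result means the documented grain and the actual output have drifted apart.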
Governance and change control
- Metric catalog: name, owner, version, grain, filters, formula, tests.
- Versioning: introduce v2 side-by-side; set a deprecation date for v1.
- Backfill policy: how far to recompute after late data corrections.
- Data quality gates: block or flag metric outputs when inputs fail checks.
Common mistakes and self-check
- Different grains in numerator vs denominator (inflated rates). Fix by aligning grain before division.
- Ad-hoc filters in dashboards (exclude internal users) not encoded centrally. Fix by moving to the metric view.
- Mixing currencies without consistent conversion date. Fix by joining currency rate on transaction date.
- Summing semi-additive balances across time. Fix by using last-known or snapshot logic.
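The first mistake above (mismatched grains in a ratio) is worth seeing numerically. With invented users and purchase events, counting events over distinct users inflates the rate:

```python
# Grain-mismatch sketch: purchase *events* over distinct session *users*.
session_users = {"u1", "u2", "u3", "u4"}
purchase_events = [("u1", "e1"), ("u1", "e2"), ("u2", "e3")]  # u1 bought twice

inflated = len(purchase_events) / len(session_users)                 # events/users
correct = len({u for u, _ in purchase_events}) / len(session_users)  # users/users
print(inflated, correct)  # 0.75 0.5
```

Aligning both sides to user grain before dividing turns the reported 75% back into the true 50%.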
Self-check prompts
- Can two teams implement the metric without asking you questions?
- Is there exactly one authoritative place to compute it?
- Do tests fail if grain/filters/formula change?
Exercises
Do these now. They mirror the graded exercises below.
- Define a canonical Net Revenue metric across currencies with refunds and discounts. Specify grain, filters, conversion logic, and output columns.
- Define WAU using cross-platform events. Specify deduplication via a global user key, the 7-day window, and internal user exclusions.
Pre-submit checklist
- Grain clearly stated
- Default filters explicit
- Formula unambiguous
- Conformed dimensions identified
- Edge cases addressed (refunds, late events, internal users)
Practical projects
- Build a small metric mart with three metrics: Net Revenue, WAU, and Conversion Rate. Publish as three governed views with tests.
- Refactor a dashboard that reimplements filters into one that only queries governed metrics. Document the change impacts.
- Introduce metric versioning: create Revenue v2 that changes tax handling; run a side-by-side comparison and a deprecation plan.
Mini challenge
Your marketing team wants a new metric: Qualified Leads Converted (QLC) defined as leads that become paying customers within 30 days of first contact. Design the metric so it stays consistent across tools. Write grain, filters, formula, and any windows.
Suggested approach
- Grain: lead_id
- Filters: exclude test/demo leads; first_contact_channel in approved set
- Formula: count of lead_id where exists a paid_customer event within 30 days of first_contact_date
- Exposure: v_metric_qlc view joining d_lead, f_events, d_calendar; document attribution (first-touch)
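One possible sketch of the QLC formula itself, with invented leads and paid events, showing the 30-day window applied from first contact:

```python
from datetime import date, timedelta

# QLC sketch: a lead counts if a paid event occurs within 30 days of
# first contact. Lead ids, dates, and events are invented.
first_contact = {"l1": date(2024, 1, 1), "l2": date(2024, 1, 5)}
first_paid_event = {"l1": date(2024, 1, 20), "l2": date(2024, 3, 1)}

WINDOW = timedelta(days=30)
qlc = sum(
    1 for lead, contacted in first_contact.items()
    if lead in first_paid_event
    and first_paid_event[lead] - contacted <= WINDOW
)
print(qlc)  # 1  (l2 converted outside the window)
```

Pinning the window to first_contact_date in the definition keeps the metric stable no matter which tool or refresh schedule evaluates it.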
Learning path
- Up next: conformed dimensions deep dive and semantic modeling patterns.
- Then: data tests for metrics (grain, filters, invariants) and DQ gates.
- Finally: change management and metric versioning in production.
Next steps
- Finish the exercises below and compare with the provided solutions.
- Take the Quick Test at the end of this page. Everyone can take it; sign in if you want your progress saved.
- Apply one design change to an existing metric in your environment this week.