Why this matters
BI Analysts are asked to deliver trustworthy numbers across dashboards, self-serve tools, and ad‑hoc analysis. When every team defines the same KPI differently, decisions drift. A semantic layer fixes this by defining metrics once, centrally, and reusing them everywhere.
Real tasks you will face:
- Documenting the single source of truth for KPIs like Revenue, Active Users, and Conversion Rate.
- Translating business rules into unambiguous metric logic and valid dimensions.
- Preventing double counting across joins and time windows.
- Managing changes without breaking reports.
Who this is for
- BI Analysts and Analytics Engineers who own KPI definitions.
- Product/Data Analysts who build dashboards or self-serve datasets.
- Team leads who need consistent performance metrics.
Prerequisites
- Comfort with SQL joins, aggregates, and window functions.
- Basic understanding of star schemas and dimensions/facts.
- Familiarity with at least one BI tool or semantic modeling approach.
Concept explained simply
Define a metric once in your semantic layer and everyone uses the same definition. That single definition includes name, formula, filters, time grain, valid dimensions, default aggregation, and owner/governance. Tools then reference that definition rather than re‑implementing it.
Mental model
Think of a metric as a contract. The contract states exactly how the number is calculated and in which contexts it remains valid. If the contract changes, you version it and communicate the change.
Metric spec checklist (copy and use)
- Name and business description
- Formula (clear SQL/pseudocode), including numerator and denominator if applicable
- Included/excluded records (filters)
- Time grain and default time window
- Aggregation behavior: additive, semi‑additive, or non‑additive
- Allowed dimensions (safe to group by) and prohibited dimensions
- Primary keys and join assumptions
- Edge cases (refunds, cancellations, timezone, late‑arriving data)
- Data quality tests and acceptance thresholds
- Owner, version, change notes, deprecation plan
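The checklist above can be captured as a structured spec so every metric carries the same fields. A minimal sketch in Python follows; the field names are illustrative, not a standard schema:

```python
from dataclasses import dataclass, field

@dataclass
class MetricSpec:
    # Field names mirror the checklist; adapt them to your semantic layer.
    name: str
    description: str
    formula: str                      # SQL/pseudocode for the calculation
    filters: list[str]
    time_grain: str                   # e.g. "day", "week"
    default_window: str               # e.g. "last 28 days"
    aggregation: str                  # "additive" | "semi-additive" | "non-additive"
    allowed_dimensions: list[str]
    edge_cases: list[str] = field(default_factory=list)
    tests: list[str] = field(default_factory=list)
    owner: str = "unassigned"
    version: str = "1.0.0"

net_revenue = MetricSpec(
    name="Net Revenue",
    description="Money kept from completed sales after discounts and refunds.",
    formula="SUM(order_amount - discounts - refunds)",
    filters=["order_status IN ('completed', 'fulfilled')", "NOT is_test_order"],
    time_grain="day",
    default_window="last 28 days",
    aggregation="additive",
    allowed_dimensions=["date", "product_category", "region", "channel"],
    owner="bi-team",
)
```

Storing specs as structured objects (or YAML) makes it easy to diff versions and generate documentation automatically.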
Worked examples
Example 1: Net Revenue
Business intent: Money kept from completed sales after discounts and refunds.
- Formula (pseudocode): SUM(order_amount - discounts - refunds) on completed orders.
- Filters: order_status in ('completed', 'fulfilled'); currency normalized to USD via daily FX; exclude test orders.
- Time grain: day; Default window: last 28 days.
- Aggregation: Additive across orders and time; not additive across currency unless normalized.
- Allowed dimensions: date, product_category, region, channel, customer_segment.
- Edge cases: Late refunds booked within 30 days adjust net revenue on the day of the refund, not on the original order date.
- Tests: Daily null-rate < 0.1%; FX join coverage > 99.5%.
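The Net Revenue rules above can be sketched in a few lines of Python. The sample rows and field names are made up for illustration; the filter and formula follow the spec:

```python
# Made-up sample orders; field names are illustrative.
orders = [
    {"id": 1, "status": "completed", "amount": 100.0, "discounts": 10.0, "refunds": 0.0,  "is_test": False},
    {"id": 2, "status": "cancelled", "amount": 50.0,  "discounts": 0.0,  "refunds": 0.0,  "is_test": False},  # excluded: not completed
    {"id": 3, "status": "fulfilled", "amount": 80.0,  "discounts": 5.0,  "refunds": 20.0, "is_test": False},
    {"id": 4, "status": "completed", "amount": 999.0, "discounts": 0.0,  "refunds": 0.0,  "is_test": True},   # excluded: test order
]

COMPLETED_STATUSES = {"completed", "fulfilled"}

def compute_net_revenue(rows):
    """SUM(amount - discounts - refunds) over completed, non-test orders."""
    return sum(
        r["amount"] - r["discounts"] - r["refunds"]
        for r in rows
        if r["status"] in COMPLETED_STATUSES and not r["is_test"]
    )

print(compute_net_revenue(orders))  # 90.0 + 55.0 = 145.0
```

Writing the filter as a shared constant (here `COMPLETED_STATUSES`) is the point of the contract: every consumer applies the same inclusion rule.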
Example 2: DAU (Daily Active Users)
Business intent: Unique users who were active on a given day.
- Formula: COUNT(DISTINCT user_id) from events where event_type IN activity_events.
- Filters: Exclude internal/test users; activity_events defined by product.
- Time grain: day (timezone = product canonical TZ, e.g., UTC).
- Aggregation: Non‑additive across time; report as distinct by day; rolling windows require windowed logic.
- Allowed dimensions: date, platform, country, acquisition_channel.
- Edge cases: Late events accepted for 24 hours; after that, they are discarded from DAU.
- Tests: For any day, DAU ≤ MAU; average DAU over the month ≥ MAU/31; internal users = 0.
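The non-additivity rule deserves a concrete demonstration: summing daily DAU double counts users who were active on more than one day. A minimal sketch with made-up events:

```python
# Made-up (day, user_id) activity events.
events = [
    ("2024-01-01", "u1"), ("2024-01-01", "u2"),
    ("2024-01-02", "u1"), ("2024-01-02", "u3"),
]

def dau(day):
    """Distinct users active on a single day."""
    return len({user for d, user in events if d == day})

# Wrong: adding daily DAU counts u1 twice.
summed_dau = dau("2024-01-01") + dau("2024-01-02")          # 2 + 2 = 4

# Right: distinct users over the whole window (windowed logic).
distinct_over_window = len({user for _, user in events})     # u1, u2, u3 = 3

print(summed_dau, distinct_over_window)  # 4 3
```

Any rolling metric (WAU, MAU) needs the windowed distinct, never a sum of daily values.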
Example 3: Conversion Rate (Visitor → Purchase)
Business intent: Share of unique visitors who made at least one purchase within the attribution window.
- Numerator: Distinct visitors with ≥1 completed order.
- Denominator: Distinct visitors who had a visit (session started).
- Filters: Bots excluded; only paid orders count; same 7‑day attribution window from first visit.
- Time grain: week; Default window: last 12 weeks.
- Aggregation: Non‑additive; always compute as ratio of distinct users at the reporting grain (avoid averaging daily rates).
- Allowed dimensions: week, device_type, region, campaign_group.
- Tests: Rate between 0% and 100%; sanity bounds alert if > 20% week‑over‑week change.
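The "avoid averaging daily rates" rule can be seen directly in numbers. A sketch with made-up daily counts (it ignores visitor dedup across days for simplicity):

```python
# Made-up (visitors, purchasers) per day.
daily = [
    (1000, 50),   # day 1: 5.0% conversion
    (100,  20),   # day 2: 20.0% conversion
]

# Wrong: averaging daily rates weights a tiny day equally with a huge one.
avg_of_rates = sum(p / v for v, p in daily) / len(daily)

# Right: total numerator / total denominator at the reporting grain.
ratio_of_totals = sum(p for _, p in daily) / sum(v for v, _ in daily)

print(round(avg_of_rates, 4), round(ratio_of_totals, 4))  # 0.125 0.0636
```

The two answers differ by a factor of two here; the ratio of totals is the contract's definition.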
How to define once for everyone
- Gather business rules via short stakeholder interviews; write examples of in/out data.
- Draft the metric spec using the checklist above.
- Validate in SQL with a reproducible query and sample outputs.
- Add tests (row coverage, bounds, uniqueness).
- Publish in your semantic layer and tag with version and owner.
- Communicate the definition and where it applies; migrate dashboards.
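The "add tests" step can start very small. A sketch of bound and null-coverage checks that could run after each load; thresholds and function names are illustrative, not a real framework's API:

```python
def check_bounds(rate, low=0.0, high=1.0):
    """Fail if a ratio metric falls outside its valid range."""
    assert low <= rate <= high, f"rate {rate} outside [{low}, {high}]"

def check_null_rate(rows, column, max_null_rate=0.001):
    """Fail if the column's null rate exceeds the acceptance threshold."""
    nulls = sum(1 for r in rows if r.get(column) is None)
    rate = nulls / len(rows) if rows else 0.0
    assert rate <= max_null_rate, f"{column} null rate {rate:.4%} over threshold"

# Made-up sample rows.
rows = [{"order_amount": 10.0}, {"order_amount": 25.5}, {"order_amount": 3.2}]
check_null_rate(rows, "order_amount")   # passes: no nulls
check_bounds(0.042)                     # passes: conversion rate in [0, 1]
print("all checks passed")
```

In practice these checks live in your pipeline (dbt tests, scheduled queries, or CI), but the logic is the same.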
Exercises
Do these before the quick test. Aim for clarity, then precision.
Mini checklist before you submit:
- Clear name and description
- Formula and filters are explicit
- Time grain and window stated
- Allowed dimensions listed
- Edge cases and tests included
Exercise 1: Write a metric spec
Create a complete spec for Net Revenue for your context using the checklist.
Exercise 2: Unify conflicting definitions
Resolve two competing Conversion Rate definitions into one contract, including a short migration plan.
Common mistakes and self-checks
- Mistake: Averaging ratios over time. Self-check: Recompute as total numerator / total denominator at the reporting grain.
- Mistake: Double counting after joins. Self-check: Confirm grain and primary keys; use distincts or proper bridge tables.
- Mistake: Ambiguous timezones/windows. Self-check: Declare timezone and attribution window in the spec.
- Mistake: Missing refund/cancellation logic. Self-check: Add explicit inclusion/exclusion and timing rules.
- Mistake: Silent changes. Self-check: Version every breaking change; note owner and date.
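The double-counting mistake is easiest to see with a join fan-out. Joining an order-grain table to an item-grain table multiplies each order row, so summing the order amount over the joined result inflates the total. A sketch with made-up data:

```python
# Made-up data: orders at order grain, items at line-item grain.
orders = [{"order_id": 1, "amount": 100.0}, {"order_id": 2, "amount": 40.0}]
items = [
    {"order_id": 1, "sku": "A"}, {"order_id": 1, "sku": "B"},  # order 1 has 2 items
    {"order_id": 2, "sku": "C"},
]

# Naive join: order 1 appears once per item, so its amount is counted twice.
joined = [o for i in items for o in orders if o["order_id"] == i["order_id"]]
naive_total = sum(o["amount"] for o in joined)     # 100 + 100 + 40 = 240.0

# Fix: aggregate at the declared grain (one row per order) before summing.
correct_total = sum(o["amount"] for o in orders)   # 140.0

print(naive_total, correct_total)  # 240.0 140.0
```

This is why the spec records the grain and join assumptions: a consumer who knows the metric is order-grain will pre-aggregate (or use a bridge table) before joining to item-level dimensions.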
Practical projects
- Spreadsheet semantic layer: Model three metrics (Net Revenue, DAU, Conversion Rate) from raw exports; write specs and tests; share the sheet with read-only formulas.
- SQL views: Build standardized views that implement your specs; create a small dashboard that references only these views.
- Change simulation: Introduce a breaking change (e.g., new refund rule) and run a mock version bump with release notes.
Learning path
- Before: Star schemas and conformed dimensions; basic SQL QA checks.
- Now: Defining metrics once for everyone (this lesson).
- Next: Metric tests, lineage, and controlled rollouts; enabling metrics in self‑serve tools.
Quick Test note
The quick test below is available to everyone. Only logged‑in users have their progress saved.
Mini challenge
Pick one high‑impact KPI in your organization and write a one‑page metric contract using the checklist. Share it with two stakeholders and incorporate feedback. Update your spec and add two automated tests.
Next steps
- Adopt the checklist for every new KPI.
- Add owner and version to existing metrics; write a short deprecation plan for old definitions.
- Run the quick test to confirm understanding, then start a small project from the list above.