Why this matters
Every KPI you publish depends on three pillars: measures (the numbers), dimensions (how you slice), and attributes (extra properties for labeling and filtering). Get these right, and your dashboards stay consistent across teams. Get them wrong, and you ship misleading charts and broken decisions.
- Daily work impact: define revenue and counts correctly so finance and product see the same truth.
- Speed: a clean semantic layer lets stakeholders self-serve without asking you to rebuild queries.
- Governance: clear definitions prevent silent metric drift between reports.
Concept explained simply
- Measure: a numeric value you aggregate (sum, avg, count). Example: Revenue, Orders, Users (distinct).
- Dimension: a field to group or filter by. Example: Date, Country, Product, Channel.
- Attribute: a property of a dimension used for labeling/sorting/filtering, not for summing. Example: Product Brand, Country Region, Customer Tier.
Mental model
Think of a sentence: "Sum Sales by Product Category over Month."
- Sales = measure (number you aggregate)
- Product Category, Month = dimensions (the by/over parts)
- Brand, Color, Segment = attributes (descriptors tied to a dimension)
Rule of thumb: if a value gets bigger when you add rows, it is a measure. If it breaks rows into groups, it is a dimension. If it describes a dimension and shouldn’t be added up, it’s an attribute.
Measure details and aggregation choices
- Additive: safe to sum across any dimension (e.g., Revenue).
- Semi-additive: safe across some dimensions but not time (e.g., Inventory at end of day). Handle with period-end or average-over-period logic.
- Non-additive: ratios/percentages (e.g., Conversion Rate). Aggregate from components, not by averaging ratios.
Common aggregations: Sum, Count, Count Distinct, Average, Min/Max. For ratios, define them as derived measures using consistent numerators/denominators (e.g., CTR = Clicks / Impressions).
Derived measures safely
- Average Order Value (AOV) = Revenue / Orders (use sums, not row-level averages).
- Gross Margin % = (Revenue - COGS) / Revenue.
- ROAS = Revenue / Spend.
Never sum ratios across rows; recompute from aggregated components.
Dimensions and attributes in practice
- Dimensions can have hierarchies: Date → Month → Quarter → Year; Product → Category → Department.
- Attributes are properties of a dimension key: Product Brand, Customer Segment, City Region.
- Role-playing dimensions: the same dimension used in different roles (Order Date, Ship Date).
- Conformed dimensions: shared across subject areas so "Country" means the same in Sales and Support.
- Sort/display: keep a key (e.g., product_id), a display name (Product Name), and a sort attribute when alphabetical order is wrong (e.g., Month Number to sort Jan..Dec).
Worked examples
Example 1 — E-commerce Orders
- Grain: one row per order line.
- Measures: Revenue = unit_price * quantity (sum), Units = quantity (sum), Orders = distinct order_id (count distinct), Discount Amount (sum).
- Dimensions: Date, Customer, Product, Channel.
- Attributes: Product Category, Brand, Customer Segment, Channel Type.
- Derived: AOV = sum(Revenue) / count_distinct(order_id).
Example 2 — SaaS Subscriptions
- Grain: one row per customer-month snapshot.
- Measures: MRR (semi-additive over time), New MRR (sum), Churned MRR (sum), Active Customers (distinct count).
- Dimensions: Snapshot Month, Plan, Region.
- Attributes: Plan Family, Region Group.
- Note: Reporting MRR across months uses month-end values or average of daily snapshots, not simple sums across time.
Example 3 — Retail Inventory
- Grain: one row per product-store-day.
- Measures: Inventory On Hand (semi-additive across time), Receipts (sum), Sales Units (sum).
- Dimensions: Date, Store, Product.
- Attribute: Product Size, Store Format.
- Derived: Stock Turnover = sum(Sales Units) / avg(Inventory On Hand).
Example 4 — Support Tickets
- Grain: one row per ticket.
- Measures: Tickets Created (count), Resolution Time Minutes (avg/median), SLA Breaches (count), CSAT Score (avg).
- Dimensions: Created Date, Channel, Priority, Assignee Team.
- Attributes: Priority Rank (for sorting), Channel Group.
Modeling guidelines (quick steps)
- Define the business grain: one row per what? (order line, ticket, daily snapshot)
- List measures and mark type: additive, semi-additive, non-additive.
- Pick default aggregations: sum, count distinct, avg, min/max.
- Identify dimensions and attach attributes (keys, names, sort-by fields).
- Name clearly with units: "Revenue (USD)", "Customers (distinct)".
- Write a one-line definition for each measure to avoid drift.
- Add guardrails: compute ratios from sums, use period-end for semi-additive, set unknown buckets for nulls.
Who this is for and prerequisites
Who this is for
- BI Analysts who define metrics and curate datasets for stakeholders.
- Analytics Engineers and Data Analysts aligning metrics across tools.
Prerequisites
- Basic SQL (GROUP BY, SUM/COUNT/AVG).
- Familiarity with fact/dimension concepts.
Learning path
- Step 1: Choose the grain and list measures.
- Step 2: Assign default aggregations and write short definitions.
- Step 3: Map dimensions and attach attributes (keys, labels, sort fields).
- Step 4: Define derived measures and time-aware logic (e.g., YTD, period-end).
- Step 5: Validate with a small dashboard and cross-check totals with a ground truth report.
Common mistakes and self-check
- Summing ratios (e.g., averaging conversion rates). Fix: compute from aggregated numerator/denominator.
- Treating semi-additive measures as additive across time. Fix: use last/avg over period.
- Using names as keys (e.g., Product Name). Fix: use stable IDs; keep names as attributes.
- Missing sort attributes (e.g., months alphabetical). Fix: add month_number for sorting.
- Inconsistent definitions across teams. Fix: centralize measure descriptions in the semantic layer.
- Overusing distinct counts. Fix: cache/aggregate at the right grain; validate cardinality.
Self-check
- Can you state the grain in one sentence?
- For each measure, can you name its default aggregation and additivity?
- Can every dimension be joined from a stable key?
- Do ratios recompute from sums at any grouping level?
Exercises
These mirror the interactive exercises below. Do them first, then take the quick test. Note: anyone can take the test; only logged-in users get saved progress.
- Exercise 1: Classify fields in a ride-hailing dataset and choose aggregations and grain.
- Exercise 2: Choose correct aggregations and define derived measures for a marketing dataset.
- Checklist before you move on:
- You can label each field as measure, dimension, or attribute.
- You know which measures are semi-additive.
- Ratios are defined from base components, not averaged.
Practical projects
- Build a mini sales mart: one orders fact, three dimensions (date, product, customer). Publish AOV, Revenue, Orders (distinct), and Margin%.
- Create a time-aware inventory model with period-end logic and demonstrate why summing inventory over months is misleading.
- Design a role-playing date dimension (order, ship, deliver) and show the same chart by different date roles.
- Publish a "Metrics Catalog" page that shows each measure, aggregation, and definition. Share with a stakeholder for feedback.
Next steps
- Deepen time intelligence: period-over-period, YTD, rolling windows.
- Model joins and relationships: many-to-one dimensions, role-playing dates, and bridge tables for many-to-many.
- Add governance: naming standards, versioning notes, and validation queries.
Mini challenge
Given fields: campaign_id, date, country, impressions, clicks, spend, conversions, revenue, device_type.
- Pick the grain.
- List measures with default aggregations.
- Define CTR, CPC, CPA, and ROAS correctly.
- Name two dimensions and one attribute for display/sorting.
Can your definitions produce the same numbers at daily, weekly, and monthly rollups? If yes, you nailed it.
Quick Test
Take the test to check your understanding. Anyone can take it; only logged-in users get saved progress.