luvv to helpDiscover the Best Free Online Tools
Topic 1 of 8

Fact And Dimension Tables

Learn Fact And Dimension Tables for free with explanations, exercises, and a quick test (for BI Analyst).

Published: December 22, 2025 | Updated: December 22, 2025

Why this matters

Fact and dimension tables are the backbone of star schemas. As a BI Analyst, you will define what gets measured, how it can be sliced, and how reports aggregate correctly. Real tasks include:

  • Designing a Sales star schema for dashboards
  • Choosing the right grain for events vs. monthly snapshots
  • Handling changing product or customer attributes over time
  • Writing queries that join facts to dimensions without duplicates

Quick note about progress: the quick test on this page is available to everyone; if you are logged in, your progress will be saved automatically.

Concept explained simply

• Fact table: a collection of business events or numeric observations. Examples: sales line items, pageviews, monthly account balances.
• Dimension table: descriptive context used to slice and filter facts. Examples: date, customer, product, store, campaign.

Mental model

Think of a fact as a row in a diary of events. Dimensions are labeled sticky notes you attach to each event so you can later group by them. The most important decision is the grain: the exact definition of one row in the fact table.

Grain — the irreversible decision

State the grain in one precise sentence before adding columns. Examples:

  • Sales fact: one row per order line item
  • Pageview fact: one row per pageview event
  • Subscription snapshot: one row per subscription per month

Change the grain later and all downstream metrics change. Document it in the table comment.

Keys that make joins safe
  • Use surrogate keys (integer IDs) in dimensions to decouple from source IDs and to manage history (SCD).
  • Fact tables store foreign keys to dimensions (product_key, customer_key, date_key, etc.).
  • Degenerate dimensions: identifiers like order_number can live in the fact when there is no separate dimension.
Types of facts and measures
  • Transactional fact: one row per event. Measures are typically fully additive.
  • Periodic snapshot: regular intervals (e.g., daily inventory, monthly MRR). Some measures are semi-additive (add across stores but not across time).
  • Accumulating snapshot: tracks a workflow with milestones (e.g., order from placed to delivered).
  • Measures: additive (quantity), semi-additive (balance), non-additive (ratios like conversion rate).
Slowly Changing Dimensions (SCD)
  • Type 1: overwrite value; no history. Good for correcting errors.
  • Type 2: add a new row per change; keeps history with effective dates and current flag.
  • Type 0: never change (e.g., date dimension).

Designing facts and dimensions (step-by-step)

  1. Pick the business process (sales, billing, support tickets, web events).
  2. Declare the grain in one sentence.
  3. Choose measures consistent with the grain (e.g., sales_amount, quantity).
  4. Identify dimensions to analyze by (date, product, customer, store, campaign).
  5. Assign keys: surrogate keys in dimensions; FKs in facts. Include degenerate identifiers when needed.
  6. Plan SCD strategy per dimension attribute.
  7. Handle unknowns: provide default dimension rows like “Unknown Product”.
  8. Conform dimensions so multiple fact tables can share them (same product_key across sales and returns).

Worked examples

Example 1 — Retail sales (transactional fact)

Grain: one row per order line item.

Fact_Sales columns: date_key, product_key, customer_key, store_key, promo_key, order_number (degenerate), line_number, quantity, sales_amount, discount_amount, cost_amount.

Dimensions: Dim_Date, Dim_Product (SCD2 for category/brand changes), Dim_Customer (SCD2 for address tier), Dim_Store, Dim_Promo.

-- Query: sales by category for last month
SELECT d.month_name,
       p.category,
       SUM(f.sales_amount) AS revenue,
       SUM(f.quantity)     AS units
FROM Fact_Sales f
JOIN Dim_Date d     ON f.date_key = d.date_key
JOIN Dim_Product p  ON f.product_key = p.product_key
WHERE d.is_last_month = 1
GROUP BY d.month_name, p.category;
Example 2 — Subscription MRR (periodic snapshot)

Grain: one row per subscription per month.

Fact_Subscription_Snapshot: snapshot_month_key, customer_key, plan_key, mrr, is_active_flag, seats, churn_risk_score.

Notes: mrr is semi-additive (add across customers, not across months without care). Dim_Customer and Dim_Plan are SCD2 so historical MRR aligns to historical attributes.

Example 3 — Web analytics pageviews (transactional fact)

Grain: one row per pageview.

Fact_Pageview: date_key, time_key, user_key, session_key, page_key, referrer_key, device_key, time_on_page_seconds, scroll_depth_pct.

Notes: Bounce rate is a derived ratio, not stored directly in the fact. Store raw counts so ratios can be recomputed correctly.

Pre-flight checklists

Use these before building or publishing.

Grain and measures

  • I can state the fact table grain in one sentence.
  • All measures make sense at the chosen grain.
  • No derived ratios are stored as measures unless justified.

Keys and joins

  • Every fact FK points to a conformed dimension key.
  • Degenerate identifiers (like order_number) are in the fact when no dimension is needed.
  • Unknown dimension rows exist to prevent null FKs.

History strategy

  • Each dimension has a defined SCD policy (Type 1/2/0).
  • Surrogate keys support SCD2 where required.
  • Measures that are semi-additive are clearly labeled in docs.

Exercises

Do these after reading the examples. Suggested time: 20–30 minutes.

Exercise 1 — Define grain and columns

Scenario: You need a returns fact for retail. Analysts want to see returned amount by reason, by product, by month. A single order can have multiple returned items and multiple reasons.

  1. Write the grain sentence.
  2. List the fact table columns: FKs, measures, degenerate IDs.
  3. List the required dimensions and SCD needs.

Exercise 2 — Choose SCD types

For each attribute, pick SCD Type 1 or Type 2 and state why:

  • Product brand
  • Customer email
  • Store manager name
  • Plan monthly price

Self-check (quick)

  • My grain statement is unambiguous and testable.
  • Measures are additive/semi-additive classifications are noted.
  • Each dimension’s SCD choice is justified.
Need a hint?
  • If multiple returned items per order can exist, your grain likely includes the line item.
  • If analysts need historical context (e.g., price at the time), lean SCD2.

Common mistakes and how to self-check

  • Vague grain: If you cannot say the grain in one sentence, stop and define it. Test by asking: would two events ever share the same row?
  • Ratios as measures: Storing conversion_rate invites double-counting. Store counts and compute ratios in queries.
  • Missing unknown dimensions: Null FKs break joins. Add default rows like product_key = 0 for Unknown.
  • Overusing Type 2: Track history only when business logic requires it; Type 2 increases table size and join complexity.
  • Mixing grains in one fact: Do not blend daily snapshots with transactions in the same table.
Self-check query idea
-- Look for orphaned keys
SELECT COUNT(*) AS orphan_rows
FROM Fact_Sales f
LEFT JOIN Dim_Product p ON f.product_key = p.product_key
WHERE p.product_key IS NULL;

Practical projects

  1. Design a Sales and Returns star schema. Deliver: grain statements, column lists, SCD choices, and 3 example queries.
  2. Build a Monthly Subscription Snapshot. Deliver: snapshot job spec, schema DDL, and a report showing churn and expansion by cohort.
  3. Create a Web Events star. Deliver: event dictionary, Fact_Pageview schema, Dim_User privacy-safe attributes, and a dashboard mockup.

Quick test

Take the quick test to check understanding. Everyone can take it. If you are logged in, your answers and progress will be saved.

Mini challenge

Design a Promotions effectiveness model:

  • Pick a grain for Fact_Promo_Performance.
  • List 3 measures and 4 dimensions.
  • Decide which dimension attributes need SCD2.
Example approach (hidden)

Grain: one row per store per product per day with promo status. Measures: sales_lift_amount, units, promo_days_active. Dimensions: Date, Store, Product, Promo. Product category uses SCD2.

Who this is for

  • BI Analysts and Analytics Engineers building dashboards and models
  • Data Analysts translating business metrics into warehouse schemas

Prerequisites

  • Comfort with SQL joins and aggregations
  • Basic understanding of data types and primary/foreign keys

Learning path

  1. Grain and measures fundamentals
  2. Fact types: transactional, snapshot, accumulating
  3. Dimensions and SCD strategies
  4. Conformed dimensions across multiple fact tables

Next steps

  • Apply these steps to a real dataset and document your grain statements
  • Review with stakeholders to confirm dimensions match how they analyze the business
  • Take the quick test and revisit weak areas identified

Practice Exercises

2 exercises to complete

Instructions

Scenario: You need a returns fact for retail. Analysts want to see returned amount by reason, by product, by month. A single order can have multiple returned items and multiple reasons.

  1. Write the grain sentence.
  2. List the fact table columns: FKs, measures, degenerate IDs.
  3. List the required dimensions and SCD needs.
Expected Output
A clear grain sentence, column list grouped by FKs/measures/degenerate IDs, and a dimension list with SCD choices.

Fact And Dimension Tables — Quick Test

Test your knowledge with 8 questions. Pass with 70% or higher.

8 questions70% to pass

Have questions about Fact And Dimension Tables?

AI Assistant

Ask questions about this tool