luvv to helpDiscover the Best Free Online Tools
Topic 5 of 8

Slowly Changing Dimensions Basics

Learn Slowly Changing Dimensions Basics for free with explanations, exercises, and a quick test (for BI Analyst).

Published: December 22, 2025 | Updated: December 22, 2025

Why this matters

Business data changes: customers move, products get rebranded, employees change departments. Slowly Changing Dimensions (SCD) let you choose whether to keep history or overwrite attributes in your dimension tables. As a BI Analyst, this directly affects trend accuracy, retention analyses, regulatory reports, and stakeholder trust.

  • Marketing: track how a customer’s region changed over time for cohort analyses.
  • Finance: report sales by the product category that was valid at the time of sale.
  • Operations: measure performance before/after organizational changes.

Who this is for

  • BI Analysts and Analytics Engineers building star schemas.
  • Data Analysts who need trustworthy time-aware reporting.
  • Newcomers to dimensional modeling who want a practical foundation.

Prerequisites

  • Basic SQL (JOINs, CASE, aggregates).
  • Understanding of star schemas (facts and dimensions).
  • Familiarity with unique identifiers and data quality concepts.

Concept explained simply

An SCD is a rule for handling changes in dimension attributes over time. The core decision: Should we overwrite the value, or keep history? Different SCD types encode different answers.

Mental model

Think of a dimension record as a card in a file box. When a change happens, you can either edit the card (overwrite) or add a new card behind it (new version) and date-stamp it. Facts then point to the right card based on the transaction date.

The SCD types (practical guide)

Type 0 — Retain original (no changes)

Use when an attribute should never change historically (e.g., date of birth).

  • Behavior: Ignore incoming changes.
  • Pros: Simple, stable.
  • Cons: Risk of stale data if source is wrong.
Type 1 — Overwrite

Use for corrections or non-analytical attributes (e.g., email typo fix).

  • Behavior: Update existing row; no history.
  • Pros: Simple, storage-efficient.
  • Cons: Past reports reflect new value; can break as-of analysis.
Type 2 — Add new row (track full history)

Use for attributes you must analyze historically (e.g., region, customer segment, product category).

  • Behavior: Insert a new dimension row with new surrogate key; date ranges mark validity.
  • Typical fields: surrogate_key, natural_key, effective_start_date, effective_end_date, is_current, version.
  • Pros: Accurate history; time-travel queries.
  • Cons: More storage and complexity; careful joins required.
Type 3 — Limited history with previous value

Use when you only need the immediate prior value (e.g., Previous Region for simple before/after comparisons).

  • Behavior: Add columns like previous_region, change_date; keep one row.
  • Pros: Simple before/after reporting.
  • Cons: Only one step of history; loses older changes.
Type 4 — History in a mini-dimension

Use for rapidly changing attributes (e.g., customer preferences) stored in a separate mini-dimension linked from the big dimension or fact.

  • Pros: Keeps main dimension slim; flexible changes.
  • Cons: More joins; modeling effort.
Type 6 — Hybrid (1+2+3)

Use when you want full history (Type 2) plus convenience overwrites (Type 1) and a previous value (Type 3) in the current row.

  • Pros: Powerful, supports many queries easily.
  • Cons: Most complex to implement and maintain.

Modeling patterns and key fields

  • Surrogate key (e.g., customer_key): internal unique id per version.
  • Natural/business key (e.g., customer_id): stable real-world id.
  • effective_start_date / effective_end_date: validity window.
  • is_current flag: 1 for current row, 0 for historical.
  • version: 1, 2, 3… per natural key.
  • Audit columns: created_at, updated_at, source_system.

Worked examples

Example 1: Customer address change

Scenario: Customer 42 moves from "NY" to "CA" on 2023-04-10.

Type 1 outcome
customer_key  customer_id  region  is_current  start    end      version
101           42           CA      1          null     null     1
-- NY is lost; all past reports show CA
Type 2 outcome
customer_key  customer_id  region  is_current  start        end          version
101           42           NY      0          2021-09-01   2023-04-09   1
202           42           CA      1          2023-04-10   9999-12-31   2
-- Facts dated before 2023-04-10 join to NY; after join to CA

Example 2: Product category rename

Product A moves from "Snacks" to "Healthy Snacks" on 2022-11-01.

  • If analysis must reflect category at sale time: Type 2.
  • If it’s a pure naming correction and historical recoding is desired: Type 1.
Type 3 alternative
product_key product_id category          previous_category  change_date
5001        A          Healthy Snacks    Snacks            2022-11-01
-- Supports before/after comparisons only

Example 3: Employee department change

Employee 77 moves from Support to Sales on 2024-02-15.

  • Performance over time by department: Type 2.
  • Only need current department: Type 1.
  • Org churn high and detailed history needed: consider Type 2 or a mini-dimension (Type 4) if many fast-changing attributes.

Implementation steps (safe defaults)

  1. Define attribute policies: list each attribute and assign SCD type.
  2. Add fields: surrogate key, natural key, start/end dates, is_current, version.
  3. Build change detection: compare incoming record to current dimension row by natural key and tracked attributes.
  4. Apply actions:
    • Type 1: UPDATE the current row.
    • Type 2: UPDATE old row’s end_date and is_current=0; INSERT a new row with start_date today, is_current=1, version+1.
  5. Audit and dedupe: guard against out-of-order events; handle multiple changes per day if needed.
  6. Test with edge cases: same-day multiple updates, null-to-value transitions, late-arriving facts.

SQL pattern cheat-sheet (conceptual)

Detect changes
-- Compare tracked attributes
WHERE COALESCE(src.region,'') <> COALESCE(dim.region,'')
Type 1 update
UPDATE dim_customer d
SET region = s.region, updated_at = CURRENT_DATE
FROM staging_customer s
WHERE d.customer_id = s.customer_id
  AND COALESCE(d.region,'') <> COALESCE(s.region,'');
Type 2 upsert
-- Close current row
UPDATE dim_customer d
SET effective_end_date = CURRENT_DATE - INTERVAL '1 day', is_current = 0
FROM staging_customer s
WHERE d.customer_id = s.customer_id
  AND d.is_current = 1
  AND (COALESCE(d.region,'') <> COALESCE(s.region,''));

-- Insert new version
INSERT INTO dim_customer (
  customer_key, customer_id, region,
  effective_start_date, effective_end_date, is_current, version
)
SELECT nextval('seq_customer_key'), s.customer_id, s.region,
       CURRENT_DATE, DATE '9999-12-31', 1,
       COALESCE(d.version,0) + 1
FROM staging_customer s
LEFT JOIN dim_customer d
  ON d.customer_id = s.customer_id AND d.is_current = 1
WHERE COALESCE(d.region,'') <> COALESCE(s.region,'') OR d.customer_id IS NULL;

Choosing the right type (checklist)

  • Will past reports need the value that was true at the time? Choose Type 2.
  • Is it a correction that should fix history? Choose Type 1.
  • Only need one prior value for comparison? Consider Type 3.
  • Do attributes change very frequently? Consider Type 4 mini-dimension.
  • Need both convenience and full history? Consider Type 6.

Exercises

Do Exercise 1 (below) and use this checklist to self-validate:

  • You identified which attributes are Type 1 vs Type 2.
  • You produced versioned rows with correct start/end dates.
  • You flagged current rows correctly and kept surrogate keys stable per version.
  • You verified fact-to-dimension joins return expected historical values.

Common mistakes

  • Missing date bounds: Forgetting to close old rows (end_date) leads to two current rows. Self-check: Ensure only one is_current=1 per natural key.
  • Using natural key as primary key: Breaks Type 2 history. Self-check: Confirm separate surrogate key exists.
  • Comparing all columns: Causes false changes from audit fields. Self-check: Compare only tracked attributes.
  • Not handling nulls: Null vs empty string looks like a change. Self-check: Use COALESCE in comparisons.
  • Late-arriving facts: Facts join to wrong version. Self-check: Use transaction date in joins, not load date.

Practical projects

  • Customer dimension v1: Implement Type 2 for region and loyalty_tier; Type 1 for email. Acceptance: Queries can show sales by historical region and current email.
  • Product dimension with rename: Track historical category (Type 2) and maintain a previous_category column (Type 3). Acceptance: Before/after reports show both views.
  • Employee dimension with mini-dimension: Move fast-changing contact preferences to a mini-dimension (Type 4). Acceptance: Main dimension remains slim; joins still support history.

Learning path

  • Before: Star schema basics, surrogate vs natural keys, basic ELT.
  • Now: SCD Types 0/1/2/3/4/6 and when to use them.
  • Next: Late-arriving dimensions, fact versioning, snapshot facts, and slowly changing hierarchies.

Next steps

  • Pick one real dimension and classify attributes by SCD type.
  • Implement a pilot Type 2 pipeline for 1–2 attributes.
  • Validate with sample facts and reconcile with stakeholders.

Mini challenge

Retail scenario: Attributes for Customer — email, region, loyalty_tier, marketing_opt_in.

  • Decide SCD type for each attribute and justify.
  • List the fields you would add to the dimension for Type 2.
  • Describe how a fact on 2024-03-05 should join for a region change that happened 2024-03-01.

When you’re ready, take the quick test at the end of this page. The test is available to everyone; sign in if you want your progress saved.

Practice Exercises

1 exercises to complete

Instructions

You have an existing dim_customer and a staging_customer daily snapshot. Track historical changes to region as Type 2. Treat email as Type 1. Use today as the load date.

  1. Create/confirm fields in dim_customer: customer_key (surrogate), customer_id (natural), email, region, effective_start_date, effective_end_date (default 9999-12-31), is_current (1/0), version.
  2. Detect changes where region differs between staging and the current dim row.
  3. For Type 1 (email): overwrite the current row.
  4. For Type 2 (region): close current row (end_date = today - 1, is_current=0) and insert a new row with start_date = today, is_current=1, version+1.
  5. Test with this data:
dim_customer (before)
customer_key customer_id email              region start        end          is_current version
101          C42        a@x.com            NY     2023-01-01  9999-12-31   1         1

staging_customer (today)
customer_id email              region
C42         a_new@x.com        CA

Expected: email overwrites; region triggers a new version row.

Expected Output
Two rows for C42: (1) historical NY row closed with end_date = today - 1 and is_current=0; (2) new CA row with start_date = today, is_current=1, version=2, and email = a_new@x.com.

Slowly Changing Dimensions Basics — Quick Test

Test your knowledge with 10 questions. Pass with 70% or higher.

10 questions70% to pass

Have questions about Slowly Changing Dimensions Basics?

AI Assistant

Ask questions about this tool