Menu

Dimensional Modeling For Analytics

Learn Dimensional Modeling For Analytics for Data Architect for free: roadmap, examples, subskills, and a skill exam.

Published: January 18, 2026 | Updated: January 18, 2026

Why this matters for Data Architects

Dimensional modeling turns raw data into reliable, fast analytics. As a Data Architect, you design fact and dimension tables so teams can answer business questions consistently—without fragile, one-off SQL. This skill unlocks: clear business definitions, governed star schemas, semantic-model-ready datasets, and trustworthy metrics across BI tools.

Who this is for

  • Data Architects planning analytics platforms and semantic layers.
  • Analytics Engineers and Data Engineers building BI-ready marts.
  • Tech Leads standardizing metrics across teams.

Prerequisites

  • Comfort with SQL joins, aggregations, and window functions.
  • Basic data modeling concepts (entities, keys, normalization).
  • Familiarity with the business domain you’re modeling (sales, product, finance, etc.).

Learning path

Step 1: Define the business processes and the grain of each fact. Write clear grain statements.
Step 2: Design dimensions and hierarchies. Add surrogate keys, attributes, and friendly names.
Step 3: Conform shared dimensions across teams to enable cross-domain analysis.
Step 4: Choose SCD patterns (0/1/2/3) and implement Type 2 where history matters.
Step 5: Decide between transaction, snapshot, and accumulating fact tables.
Step 6: Prepare for the semantic layer: metric definitions, data types, naming, and governance.

Worked examples

1) Define grain and measures for an Orders fact

Business process: e-commerce ordering. Grain: one row per order line (order_id + product_id).

-- Grain: 1 row per order line item
create table dim_product (
  product_sk bigint primary key,
  product_id text unique,
  category text,
  subcategory text
);

create table dim_customer (
  customer_sk bigint primary key,
  customer_id text unique,
  segment text,
  country text
);

create table fact_order_line (
  order_date date,
  order_id text,
  product_sk bigint,
  customer_sk bigint,
  quantity int,
  unit_price numeric(12,2),
  discount numeric(5,2),
  revenue numeric(14,2) generated always as (quantity * unit_price * (1 - discount)) stored,
  constraint pk_fact primary key (order_id, product_sk)
);

Notes: Dimensions use surrogate keys to decouple facts from source system keys. Measures are atomic and additive where possible.

2) Build a Date dimension with hierarchies
create table dim_date as
select
  d::date as date_value,
  extract(year from d)::int as year,
  to_char(d, 'Q')::int as quarter,
  extract(month from d)::int as month,
  to_char(d, 'Mon') as month_name,
  extract(isodow from d)::int as iso_dow,
  to_char(d, 'Dy') as dow_name,
  case when extract(isodow from d) in (6,7) then true else false end as is_weekend
from generate_series(date '2023-01-01', date '2030-12-31', interval '1 day') as g(d);

Hierarchy examples: Year > Quarter > Month > Date. Keep friendly names for BI.

3) SCD Type 2 for Customer dimension
-- dim_customer_scd2
create table dim_customer_scd2 (
  customer_sk bigint generated by default as identity primary key,
  customer_id text,
  segment text,
  country text,
  valid_from timestamp not null,
  valid_to timestamp,
  is_current boolean not null default true
);

-- Close out previous version and insert a new one when attributes change
with changed as (
  select s.* from staging_customer s
  join dim_customer_scd2 d
    on d.customer_id = s.customer_id and d.is_current
  where (s.segment, s.country) is distinct from (d.segment, d.country)
)
update dim_customer_scd2 d
set valid_to = now(), is_current = false
from changed c
where d.customer_id = c.customer_id and d.is_current;

insert into dim_customer_scd2 (customer_id, segment, country, valid_from, valid_to, is_current)
select c.customer_id, c.segment, c.country, now(), null, true
from changed c;

Querying current vs historical: filter is_current = true for latest, or join on date between valid_from and coalesce(valid_to, '9999-12-31').

4) Snapshot vs Accumulating facts
-- Daily inventory snapshot (periodic snapshot)
create table fact_inventory_snapshot as
select
  d.date_value as snapshot_date,
  p.product_sk,
  sum(coalesce(inv.on_hand_qty,0)) as on_hand_qty
from dim_date d
left join product_inventory inv on inv.snapshot_date = d.date_value
left join dim_product p on p.product_id = inv.product_id
group by 1,2;

-- Accumulating fact for order fulfillment (one row per order, milestones update)
create table fact_order_fulfillment (
  order_id text primary key,
  customer_sk bigint,
  order_date date,
  picked_date date,
  packed_date date,
  shipped_date date,
  delivered_date date,
  days_to_ship int generated always as (shipped_date - order_date) stored,
  days_to_deliver int generated always as (delivered_date - order_date) stored
);

Choose snapshots for periodic states (inventory, subscription MRR). Choose accumulating facts for processes with defined milestones.

5) Conformed dimensions across Sales and Support

Goal: Sales and Support both analyze by the same Customer definition.

-- A single conformed dim_customer used by multiple marts
create view sales.dim_customer as
select customer_sk, customer_id, segment, country from dim_customer_scd2 where is_current;

create view support.dim_customer as
select customer_sk, customer_id, segment, country from dim_customer_scd2 where is_current;

-- Both sales.fact_order_line and support.fact_tickets reference customer_sk from the same dimension.

Conformance means shared keys, consistent attribute names, and identical business definitions so cross-domain metrics align.

Drills and exercises

  • Write a grain statement for two processes: Orders and Website Sessions.
  • List additive, semi-additive, and non-additive measures in your domain.
  • Create a dim_date table with fiscal calendar attributes.
  • Implement SCD Type 2 for one dimension; validate historical queries.
  • Build one transaction fact and one periodic snapshot; compare query patterns.
  • Define three governed metrics (formula, grain, filters, time).

Common mistakes and debugging tips

Mistake: Unclear grain

Symptom: Duplicated counts or inconsistent revenue. Fix: Write a one-sentence grain statement for each fact. Verify keys that uniquely identify a row.

Mistake: Snowflaking dimensions prematurely

Symptom: Slow BI queries with many joins. Fix: Prefer denormalized dimensions for analytics unless size is prohibitive.

Mistake: Missing surrogate keys

Symptom: Source key reuse breaks joins. Fix: Use surrogate keys; keep natural keys as attributes.

Mistake: Wrong SCD type

Symptom: Lost history or inflated counts. Fix: Use Type 2 for attributes that change and affect analysis; Type 1 for minor corrections.

Debugging joins

Check row counts at each step. Validate distinct business keys. Look for many-to-many joins—resolve via bridge tables or proper grain alignment.

Mini project: Retail analytics mart

Build a BI-ready retail mart covering orders, customers, and products.

  1. Write grain statements for fact_order_line (order_id + product_id) and fact_inventory_snapshot (date + product).
  2. Create dim_date, dim_customer (SCD2), dim_product.
  3. Load sample data and implement surrogate keys.
  4. Build fact_order_line (transaction) and fact_inventory_snapshot (daily snapshot).
  5. Define three metrics: gross_revenue, avg_order_value, repeat_rate. Document filters and time grains.
  6. Test: a) daily revenue trend, b) revenue by segment and category, c) inventory on-hand vs sales.
Acceptance checklist
  • Facts have clear, documented grain.
  • Dimensions have surrogate keys and friendly names.
  • SCD2 implemented for customer; current and historical queries work.
  • Metrics return identical results across two different queries.

Practical projects

  • Subscription analytics mart: customers, plans, invoices, MRR daily snapshots.
  • Logistics mart: orders, shipments (accumulating fact), carriers, lanes.
  • Product analytics mart: sessions, events, users, feature adoption cohorts.

Subskills

  • Defining Grain And Facts: Choose transaction/snapshot/accumulating types and write precise grain statements with additive measures.
  • Building Dimensions And Hierarchies: Design denormalized dimensions with surrogate keys and intuitive hierarchies (date, product, geography).
  • Conformed Dimensions Across Teams: Standardize shared dimensions and keys to align Sales, Support, Finance, and Product analytics.
  • Slowly Changing Dimensions Patterns: Apply SCD 0/1/2/3 and query with valid_from/valid_to or is_current.
  • Snapshot And Accumulating Facts: Model state over time and milestone-driven processes for performance and clarity.
  • Metric Consistency By Design: Define metrics once; control grain, filters, and aggregation behavior to avoid double counting.
  • Semantic Model Readiness: Ensure types, naming, and relationships are BI/semantic-layer friendly.
  • Star Schema Governance: Establish conventions, ownership, and review to keep marts consistent and trustworthy.

Next steps

  • Apply these patterns to one real business process and validate with stakeholders.
  • Create a short design doc per star: grain, keys, dimensions, measures, and metric definitions.
  • Take the skill exam to check your understanding. Anyone can take it for free; sign in if you want your progress saved.

Have questions about Dimensional Modeling For Analytics?

AI Assistant

Ask questions about this tool