Why this matters
BI dashboards live or die by how well dimensions and hierarchies are modeled. A clean Product, Customer, and Date dimension with drill paths like Category → Subcategory → Product lets stakeholders slice facts quickly and trust results.
- Make sales drill-downs fast and consistent across reports.
- Track history when attributes change (e.g., product rebranding).
- Standardize calendars (fiscal vs calendar) and geography levels.
- Enable role-playing dates (order, ship, delivery) without duplicating logic.
Concept explained simply
A dimension table describes the who/what/where/when of your data. Facts hold numbers; dimensions hold words. Hierarchies are the natural drill paths within a dimension (like Country → State → City).
Mental model
Think of dimensions as dictionaries, hierarchies as nested folders
Facts are numeric entries that reference dictionary entries (dimensions) via surrogate keys. A hierarchy is how folders nest inside each other so you can drill from top-level to detail.
Key terms
- Star schema: facts in the center, dimensions around. Preferred for BI simplicity and speed.
- Surrogate key (SK): integer primary key for dimensions. Facts reference SK, not business keys.
- Slowly Changing Dimension (SCD): how you handle attribute changes over time.
- Type 0: do nothing (fixed).
- Type 1: overwrite value (no history).
- Type 2: add a new row with new SK and effective dates (keep history).
- Conformed dimension: one shared dimension used by multiple fact tables (e.g., shared Date dimension).
- Role-playing dimension: same dimension used for different roles (Order Date, Ship Date).
- Junk dimension: bundles low-cardinality flags into one small dimension.
- Degenerate dimension: dimension attribute stored in the fact (e.g., Order Number) with no separate dimension table.
Design steps you can follow
- List business questions: what drill paths and filters do users need?
- Enumerate attributes: names, codes, descriptions, statuses, dates.
- Choose keys: add a surrogate key; keep the business key as a natural key column.
- Decide SCD type per attribute: which keep history (Type 2) vs overwrite (Type 1).
- Model hierarchies: explicit columns for each level, or parent-child with path/level fields.
- Add housekeeping: effective_start_date, effective_end_date, is_current, record_source, load_ts.
- Test with sample facts: can you filter, drill, and time travel correctly?
- Document: attribute definitions, SCD rules, default/unknown member handling.
Worked examples
1) Product dimension with Category hierarchy and SCD2
Attributes: product_code (business key), product_name, brand, category, subcategory, size, color. Use surrogate key: product_sk. Track name and brand historically (Type 2). Category levels are usually Type 1 unless reorg history is needed.
Snapshot of rows when a product is rebranded
product_sk | product_code | product_name | brand | category | subcategory | eff_start | eff_end | is_current
---------: | -----------: | -------------- | -------- | -------- | ---------- | --------- | --------- | ---------
1001 | P-00045 | Sport Tee | AthletiQ | Apparel | T-Shirts | 2023-01-01| 2023-08-14| 0
1109 | P-00045 | Sport Tee Plus | FitMax | Apparel | T-Shirts | 2023-08-15| 9999-12-31| 1
2) Date dimension with role-playing
Single Date dimension supports Order Date, Ship Date, Delivery Date. Include calendar and fiscal attributes.
- date_sk (YYYYMMDD integer), full_date, day_of_week, week_of_year, month, quarter, year
- fiscal_week, fiscal_period, fiscal_quarter, fiscal_year (e.g., fiscal year starting in April)
- is_weekend, is_holiday
How it plays different roles
In the fact table, include order_date_sk, ship_date_sk, delivery_date_sk as foreign keys to the same Date dimension.
3) Organization hierarchy (parent-child)
For departments with uneven depth, use a parent-child pattern with helper columns for BI performance.
- dept_sk, dept_code (business key), dept_name, parent_dept_sk
- level_num, path_text (like 'HQ/EMEA/UK/London'), root_dept_sk
Why add level and path?
Many BI tools need explicit levels to aggregate correctly and to support ragged hierarchies (branches with different depths).
Practical modeling patterns
- Unknown member: add a default row with surrogate key like -1 and description 'Unknown'. Map early-arriving facts here.
- Bridge table for many-to-many: when a fact relates to multiple dimension members (e.g., an order impacted by multiple promotions), use a fact-to-bridge-to-dimension design with allocation weights if needed.
- Snowflake carefully: keep star schemas denormalized for BI speed; snowflake only if a sub-dimension is reused broadly and does not harm usability.
Common mistakes and how to self-check
- Using business keys in facts: switch to surrogate keys to avoid key churn and to support SCD2.
- Forgetting effective dates on Type 2: add eff_start, eff_end, is_current.
- Mixing hierarchies across different concepts: keep clean, single-concept hierarchies per dimension.
- Not planning unknown/default records: define them explicitly and document mapping rules.
- Over-snowflaking: if users struggle to drill, flatten back to star.
Self-check mini queries (conceptual)
- Orphans: count fact rows where foreign key not found in dimension (excluding unknown key).
- Time travel: pick a product_code and verify only one is_current row and correct historical rows by date.
- Hierarchy: confirm each child level rolls up to exactly one parent (unless bridge is used by design).
Pre-publish checklist
- [ ] Each dimension has a surrogate key and a business key column.
- [ ] SCD rules are set per attribute and implemented consistently.
- [ ] Unknown/default members exist and are used for early-arriving facts.
- [ ] Hierarchies are testable in BI (all levels present or properly ragged with helpers).
- [ ] Conformed dimensions are reused across facts with identical meanings.
Exercises
Do these practical tasks. Then compare with the solutions.
- Exercise 1: Design a Product dimension with a Category hierarchy and SCD2 for name and brand. Provide column list, keys, SCD types, and a two-row example showing a historical change.
- Exercise 2: Create Date dimension rows for a one-week period where fiscal year starts in April. Provide columns to support role-playing and show 3 example records.
Quality checklist for your answers
- [ ] Surrogate key present and used as primary key.
- [ ] Business key kept and uniquely identifies the natural entity.
- [ ] SCD types justified per attribute.
- [ ] Hierarchy levels are explicit, or parent-child with helper columns.
- [ ] Unknown/default member included.
Mini challenge
You are modeling Customer for a subscription business. Requirements: drill from Region → Country → City → Customer; keep history of customer_status and plan_name; overwrite email changes; support analysis by signup_date and cancellation_date.
- List the attributes and mark each as Type 1 or Type 2.
- Propose the unknown member row for Customer and for Date.
- Explain how you will support both signup_date and cancellation_date using a single Date dimension.
Hints
- Role-playing Date for both signup and cancellation dates.
- Type 2 on plan_name and customer_status; Type 1 on email.
- Geography can be a simple hierarchy in Customer or a separate Geography dimension if reused widely.
Who this is for
- BI Developers and Analytics Engineers designing star schemas for dashboards.
- Data Analysts who need reliable, drillable dimensions.
- Data Modelers standardizing conformed dimensions across teams.
Prerequisites
- Basic SQL (joins, primary/foreign keys).
- Understanding of star schema and fact vs dimension tables.
- Familiarity with your BI tool's hierarchy/drill features.
Learning path
- Refresh star schema fundamentals and surrogate keys.
- Practice SCD types with small sample tables.
- Model one balanced hierarchy (e.g., Product) and one ragged hierarchy (e.g., Org).
- Implement a conformed Date dimension and use it in two facts as different roles.
- Add unknown members and test early-arriving facts.
Practical projects
- Retail mart: Product, Customer, Date, Store, Promotion dimensions; Sales fact. Drill Category → Subcategory → Product.
- Subscription mart: Customer, Plan, Date; MRR/Churn fact. Track plan changes via Type 2.
- HR mart: Employee, Department (parent-child); Headcount fact. Ragged org hierarchy with level and path helpers.
Next steps
- Review your current BI model and mark each attribute with its SCD type.
- Add unknown/default members where missing and update ETL to use them.
- Refactor one snowflaked hierarchy back to a star and measure query speed.
Quick Test
Take the quick test below to check understanding. Everyone can take it for free; if you are logged in, your progress will be saved.