How to learn Designing Semantic Models For Self Serve for Data Modeling for BI in BI Developer for free

Why this matters

Self-serve BI rises or falls on the quality of your semantic model. A clean and consistent business layer lets non-technical users answer questions without writing SQL or breaking logic. As a BI Developer, your real tasks include:

Defining standard metrics (e.g., Revenue, AOV, Retention) once and making them reusable everywhere.
Modeling dimensions and hierarchies so slicing and drilling feels natural and safe.
Controlling data relationships and grain to avoid double counting.
Building in guardrails: data types, formats, default filters, and row-level security when needed.

Concept explained simply

A semantic model is a business-friendly layer on top of your warehouse. It exposes dimensions, measures, and relationships with names people understand. Think of it as the “menu” of facts and slices users can confidently combine.

Mental model

Picture a grocery store:

Facts = items sold (transactions at a specific grain)
Dimensions = aisles and labels (date, customer, product, region)
Measures = totals and ratios at the checkout (Revenue, Orders, Conversion Rate)

Your job: decide the right level of detail (grain), connect aisles to items (relationships), and predefine the totals/ratios so anyone can shop for insights without getting lost.

Checklist: What a good semantic model includes

Clear business names and descriptions for all fields
One-hinge grain per fact (no mixed granularity)
Conformed dimensions shared across facts (e.g., Date, Customer)
Curated measures with consistent logic and formatting
Well-defined relationships (cardinality, direction) and role-playing dates if needed
Time intelligence patterns (e.g., MoM, YoY) ready to use
Security rules (row-level) where required
Performance aids (aggregations, summaries) for large data

Design steps (fast path)

Collect questions: Write down top 10 questions users ask. Group by subject area.
Choose fact grain: Decide the most atomic, stable event (e.g., order line, daily campaign).
Select conformed dimensions: Date, Customer, Product, Geography, Channel.
Define measures: Sum/Count first, then ratios (AOV, Conversion Rate) with clear denominators.
Model relationships: Star schema preferred. Avoid many-to-many unless justified and controlled.
Name and document: Business names, tooltips/descriptions, number formats, units.
Add guardrails: Default filters, hidden helper columns, role-playing dates, RLS if needed.
Test with real user slices: Try typical filter combos and edge cases (empty slices, late-arriving data).

Worked examples

Example 1: Ecommerce Sales

Goal: Self-serve questions like “Revenue by week by channel” and “AOV by country.”

Fact: fact_sales at order_line grain (order_id, line_id, product_id, customer_id, quantity, unit_price, discounts, order_timestamp)
Dimensions: dim_date, dim_customer, dim_product, dim_geography, dim_channel
Measures:
- Total Sales = SUM(quantity * unit_price) - SUM(discounts)
- Orders = DISTINCTCOUNT(order_id)
- AOV = [Total Sales] / [Orders]
- Units = SUM(quantity)
Relationships: One-to-many from each dimension to fact. Role-playing dates: order_date, ship_date
Guardrails: Format currency with symbol; hide unit_price raw if exposing Total Sales; default filter to last 12 months

Why this works

Order-line grain avoids double counting when slicing by product. AOV depends on distinct orders, not lines, so it stays consistent across slices. Role-playing dates support both order and fulfillment analyses.

Example 2: Marketing Performance

Goal: “What’s CAC by channel?” and “Which campaigns drive highest CTR over time?”

Fact: fact_campaign_performance at daily-campaign grain (date, campaign_id, channel, impressions, clicks, cost, signups)
Dimensions: dim_date, dim_campaign, dim_channel
Measures:
- CTR = DIVIDE(SUM(clicks), SUM(impressions))
- CPC = DIVIDE(SUM(cost), SUM(clicks))
- CAC = DIVIDE(SUM(cost), SUM(signups))
Guardrails: Use DIVIDE-like safe division to avoid divide-by-zero; format CTR as percentage with 2 decimals; ensure channel is a dimension, not free text

Why this works

Daily-campaign grain allows trending and avoids mixing daily and weekly data. Ratios are consistent since both numerator and denominator aggregate at the same grain.

Example 3: Product Usage (Events)

Goal: “DAU/WAU/MAU by plan” and “Feature adoption by cohort.”

Fact: fact_events at event grain (event_id, user_id, event_name, event_timestamp, product_id)
Dimensions: dim_date, dim_user (with plan), dim_product, dim_feature
Measures:
- DAU = DISTINCTCOUNT(user_id filtered to last 1 day)
- WAU = DISTINCTCOUNT(user_id in last 7 days)
- MAU = DISTINCTCOUNT(user_id in last 30 days)
- Feature Adoption = DISTINCTCOUNT(users who triggered feature) / DISTINCTCOUNT(active users)
Guardrails: Use a calendar table to define rolling windows; ensure event timestamps map to date keys; hide raw event_id from end users

Key patterns and guardrails

Star schema: One fact, many dimensions. Prefer this for self-serve.
Conformed dimensions: Reuse Date/Customer/Product across facts to enable cross-subject analysis.
Single source of truth for measures: Define once, reference everywhere; avoid calculated columns for aggregations if your tool supports measures.
Time intelligence: Predefine YoY, MoM, rolling windows with clear labels and consistent filters.
Role-playing dimensions: Multiple date roles (order, ship, signup) from the same Date table.
Security: Apply row-level security on dimensions (e.g., region) and test with role personas.
Performance: Summaries/aggregations for large facts, hide high-cardinality columns from visuals unless needed, enable incremental refresh where available.

Who this is for

BI Developers, Analytics Engineers, and Data Analysts who need to make reliable, reusable metrics and user-friendly data models for business stakeholders.

Prerequisites

Comfort with star schemas (facts/dimensions) and data types
Basic SQL
Familiarity with at least one BI tool’s modeling concepts (measures, relationships)

Learning path

Define business questions and metrics catalog.
Design star schemas with clear grain and conformed dimensions.
Create measures and time intelligence logic.
Add relationships, role-playing dimensions, and formats.
Implement security and performance features.
User test with typical slices; iterate based on feedback.

Exercises

Do these in your preferred BI tool or on paper. Aim for clarity and consistency.

Exercise 1: Model a Subscription Business

Design a semantic model to answer: “MRR by plan by month,” “Churn rate by cohort,” and “ARPA by region.”

Identify facts, dimensions, and grain.
List measures and their exact formulas.
Describe relationships and any role-playing dates.

When done, compare with the solution in the Exercises panel below.

Exercise 2: Fix a Risky Model

You inherit a model with a single giant table combining orders, customers, and campaigns. Users report double-counted revenue when slicing by product and campaign.

Rewrite the model as facts/dimensions.
Propose at least five concrete improvements that prevent double counting and improve usability.

Open: Solutions and hints for Exercises 1–2

Exercise 1 solution outline

Facts: fact_subscriptions (subscription_id, customer_id, plan_id, start_date, end_date, status), fact_invoices (invoice_id, customer_id, amount, invoice_date), fact_mrr_snapshots at month-end grain
Dimensions: dim_date, dim_customer (region), dim_plan (price, tier)
Measures:
- MRR = SUM(mrr_amount in fact_mrr_snapshots)
- Churned Customers = COUNT of customers with status = churned in month
- Churn Rate = Churned Customers / Customers at start of month
- ARPA = SUM(invoice amount) / DISTINCTCOUNT(active paying accounts)
Relationships: One-to-many from dimensions to facts; role-playing dates for start_date and end_date from dim_date
Guardrails: Use snapshot fact for MRR to avoid re-computing from events; format currency; use safe division

Exercise 2 solution outline

Split into fact_sales (order_line grain), fact_campaign_performance (daily), and dimensions: dim_customer, dim_product, dim_channel, dim_date
Measures defined on facts (Total Sales, Orders, AOV)
Conformed dim_date connects both facts for aligned time analysis
Improvements:
- Eliminate many-to-many by using proper fact-dim joins
- Protect totals with distinct counts and line-grain sales
- Make channel a dimension with controlled values
- Add role-playing dates (order vs ship)
- Hide raw columns that confuse users; expose curated measures
- Set default filter to relevant time window

Self-check checklist

Every fact has a single, documented grain.
Dimensions are conformed and reused; no duplicated Date tables without purpose.
Measures have clear denominators and safe division.
Names are business-friendly; technical helpers are hidden.
Time intelligence is correct when slicing by any dimension.
No double counting when combining facts; if needed, use separate visuals or bridge tables intentionally.

Common mistakes and how to self-check

Mixed granularity in a single fact: Fix by choosing the most atomic stable grain; move aggregates to separate summary tables.
Ratios on calculated columns: Prefer measures so filters recalculate correctly.
Ambiguous relationships: Avoid bi-directional or many-to-many unless you fully understand the impact; document and test edge cases.
Inconsistent definitions: Maintain a metric catalog; one definition per metric.
No role-playing dates: Use distinct relationships for order/ship/signup dates via the same Date table.
Exposing everything: Hide helper columns; keep the user-facing set minimal and labeled.

Practical projects

Sales Self-Serve Pack: Build a star schema and a semantic model with 8–10 measures (Revenue, Orders, AOV, YoY Revenue, Units). Include role-playing order/ship dates and a 12-month default filter.
Marketing Efficiency Model: Daily campaign fact with CTR, CPC, CAC. Add plan-level RLS and verify with two persona test accounts (Manager vs Analyst).
Product Adoption Dashboard Model: Event-grain fact, DAU/WAU/MAU measures, feature adoption ratio. Add formatted measures and hide raw IDs.

Mini challenge

In one paragraph, define a new measure “Contribution Margin” for ecommerce. Include formula components, where they live (fact/dimension), and formatting. Then list 3 slices a business user should try first to validate it.

Quick test

Short quiz to check your understanding. Note: The quick test is available to everyone; if you log in, your progress will be saved.

Next steps

Refine your metric catalog and add time intelligence variants.
Introduce row-level security for one dimension and test with sample roles.
Create a short user guide describing available measures and recommended slicing paths.

Menu

Designing Semantic Models For Self Serve

Table of Contents