Skill Not Found

Why SQL matters for BI Developers

SQL is the core tool BI Developers use to transform raw data into trustworthy dashboards and reports. With SQL you can join facts and dimensions, calculate KPIs, build report-ready datasets, and publish views that BI tools can query quickly and consistently.

Translate business questions into accurate queries
Aggregate and shape data for dashboards and KPI tables
Publish stable views/tables for analysts and stakeholders
Diagnose data discrepancies and performance issues

Quick glossary

Fact table: transaction/behavior data (e.g., orders, events).
Dimension table: descriptive context (e.g., product, customer).
Grain: the level of detail (e.g., per order, per day, per product).
KPI: key performance indicator (e.g., revenue, conversion rate).

Who this is for

Aspiring and junior BI Developers who need practical SQL for dashboards
Analysts who want to level up from ad-hoc queries to report-ready datasets
Anyone building reliable views and KPI tables for BI tools

Prerequisites

Basic familiarity with databases and data types
Comfort reading SELECT, WHERE, GROUP BY
Access to any SQL engine (local or cloud) for practice

Learning path

Write report-ready SELECTs with clean column names, consistent grain, and clear filters
Join facts to dimensions correctly and avoid double counting
Aggregate into KPI tables (daily/weekly/monthly)
Use window functions for rankings, trends, and shares
Publish views for BI consumption and parameterize filters
Debug discrepancies and speed up slow queries

Roadmap with milestones

Milestone 1 — Query shape: SELECT only needed columns, add business-friendly aliases, and lock grain (e.g., per day per product).
Milestone 2 — Reliable joins: Use surrogate keys; LEFT JOIN dimensions; aggregate facts before joining multiple facts.
Milestone 3 — KPI tables: Build daily KPI tables with SUM, COUNT DISTINCT, averages; validate against source counts.
Milestone 4 — Windows: Rankings (ROW_NUMBER), rolling averages, period-over-period comparisons (LAG).
Milestone 5 — BI views: CREATE VIEW with stable schemas, documented columns, safe casts, and defaulted nulls.
Milestone 6 — Quality & performance: Reconcile numbers, use anti-joins to find gaps, push filters early, and minimize scanned data.

Worked examples

1) Report-ready daily revenue with clear grain

-- Grain: 1 row per day
SELECT
  CAST(order_date AS DATE) AS day,
  SUM(order_total) AS revenue,
  COUNT(DISTINCT order_id) AS orders
FROM fact_orders
WHERE order_status = 'completed'
GROUP BY CAST(order_date AS DATE)
ORDER BY day;

Why it works: clear grain (per day), business-friendly aliases, filtered status, safe aggregation.

2) Joining facts to dimensions

-- Keep all orders even if a product is missing in the dimension
SELECT
  o.order_id,
  o.order_date,
  o.order_total,
  p.product_id,
  p.category
FROM fact_orders AS o
LEFT JOIN dim_product AS p
  ON o.product_id = p.product_id
WHERE o.order_status = 'completed';

Tip: Use LEFT JOIN from fact to dimension to avoid losing fact rows when dimension data is incomplete.

3) KPI table: revenue, AOV, conversion rate

-- Grain: 1 row per day
WITH sessions AS (
  SELECT CAST(session_date AS DATE) AS day, COUNT(DISTINCT session_id) AS sessions
  FROM fact_sessions
  GROUP BY CAST(session_date AS DATE)
),
orders AS (
  SELECT CAST(order_date AS DATE) AS day,
         SUM(order_total) AS revenue,
         COUNT(DISTINCT order_id) AS orders,
         COUNT(DISTINCT customer_id) AS buyers
  FROM fact_orders
  WHERE order_status = 'completed'
  GROUP BY CAST(order_date AS DATE)
)
SELECT
  s.day,
  o.revenue,
  o.orders,
  s.sessions,
  CASE WHEN o.orders > 0 THEN o.revenue * 1.0 / o.orders ELSE 0 END AS aov,
  CASE WHEN s.sessions > 0 THEN o.orders * 1.0 / s.sessions ELSE 0 END AS conversion_rate
FROM sessions s
LEFT JOIN orders o USING(day)
ORDER BY s.day;

Note: Avoid joining two facts at detail level; aggregate each first by the shared grain (day).

4) Window functions: top 3 products per category (by monthly revenue)

WITH monthly AS (
  SELECT
    DATE_TRUNC('month', o.order_date) AS month,
    p.category,
    p.product_id,
    SUM(o.order_total) AS revenue
  FROM fact_orders o
  LEFT JOIN dim_product p ON o.product_id = p.product_id
  WHERE o.order_status = 'completed'
  GROUP BY 1,2,3
)
SELECT *
FROM (
  SELECT
    month,
    category,
    product_id,
    revenue,
    ROW_NUMBER() OVER (PARTITION BY month, category ORDER BY revenue DESC) AS rnk
  FROM monthly
) t
WHERE rnk <= 3
ORDER BY month, category, rnk;

Use ROW_NUMBER for deterministic top-N per partition.

5) BI-friendly view with parameter-ready columns

CREATE VIEW vw_daily_sales AS
SELECT
  CAST(order_date AS DATE) AS day,
  COALESCE(SUM(CASE WHEN order_status = 'completed' THEN order_total END), 0) AS revenue,
  COALESCE(COUNT(DISTINCT CASE WHEN order_status = 'completed' THEN order_id END), 0) AS orders
FROM fact_orders
GROUP BY CAST(order_date AS DATE);

Stable schema, clear names, and defaults (COALESCE) help BI tools and reduce null-related issues.

Extra: 7-day rolling average revenue

SELECT
  day,
  revenue,
  AVG(revenue) OVER (
    ORDER BY day
    ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
  ) AS revenue_7d_avg
FROM vw_daily_sales
ORDER BY day;

Drills and exercises

Create a daily orders view with revenue, orders, and unique buyers (COUNT DISTINCT)
Join orders to products and customers; confirm row counts before/after join
Build a monthly KPI table and validate against daily sums
Compute 7-day rolling conversion rate using window functions
Write a parameterized filter using placeholders like :start_date and :end_date
Find orphan facts: orders whose product_id has no match in dim_product
Refactor a slow query to push filters before GROUP BY and remove unneeded columns

Common mistakes and debugging tips

1) Double counting after joins

Symptom: KPIs inflate after joining dimensions with one-to-many relationships. Fix: aggregate the fact table to the target grain first, or join via a unique dimension key ensuring 1:1 at the grain you need.

2) Unclear grain

Symptom: Mixed daily and order-level columns in the same SELECT. Fix: decide the grain (e.g., per day), aggregate all detail columns first, and alias columns clearly.

3) Date/time mismatches

Symptom: Numbers differ by day vs dashboard. Fix: cast timestamps to DATE consistently; define timezone and use the same boundary (inclusive end dates: BETWEEN :start AND :end + 1 day minus 1 second pattern if needed in your engine).

4) LEFT JOIN turning into INNER JOIN

Symptom: Fewer rows after a LEFT JOIN. Fix: avoid WHERE filters on dimension columns; move them to the JOIN condition or use predicates that preserve nulls.

5) Slow queries

Tips: select only needed columns, filter early, aggregate early, avoid wildcards on large tables, and precompute heavy metrics into summary tables when used repeatedly.

Mini project: Executive Sales Dashboard dataset

Define metrics: daily revenue, orders, AOV, top 5 products by revenue per month.
Build base views: vw_daily_sales and vw_product_revenue_monthly (aggregate before joins).
Add trends: compute 7-day rolling average revenue and month-over-month growth using LAG.
Publish consumption views: one daily KPI view and one top-products-per-month view.
Validate: reconcile totals against raw fact tables; test parameterized date ranges.
Performance pass: restrict columns, push filters, and confirm scan sizes drop.

Deliverables checklist

vw_daily_sales with revenue, orders, buyers, AOV
vw_top_products_monthly with ROW_NUMBER filter rnk ≤ 5
Validation queries proving daily sums match monthly totals
Notes on any discrepancies and how you fixed them

Subskills

Writing Report Ready Queries — Shape result sets with clear grain, aliases, and safe filters
Joins Across Facts And Dimensions — Choose correct join types and preserve fact rows
Aggregations For KPI Tables — Build daily/weekly/monthly KPI aggregates
Window Functions For Rankings And Trends — Use ROW_NUMBER, RANK, LAG/LEAD, moving averages
Building Views For BI Consumption — Create stable, documented views with safe types
Parameterized Filters And Date Logic — Implement flexible date ranges and filters
Debugging Data Discrepancies — Reconcile numbers and isolate root causes
Query Performance Basics For BI — Push filters early and reduce scanned data

Next steps

Finish the drills, then build the mini project and validate every metric
Aim for consistent grains and reusable views across dashboards
Take the skill exam below. Anyone can try; logged-in users get saved progress.

Menu

SQL

Table of Contents

Why SQL matters for BI Developers

Who this is for

Prerequisites

Learning path

Roadmap with milestones

Worked examples

1) Report-ready daily revenue with clear grain

2) Joining facts to dimensions

3) KPI table: revenue, AOV, conversion rate

4) Window functions: top 3 products per category (by monthly revenue)

5) BI-friendly view with parameter-ready columns

Drills and exercises

Common mistakes and debugging tips

Mini project: Executive Sales Dashboard dataset

Subskills

Next steps

SQL — Skill Exam

Topics

Writing Report Ready Queries

Joins Across Facts And Dimensions

Aggregations For KPI Tables

Window Functions For Rankings And Trends

Building Views For BI Consumption

Parameterized Filters And Date Logic

Debugging Data Discrepancies

Query Performance Basics For BI

Have questions about SQL?

AI Assistant