How to learn Building Reusable Views For BI for SQL in BI Analyst for free

Who this is for

BI Analysts and data practitioners who build dashboards and reports and want stable, well-documented SQL views that the whole team can reuse safely.

Prerequisites

Comfort with SELECT, JOIN, GROUP BY, and CTEs.
Basic familiarity with your warehouse (PostgreSQL/Snowflake/BigQuery/Redshift syntax differences).
Understanding of your company’s core entities (customers, orders, products).

Why this matters

In BI work you repeatedly need clean, consistent data for dashboards and ad-hoc questions. Reusable SQL views give everyone a single, trustworthy source. Real tasks where this helps:

Providing a single definition of metrics (e.g., Daily Active Customers) across multiple dashboards.
Hiding raw data complexity (deduplication, type casting, nullable fields) behind a clean interface.
Speeding up onboarding: new analysts query one documented view instead of re-learning joins.

Concept explained simply

A reusable view is a named SQL query you can SELECT from as if it were a table. It standardizes logic (joins, filters, types, naming) so others don’t have to rebuild it. Think of it as a contract: stable columns with clear meanings that don’t surprise downstream users.

Mental model

Imagine three layers:

Staging (raw_clean): lightly cleaned source tables.
Base views (analytics.base_*): consistent schemas, business-safe defaults (data types, null-handling).
Consumer views (analytics.v_*): user-friendly surfaces for dashboards (pre-aggregations, labels).

Base views change rarely and prioritize correctness. Consumer views are tailored for specific use cases but built on base views to avoid duplication.

Design principles for reusable BI views

Stability first: prefer additive changes; avoid breaking renames or type changes.
Explicit columns: never SELECT *; list columns with clear, semantic names (snake_case).
Type safety: CAST to expected types; normalize booleans and timestamps; set time zones explicitly if applicable.
Null handling: use COALESCE and default flags to avoid surprising NULLs in dashboards.
Document in-line: add a header comment describing purpose, grain, and column meanings.
Two-tier approach: build a base view, then thin consumer views on top for specific metrics.
Performance: push filters to source, avoid unnecessary DISTINCT, and pre-aggregate where appropriate.
Reproducibility: use CREATE OR REPLACE VIEW; isolate logic in CTEs with clear names.
Security: expose only necessary columns; exclude PII or mask it as required by policy.

Worked examples

Example 1: Customer dimension base view (dedupe, types, flags)

-- Purpose: Clean customer dimension
-- Grain: 1 row per customer_id
-- Columns: customer_id, full_name, email_hash, country_code, is_active, created_date
CREATE OR REPLACE VIEW analytics.base_customers AS
WITH src AS (
  SELECT 
    c.customer_id,
    TRIM(c.first_name || ' ' || c.last_name) AS full_name,
    /* Avoid exposing raw email; hash or mask based on policy */
    MD5(LOWER(TRIM(c.email))) AS email_hash,
    UPPER(c.country_code) AS country_code,
    c.status,
    CAST(c.created_at AS DATE) AS created_date,
    ROW_NUMBER() OVER (PARTITION BY c.customer_id ORDER BY c.updated_at DESC) AS rn
  FROM raw.customers c
)
SELECT 
  customer_id,
  NULLIF(full_name, '') AS full_name,
  email_hash,
  country_code,
  CASE WHEN LOWER(status) = 'active' THEN TRUE ELSE FALSE END AS is_active,
  created_date
FROM src
WHERE rn = 1;

Why it’s reusable: one row per customer, safe types, masked email, consistent is_active flag, deduped by latest record.

Example 2: Daily order metrics (consumer view on base data)

-- Purpose: Daily commerce metrics for dashboards
-- Grain: 1 row per order_date
-- Depends on: analytics.base_orders, analytics.base_order_items
CREATE OR REPLACE VIEW analytics.v_daily_order_metrics AS
WITH o AS (
  SELECT order_id, customer_id, CAST(order_timestamp AS DATE) AS order_date, order_status
  FROM analytics.base_orders
  WHERE order_status IN ('paid','fulfilled')
),
items AS (
  SELECT order_id, SUM(quantity) AS items_qty, SUM(quantity * unit_price) AS gross_revenue
  FROM analytics.base_order_items
  GROUP BY order_id
)
SELECT 
  o.order_date,
  COUNT(DISTINCT o.order_id) AS orders,
  COUNT(DISTINCT o.customer_id) AS unique_customers,
  COALESCE(SUM(items.items_qty),0) AS items_sold,
  COALESCE(SUM(items.gross_revenue),0.0) AS revenue
FROM o
LEFT JOIN items USING (order_id)
GROUP BY o.order_date;

Why it’s reusable: standard daily grain, clear metric names, business-status filtered, and safe COALESCE for dashboard visuals.

Example 3: Two-tier approach (base view + thin consumer view)

-- Base: one row per product with normalized types
CREATE OR REPLACE VIEW analytics.base_products AS
SELECT 
  p.product_id,
  TRIM(p.product_name) AS product_name,
  UPPER(p.category) AS category,
  CAST(p.active AS BOOLEAN) AS is_active,
  CAST(p.created_at AS DATE) AS created_date
FROM raw.products p;

-- Consumer: active products only, for UI filters
CREATE OR REPLACE VIEW analytics.v_active_products AS
SELECT product_id, product_name, category
FROM analytics.base_products
WHERE is_active = TRUE;

Why it’s reusable: base view standardizes schema; consumer view tailors a common filter without duplicating normalization logic.

Build steps you can follow

Define grain and purpose: one row per what? (customer, order_date, product).
List columns and types: write them down; assign names and expected types.
Draft SQL with CTEs: raw input, cleanup, business rules, final select.
Harden: COALESCE nulls, cast types, filter statuses, remove duplicates.
Document: add header comment with grain, purpose, dependencies.
Test: count, distinct keys, spot-check joins, compare to known totals.
Release: CREATE OR REPLACE VIEW under analytics schema; avoid breaking changes.

Exercises

These mirror the exercises below. Try them in your warehouse; then open the solutions.

Exercise 1 (ex1): Build a clean customer dimension view with dedupe and masking.
Exercise 2 (ex2): Build a daily order metrics view from orders and order_items.

Checklist before you consider a view "reusable"

Purpose and grain stated in a comment.
No SELECT *; columns explicitly listed.
Primary key or uniqueness guaranteed at the grain.
Types casted; timestamps clarified; nulls handled.
Filters reflect business rules (e.g., paid orders only).
Depends only on stable upstream objects.
Performance is acceptable for dashboard refresh SLAs.

Common mistakes and self-check

Using SELECT *: downstream breaks when new columns appear or order changes. Fix: list columns explicitly.
Missing grain definition: duplicates appear. Fix: enforce uniqueness (ROW_NUMBER then rn=1).
Leaking PII: exposing raw emails. Fix: mask or hash; expose only needed fields.
Implicit types/time zones: numbers as strings or mixed tz timestamps. Fix: CAST and standardize TZ.
Over-aggregation: double-counting revenue by joining items incorrectly. Fix: aggregate items first, then join to orders.
Hard-coded dates: views that only work for last month. Fix: make logic relative or push date filters to consumers.

Self-check

Can you describe the grain in one sentence?
Is there a column (or combo) that is unique at that grain?
Are all numeric columns correct type and scale?
If a column contains nulls, is that intentional and documented?
Does COUNT(*) from the view match known control totals when filtered?

Practical projects

Analytics customer mart: Create base_customers and v_active_customers with dedupe, PII masking, and activity flags. Acceptance: uniqueness by customer_id; no raw PII exposed.
Sales daily metrics: Build v_daily_order_metrics with orders, customers, items, revenue. Acceptance: matches finance totals for a sampled week.
Product browse layer: Base products + v_active_products + v_category_rollup. Acceptance: category-level counts match product catalog.

Learning path

Start: base entity views (customers, products, orders).
Then: consumer-friendly metric views (daily metrics, retention cohorts).
Next: incremental models or materialized views if needed for performance.
Finally: semantic layer naming consistency across dashboards.

Quick test note

The quick test at the end is available to everyone. Only logged-in users will have their progress saved.

Mini challenge

Build a two-tier view for "active subscribers":

Create base_subscriptions with one row per subscription_id, normalized statuses, and start/end dates.
Create v_active_subscriptions filtering to currently active records and expose subscriber_id and plan_name.
Prove correctness: show that counts per day match billing exports for two random dates.

Menu

Building Reusable Views For BI

Table of Contents

Who this is for

Prerequisites

Why this matters

Concept explained simply

Mental model

Design principles for reusable BI views

Worked examples

Build steps you can follow

Exercises

Common mistakes and self-check

Practical projects

Learning path

Quick test note

Mini challenge

Practice Exercises

Build a reusable customer dimension view

Instructions

Expected Output

Build daily order metrics for dashboards

Building Reusable Views For BI — Quick Test

Have questions about Building Reusable Views For BI?

AI Assistant