luvv to helpDiscover the Best Free Online Tools
Topic 5 of 8

Building Views For BI Consumption

Learn Building Views For BI Consumption for free with explanations, exercises, and a quick test (for BI Developer).

Published: December 24, 2025 | Updated: December 24, 2025

Why this matters

As a BI Developer, your dashboards, reports, and data models depend on stable, well-designed SQL views. Views act as the contract between raw data and BI tools: they standardize names and types, encode business logic once, and protect consumers from breaking changes. Done right, views speed up development, reduce errors, and improve performance.

  • Real tasks you will face: publishing clean subject-area views (sales, customers, products), creating aggregated views for dashboards, enforcing row-level filters, and balancing performance with maintainability.
  • Deliverables: consistent column naming and data types, reliable joins, clear metric definitions, and predictable refresh behavior.

Concept explained simply

A view is a saved SELECT. Think of it as a clean “window” into complex data. Consumers query the view, not your tangled joins.

Mental model

  • Input: multiple raw tables with quirks, nulls, and changing schemas.
  • Transform: pick needed fields, standardize types, compute metrics, and make joins explicit.
  • Output: a consistent, query-ready table shape that BI tools can trust.
When to use materialized views

Use materialized views when queries are expensive or need pre-aggregation. They store results and can be refreshed on a schedule or manually. Great for dashboard speed. Use normal (logical) views for light transforms or when freshness must be real-time.

Design checklist for BI-friendly views

  • Purpose-driven: one domain per view (e.g., sales_orders, daily_sales).
  • Stable schema: avoid removing/renaming columns without versioning. Do not use SELECT *.
  • Clean naming: snake_case or lowerCamelCase; descriptive, not cryptic (e.g., customer_id, order_date).
  • Explicit data types: cast dates, numerics, and booleans. Avoid text for metrics.
  • Deterministic logic: no volatile functions in the definition (e.g., avoid current_date filters inside the view).
  • Join correctness: use correct keys; prevent double counting with pre-aggregations or DISTINCT appropriately.
  • Null handling: COALESCE where business-safe; keep raw nulls if meaning is unknown.
  • Performance: project only needed columns, pre-aggregate for dashboards, consider materialization, and ensure indexes on join keys in base tables.
  • Security: centralize row-level filters in views or via dedicated security views.
  • Documentation: add column comments where supported; include a header comment describing intent and grain.

Worked examples

Assume tables: customers, orders, order_items, products. Dialect uses PostgreSQL-style syntax; adjust types/quotes for your engine as needed.

Example 1 — Clean subject-area view (order-level metrics)
-- Grain: one row per order
CREATE OR REPLACE VIEW analytics.sales_orders AS
WITH item_totals AS (
  SELECT oi.order_id,
         SUM(oi.quantity * oi.unit_price) AS gross_revenue,
         SUM(oi.quantity) AS units
  FROM order_items oi
  GROUP BY oi.order_id
)
SELECT
  o.order_id,
  o.customer_id,
  c.first_name || ' ' || c.last_name AS customer_name,
  CAST(o.order_date AS date) AS order_date,
  o.status,
  COALESCE(it.gross_revenue, 0)::numeric(18,2) AS gross_revenue,
  COALESCE(it.units, 0) AS units,
  CASE WHEN o.status = 'refunded' THEN true ELSE false END AS is_refunded
FROM orders o
LEFT JOIN item_totals it ON it.order_id = o.order_id
LEFT JOIN customers c ON c.customer_id = o.customer_id;
  • Why it works: explicit grain, no SELECT *, types are cast, nulls handled, joins clear.
Example 2 — Pre-aggregated view for dashboards (daily sales by product)
-- Grain: one row per day per product
CREATE OR REPLACE VIEW analytics.daily_sales_by_product AS
SELECT
  CAST(o.order_date AS date) AS day,
  oi.product_id,
  p.product_name,
  p.category,
  SUM(oi.quantity) AS units,
  SUM(oi.quantity * oi.unit_price)::numeric(18,2) AS revenue,
  (SUM(oi.quantity * oi.unit_price) / NULLIF(SUM(oi.quantity), 0))::numeric(18,2) AS avg_selling_price
FROM orders o
JOIN order_items oi ON oi.order_id = o.order_id
JOIN products p ON p.product_id = oi.product_id
WHERE o.status IN ('paid','shipped','delivered')
GROUP BY 1,2,3,4;
  • Why it works: aligns with dashboard grain, minimizes downstream aggregations, safe division with NULLIF.
Example 3 — Security-scoped view (region-level filtering)
-- Suppose table: user_region_map(username, region)
-- Grants ensure users see only their region.
CREATE OR REPLACE VIEW analytics.sales_orders_region_scoped AS
SELECT so.*
FROM analytics.sales_orders so
JOIN user_region_map m ON m.region = so.status_region  -- if region is present; else join via customers
WHERE m.username = CURRENT_USER;

If your base data lacks region, derive it in upstream views (e.g., from customers.region) and expose as status_region or customer_region to filter here.

Performance and refresh choices

  • Normal view: recalculated on each query; great for light logic and freshest data.
  • Materialized view: stores results; faster reads; must be refreshed (manually or scheduled by your platform).
  • Pre-aggregation: reduce row counts for dashboards (daily, weekly, product, region). Avoid aggregating again and again in BI tools.
  • Indexes: on join keys of base tables; for materialized views (where supported), index commonly filtered columns.
  • Partitioning/clustering (warehouse-specific): speeds scans on date-range queries significantly.
Materialized view pattern (PostgreSQL)
CREATE MATERIALIZED VIEW analytics.mv_daily_sales AS
SELECT CAST(o.order_date AS date) AS day,
       p.category,
       SUM(oi.quantity) AS units,
       SUM(oi.quantity * oi.unit_price)::numeric(18,2) AS revenue
FROM orders o
JOIN order_items oi ON oi.order_id = o.order_id
JOIN products p ON p.product_id = oi.product_id
WHERE o.status IN ('paid','shipped','delivered')
GROUP BY 1,2;

-- After loads
REFRESH MATERIALIZED VIEW CONCURRENTLY analytics.mv_daily_sales;

-- Optional indexes for filters
CREATE INDEX ON analytics.mv_daily_sales(day);
CREATE INDEX ON analytics.mv_daily_sales(category);

Hands-on exercises

Do these locally or in your SQL sandbox. Keep the schema design checklist open as you work.

  • Reminder: The quick test is available to everyone. Only logged-in users will have their progress saved.

Exercise 1 (ex1) — Build a clean order view

Create analytics.orders_fact with grain one row per order, including: order_id, order_date (date), customer_id, customer_name, units, gross_revenue (numeric 18,2), is_refunded (boolean). Use orders, order_items, customers. Treat refunded orders as refunded regardless of items. Ensure null-safe sums and explicit casts.

Exercise 2 (ex2) — Fix anti-patterns

Given this flawed view, rewrite it to be BI-friendly.

CREATE OR REPLACE VIEW analytics.bad_sales AS
SELECT *
FROM orders o
JOIN order_items oi ON oi.order_id = o.order_id
JOIN customers c ON c.customer_id = o.customer_id
WHERE order_date > NOW() - INTERVAL '30 days';
  • Problems to fix: SELECT *, volatile NOW() filter inside the view, double-counting orders, missing types, unclear grain.

Exercise submission checklist

  • Clear grain specified in a header comment.
  • No SELECT *; columns expressly listed.
  • Explicit types for dates and numerics.
  • Joins on correct keys; no unintended duplication.
  • No volatile time filters inside the view.
  • Meaningful, stable column names.

Common mistakes and how to self-check

  • Using SELECT *: tomorrow’s schema change may break dashboards. Self-check: list columns intentionally.
  • Volatile filters in view: e.g., WHERE order_date = current_date. Self-check: push time filters to BI queries; keep views reusable.
  • Double counting after joins: many-to-many joins inflate metrics. Self-check: verify expected row counts and totals against source-of-truth reports.
  • Implicit casting or text metrics: leads to type confusion. Self-check: CAST fields to date/numeric/boolean as needed.
  • Hiding null semantics: aggressive COALESCE can mask data issues. Self-check: only COALESCE when the business meaning is truly zero/false/empty.
  • Over-aggregating in BI tools: doing GROUP BY again on pre-aggregated views can distort results. Self-check: choose either raw-grain or aggregated view and stick to it.

Practical projects

  • Project 1: Publish a sales mart. Create three views: sales_orders (order grain), daily_sales_by_product (day x product), and top_customers_mtd (month-to-date). Document each with grain, columns, and refresh guidance.
  • Project 2: Row-level security demo. Build a region-scoped view and a tiny user-to-region map. Validate visibility with two test users.
  • Project 3: Performance before/after. Time a heavy dashboard against a normal view, then switch to a materialized pre-aggregation. Record speedup.

Who this is for

  • BI Developers who publish datasets to BI tools.
  • Analytics Engineers standardizing business logic.
  • Data Analysts who want reliable, fast, and reusable queries.

Prerequisites

  • Comfort with SELECT, JOIN, GROUP BY, and basic window functions.
  • Understanding of data types (date, numeric, boolean).
  • Basic knowledge of your warehouse features (materialized views, indexes, partitions).

Learning path

  1. Solidify SQL joins and aggregations.
  2. Learn semantic modeling basics (grain, dimensions, facts).
  3. Build clean subject-area views with stable schemas.
  4. Add pre-aggregations and, where supported, materialized views.
  5. Implement row-level scoping and permissions patterns.
  6. Measure and iterate on performance.

Next steps

  • Refactor one existing report to use a new, clean view.
  • Add data type casts and comments to your top 3 views.
  • Introduce one pre-aggregated or materialized view for your slowest dashboard.

Mini challenge

In one SQL file, create a daily revenue view that:

  • Aggregates by day and region.
  • Excludes non-revenue statuses.
  • Casts revenue to numeric(18,2).
  • Documents grain and business logic in a header comment.
Hint

Start by building a CTE that computes item totals, then join orders and customers for region, then GROUP BY day, region.

Take the quick test

Ready to check your understanding? Take the short test below. Everyone can take it. If you log in, your progress will be saved.

Practice Exercises

2 exercises to complete

Instructions

Create analytics.orders_fact with grain one row per order using tables orders, order_items, customers. Include columns:

  • order_id
  • order_date (date)
  • customer_id
  • customer_name
  • units (sum of quantities)
  • gross_revenue (sum of quantity * unit_price as numeric(18,2))
  • is_refunded (boolean, true if order status = 'refunded')

Ensure null-safe sums and explicit casts. Avoid SELECT * and volatile filters.

Expected Output
A view analytics.orders_fact that returns one row per order with the seven requested columns, correct types, and stable names.

Building Views For BI Consumption — Quick Test

Test your knowledge with 8 questions. Pass with 70% or higher.

8 questions70% to pass

Have questions about Building Views For BI Consumption?

AI Assistant

Ask questions about this tool