Why this matters
As a BI Analyst, you’ll rank products and customers, track trends, and explain changes over time. Window functions let you do this without losing row-level detail. Typical tasks:
- Top-N products per category and per month.
- Running totals and moving averages for sales dashboards.
- Month-over-month (MoM) and week-over-week (WoW) growth.
- Customer churn and reactivation trends using LAG/LEAD.
Real-world snapshot
You receive a request: “Show top 3 products by revenue in each category last month, plus 7-day moving average.” This is a one-query job with window functions: RANK() for the top list and AVG() OVER for the trend.
Concept explained simply
A window function looks at a row and its neighbors to compute values like rank, running total, or previous value—without collapsing rows like GROUP BY.
Mental model
Imagine each row carries a small transparent “window” that slides across related rows (defined by PARTITION BY) in a certain order (ORDER BY). The function looks through that window to calculate a value for the current row.
Core syntax
-- Pattern
function_name(args) OVER (
PARTITION BY <grouping columns>
ORDER BY <sorting columns>
<optional frame: ROWS/RANGE BETWEEN ... AND ...>
)
- PARTITION BY resets calculations per group (e.g., per category or customer).
- ORDER BY defines sequence inside each partition.
- Frame (ROWS or RANGE) controls how many peer rows are considered for running totals or moving averages. Be explicit to avoid engine defaults.
Common window functions for this topic
- ROW_NUMBER(), RANK(), DENSE_RANK() for rankings.
- SUM(), AVG(), MIN(), MAX() with frames for running totals and moving averages.
- LAG(), LEAD() to compare current row to previous/next periods.
- FIRST_VALUE(), LAST_VALUE() for boundary values; use frames to get expected results.
Worked examples
1) Top 3 products per category (handle ties)
-- Table: sales(order_id, order_date, product_id, category, revenue)
WITH product_rev AS (
SELECT category, product_id, SUM(revenue) AS total_revenue
FROM sales
WHERE order_date >= DATE '2024-01-01' AND order_date < DATE '2025-01-01'
GROUP BY category, product_id
)
SELECT category, product_id, total_revenue,
DENSE_RANK() OVER (
PARTITION BY category
ORDER BY total_revenue DESC
) AS rnk
FROM product_rev
-- filter after ranking (some engines support QUALIFY)
WHERE DENSE_RANK() OVER (
PARTITION BY category ORDER BY total_revenue DESC
) <= 3;
Why DENSE_RANK and not ROW_NUMBER?
DENSE_RANK keeps ties together. If two products tie for rank 1, both appear, and the next rank is 2. ROW_NUMBER would arbitrarily pick one item for each rank position.
2) Running total and 7-day moving average
-- Table: daily_sales(sales_date, revenue)
SELECT
sales_date,
revenue,
SUM(revenue) OVER (
ORDER BY sales_date
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS running_total,
AVG(revenue) OVER (
ORDER BY sales_date
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) AS ma_7d
FROM daily_sales
ORDER BY sales_date;
Tip on frames
Use ROWS for fixed row windows (e.g., 7 rows). RANGE depends on value ranges and may include peers with the same ORDER BY value; results differ between engines.
3) Month-over-month growth with LAG
-- Table: sales(order_date, customer_id, revenue)
WITH monthly AS (
SELECT
DATE_TRUNC('month', order_date) AS month,
SUM(revenue) AS rev
FROM sales
GROUP BY DATE_TRUNC('month', order_date)
)
SELECT
month,
rev,
LAG(rev) OVER (ORDER BY month) AS prev_rev,
(rev - LAG(rev) OVER (ORDER BY month)) AS abs_change,
CASE WHEN LAG(rev) OVER (ORDER BY month) = 0 THEN NULL
ELSE ROUND( (rev - LAG(rev) OVER (ORDER BY month)) * 100.0 / NULLIF(LAG(rev) OVER (ORDER BY month),0), 2)
END AS pct_change
FROM monthly
ORDER BY month;
Nulls and division
Guard against division by zero. Use NULLIF to avoid errors. If previous month is null (first month), pct_change will be null.
4) Latest value per customer with LAST_VALUE
-- Table: customer_health(customer_id, snapshot_date, score)
SELECT
customer_id,
snapshot_date,
score,
LAST_VALUE(score) OVER (
PARTITION BY customer_id
ORDER BY snapshot_date
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS latest_score
FROM customer_health;
Why specify the frame?
Without a full frame, LAST_VALUE may return the current row’s score. Explicitly set the frame to include all rows to get the latest score for each partition.
Exercises
Work from the same data model assumptions as the examples. If your SQL engine differs, adjust date functions or use subqueries where QUALIFY is not available.
For each category, list the top 2 products by total revenue in 2024. Break ties with DENSE_RANK. Show: category, product_id, total_revenue, rnk. Include only rnk ≤ 2.
Compute monthly revenue per customer and show: customer_id, month, revenue, prev_month_revenue, abs_change, pct_change. Use LAG to compare months.
Hints
- Aggregate first, then rank/filter rows after the window function.
- Prefer ROWS frames for moving averages; be explicit with frames for LAST_VALUE.
- Use DATE_TRUNC('month', order_date) or equivalent in your SQL dialect.
Self-check checklist
- I used PARTITION BY where a reset per group is needed.
- I was explicit about ROWS frames for running totals and moving averages.
- I filtered after applying rank (via subquery or QUALIFY).
- I guarded against division by zero in percent change.
- I verified tie behavior with DENSE_RANK vs RANK vs ROW_NUMBER.
Common mistakes and how to self-check
- Forgetting PARTITION BY: Running totals span the entire table. Self-check: Does the total reset for each category or customer?
- Using ROW_NUMBER for tied rankings: Ties are silently broken. Self-check: If two items have equal totals, do both appear?
- Implicit frames: Different engines default differently. Self-check: Add ROWS BETWEEN ... to make intent explicit.
- Filtering before ranking: You might rank a subset incorrectly. Self-check: Rank in a subquery, then filter rnk ≤ N in the outer query.
- LAST_VALUE without full-frame: Returns current row instead of the true latest. Self-check: Include ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.
Dialect tips
- QUALIFY is available in some engines (e.g., BigQuery, Snowflake). Otherwise, use a subquery or CTE to filter on window results.
- Date functions vary (DATE_TRUNC, TRUNC, DATEPART). Use equivalents in your engine.
Practical projects
- Category leaderboard: Monthly top 5 products with ties, plus each product’s 7-day moving average revenue.
- Customer health pulse: Latest NPS/usage score per customer and MoM change using LAG and LAST_VALUE.
- Marketing cohort trends: Weekly running total of conversions and WoW percent change per channel.
Who this is for
BI Analysts and aspiring analysts who need to build ranking lists and trend views directly in SQL for dashboards and reports.
Prerequisites
- Comfort with SELECT, WHERE, GROUP BY, and basic aggregates.
- Familiarity with date/time truncation to month or week.
Learning path
- Now: Window functions for rankings and trends.
- Next: Advanced frames (range vs rows), percentiles (NTILE, PERCENT_RANK), and cumulative distinct counts (approximate methods depending on engine).
- Then: Performance tuning with partitions, indexes, and materialized views.
Mini challenge
In the last 30 days, find the top product by revenue in each category. Break ties with highest average daily revenue. Then compute WoW growth for that product’s revenue. Use RANK/DENSE_RANK, AVG OVER, and LAG.
Next steps
Practice the exercises, then take the Quick Test below to check your understanding. Note: The test is available to everyone; only logged-in users get saved progress.