How to learn Group By And Having for SQL Basics in Business Analyst for free

Who this is for

Business Analysts who need to summarize data (sales by customer, tickets by agent, revenue by month).
Anyone comfortable with basic SELECT and WHERE who wants to produce clean, aggregated reports.

Prerequisites

Know SELECT, FROM, WHERE, ORDER BY.
Know basic aggregates: COUNT, SUM, AVG, MIN, MAX.

Learning path

Before: SELECT basics, WHERE filtering, aggregate functions.
Now: GROUP BY and HAVING.
Next: Joins with grouped data, CASE in aggregates, window functions (moving averages), and report formatting.

Why this matters

Real analyst tasks depend on grouping:

Monthly revenue by product category to track trends.
Customer segmentation: total orders and average order value per customer.
Operational dashboards: tickets closed per agent per week.
Anomaly checks: days with unusually high order counts.

Concept explained simply

GROUP BY turns row-level data into summary rows. Aggregates (SUM, COUNT, AVG, MIN, MAX) collapse many rows into one per group.

HAVING filters groups after they are formed. WHERE filters rows before grouping.

Mental model: buckets, then filters

Imagine you drop each row into a bucket labeled by your GROUP BY columns (e.g., one bucket per customer_id). Then you total or count inside each bucket. After that, HAVING keeps or removes whole buckets based on the aggregated results.

WHERE: “Which rows get into buckets?”
GROUP BY: “What defines each bucket?”
Aggregates: “What math inside each bucket?”
HAVING: “Which buckets survive?”

Key syntax

SELECT   group_col_1, group_col_2, AGG_FUNC(measure) AS metric
FROM     table
WHERE    row_filter
GROUP BY group_col_1, group_col_2
HAVING   aggregate_filter
ORDER BY metric DESC;

Every selected column must be either in GROUP BY or aggregated.
HAVING can use aggregates (e.g., HAVING SUM(amount) > 1000). WHERE cannot.
Filter raw rows (e.g., status = 'COMPLETED') in WHERE before grouping for correct results and performance.

Worked examples

Example 1: Total sales per customer and keep big spenders

Task: Show customers whose total completed sales in 2024 are at least 1,000.

SELECT   customer_id,
         SUM(amount) AS total_amount
FROM     orders
WHERE    status = 'COMPLETED'
  AND    created_at >= '2024-01-01'
  AND    created_at < '2025-01-01'
GROUP BY customer_id
HAVING   SUM(amount) >= 1000
ORDER BY total_amount DESC;

Result idea: one row per customer_id that meets the threshold, highest spend first.

Example 2: Daily order counts excluding cancelled

Task: Count orders per calendar day for non-cancelled orders.

SELECT   DATE(created_at) AS order_date,
         COUNT(*) AS order_count
FROM     orders
WHERE    status <> 'CANCELLED'
GROUP BY DATE(created_at)
ORDER BY order_date;

Key point: We exclude cancelled orders in WHERE before grouping.

Example 3: Average order value by region (ignore null regions)

Task: Average order amount for known customer regions, show regions with AVG > 50.

SELECT   region,
         AVG(amount) AS avg_order_value
FROM     orders
WHERE    region IS NOT NULL
GROUP BY region
HAVING   AVG(amount) > 50
ORDER BY avg_order_value DESC;

Tip: Use WHERE region IS NOT NULL to avoid a NULL group unless you intentionally want it.

Optional: Multi-column grouping

You can group by more than one column to get a finer grain.

SELECT   customer_id, DATE(created_at) AS order_date, COUNT(*) AS orders
FROM     orders
GROUP BY customer_id, DATE(created_at)
ORDER BY customer_id, order_date;

Practice: try it yourself

Use a table like orders(id, customer_id, amount, status, created_at, region).

Exercise 1 (mirrors ex1): Total completed 2024 sales per customer, keep customers with total_amount >= 2000, ordered by total_amount DESC.
Exercise 2 (mirrors ex2): Count orders per day excluding CANCELLED; show only days with at least 50 orders.

Checklist before you run:
- Did you put non-aggregated selected columns in GROUP BY?
- Did you use WHERE for row filters and HAVING for aggregate filters?
- Is your date range correctly bounded (start inclusive, end exclusive)?

Common mistakes and self-check

Selecting a column that is neither grouped nor aggregated. Self-check: Every non-aggregated select item should appear in GROUP BY.
Using WHERE with aggregates. Self-check: If your filter includes SUM/COUNT/AVG, it must be in HAVING.
Filtering after grouping when you meant before. Self-check: If the filter is about raw rows (status, date, region), it belongs in WHERE.
Grouping by expressions inconsistently. Self-check: If you select DATE(created_at), group by the same expression DATE(created_at).
Forgetting NULL handling. Self-check: Decide whether NULL should be excluded (WHERE col IS NOT NULL) or grouped as its own bucket.

Practical projects

Customer leaderboard: Build a top-20 customers report showing total sales, average order value, and order count for the past quarter. Add a HAVING clause to keep only customers with at least 3 orders.
Operations heatmap: Count tickets resolved per agent per weekday, highlighting agents with AVG resolution time below a threshold (requires a table with duration fields).
Daily anomaly watch: For each day, count orders and flag days above a configurable cutoff using HAVING COUNT(*) >= threshold. Export as a clean table for dashboarding.

Mini challenge

Dataset assumption: orders(id, customer_id, amount, status, created_at), customers(id, region).

Task: Find regions with at least 10 completed orders in Q2 2024 and an average order amount of 75 or more. Show region, order_count, avg_amount; sort by avg_amount DESC.

Show one possible solution

SELECT   c.region,
         COUNT(*) AS order_count,
         AVG(o.amount) AS avg_amount
FROM     orders o
JOIN     customers c ON c.id = o.customer_id
WHERE    o.status = 'COMPLETED'
  AND    o.created_at >= '2024-04-01'
  AND    o.created_at < '2024-07-01'
  AND    c.region IS NOT NULL
GROUP BY c.region
HAVING   COUNT(*) >= 10
   AND   AVG(o.amount) >= 75
ORDER BY avg_amount DESC;

Next steps

Try the exercises below, then take the Quick Test at the end of this page.
Apply GROUP BY in a real dataset you use at work or study.
Move on to joining grouped results with dimension tables (e.g., products, customers).

Progress note: The Quick Test is available to everyone. Only logged-in users will have their progress saved.

Menu

Group By And Having

Table of Contents

Who this is for

Prerequisites

Learning path

Why this matters

Concept explained simply

Key syntax

Worked examples

Practice: try it yourself

Common mistakes and self-check

Practical projects

Mini challenge

Next steps

Practice Exercises

Customer totals in 2024 with HAVING filter

Instructions

Expected Output

Busy days: daily counts excluding cancellations

Group By And Having — Quick Test

Have questions about Group By And Having?

AI Assistant