luvv to helpDiscover the Best Free Online Tools
Topic 15 of 31

Set Operations Union Intersect Except

Learn Set Operations Union Intersect Except for free with explanations, exercises, and a quick test (for Data Analyst).

Published: December 19, 2025 | Updated: December 19, 2025

Why this matters

As a Data Analyst, you often combine results from multiple tables or time periods. Set operations let you quickly answer questions like: Which customers appeared in both campaigns? Which products were sold last year but not this year? How do I combine two lists without duplicates?

  • Deduplicate combined lists from different sources
  • Find overlaps between cohorts or periods
  • Identify drop-off or new entries between snapshots

Who this is for

Data Analysts and learners who know basic SELECT queries and want to compare and combine result sets cleanly.

Prerequisites

  • SELECT, WHERE, and ORDER BY basics
  • Comfort with simple joins
  • Understanding of column data types

Concept explained simply

Set operations work on the results of two or more SELECT statements. Think of each SELECT as producing a list of rows. Then:

  • UNION: combine lists and remove duplicates
  • UNION ALL: combine lists and keep duplicates
  • INTERSECT: return only rows that appear in both lists
  • EXCEPT (sometimes called MINUS): rows in the first list that do not appear in the second
Dialect notes
  • EXCEPT may be called MINUS in some databases
  • Some systems do not support INTERSECT/EXCEPT directly; you can emulate with joins (INNER JOIN for INTERSECT; LEFT JOIN with NULL filter for EXCEPT)

Mental model

Picture two circles of rows (lists) from two SELECTs:

  • UNION: the whole area covered by both circles, deduplicated
  • UNION ALL: the whole area, keeping duplicates
  • INTERSECT: the overlapping area
  • EXCEPT: the part of the first circle that does not overlap with the second

Syntax quick view

-- Union (distinct by default)
SELECT col1, col2 FROM table_a
UNION
SELECT col1, col2 FROM table_b
ORDER BY col1;

-- Union all (keeps duplicates)
SELECT col1, col2 FROM table_a
UNION ALL
SELECT col1, col2 FROM table_b;

-- Intersect
SELECT col1, col2 FROM table_a
INTERSECT
SELECT col1, col2 FROM table_b;

-- Except (or MINUS)
SELECT col1, col2 FROM table_a
EXCEPT
SELECT col1, col2 FROM table_b;
Core rules (worth memorizing)
  • Each SELECT must return the same number of columns, in the same order
  • Corresponding columns must have compatible data types
  • Column names in the final result come from the first SELECT
  • ORDER BY can appear only once, at the very end
  • NULLs are considered equal when removing duplicates in set operations
  • Use parentheses to control precedence when chaining multiple set operations

Worked examples

Example 1: Combine two months of active users

-- Tables: active_users_jan(user_id), active_users_feb(user_id)
-- Unique users across both months
SELECT user_id FROM active_users_jan
UNION
SELECT user_id FROM active_users_feb
ORDER BY user_id;

-- If you need to count total appearances (including duplicates)
SELECT user_id FROM active_users_jan
UNION ALL
SELECT user_id FROM active_users_feb;

UNION returns each user once; UNION ALL shows duplicate rows if a user appears in both months.

Example 2: Customers who purchased in both years

-- Tables: purchases_2023(customer_id), purchases_2024(customer_id)
-- Customers present in both sets
SELECT customer_id FROM purchases_2023
INTERSECT
SELECT customer_id FROM purchases_2024
ORDER BY customer_id;

-- Emulation if INTERSECT is unavailable
SELECT p23.customer_id
FROM (SELECT DISTINCT customer_id FROM purchases_2023) p23
INNER JOIN (SELECT DISTINCT customer_id FROM purchases_2024) p24
  ON p23.customer_id = p24.customer_id
ORDER BY p23.customer_id;

INTERSECT (or the join) returns only customer IDs found in both years.

Example 3: Products discontinued this year

-- Tables: products_2023(product_sku), products_2024(product_sku)
-- In 2023 but not in 2024
SELECT product_sku FROM products_2023
EXCEPT
SELECT product_sku FROM products_2024
ORDER BY product_sku;

-- Emulation if EXCEPT is unavailable
SELECT p23.product_sku
FROM (SELECT DISTINCT product_sku FROM products_2023) p23
LEFT JOIN (SELECT DISTINCT product_sku FROM products_2024) p24
  ON p23.product_sku = p24.product_sku
WHERE p24.product_sku IS NULL
ORDER BY p23.product_sku;

EXCEPT returns rows unique to the first result set.

Hands-on exercises

These mirror the exercises listed below and include solutions you can reveal. Try them first before opening the solutions.

Exercise 1: All events across two years

Tables:

events_2024(event_name)
Spring Expo
Data Day
Retail Summit

events_2025(event_name)
Health Forum
Data Day
Retail Summit

Task: Return a single column event_name containing every unique event across both years, sorted alphabetically.

  • Use a set operation
  • No duplicates
  • One ORDER BY at the end

Exercise 2: Emails in both signup and purchase lists

Tables:

signups(email)
ana@ex.com
bo@ex.com
cy@ex.com

purchases(email)
bo@ex.com
di@ex.com
cy@ex.com

Task: Return the emails that are present in both tables, sorted alphabetically. If INTERSECT is unavailable, emulate it.

Self-check checklist
  • Did both SELECTs return the same number of columns in the same order?
  • Did you avoid placing ORDER BY inside subqueries of the set operation?
  • Did you apply ORDER BY once at the very end?
  • If emulating INTERSECT/EXCEPT, did you DISTINCT before joining to avoid accidental duplicates?

Common mistakes

  • Different column counts or incompatible types between SELECTs: align columns and cast if needed
  • Using ORDER BY in each SELECT: move ORDER BY to the final combined query
  • Expecting UNION to keep duplicates: use UNION ALL if you need duplicates
  • Forgetting that column names come from the first SELECT: alias columns in the first SELECT
  • Unexpected duplicates with INTERSECT/EXCEPT emulations: use DISTINCT before joins
Quick self-debug steps
  • Run each SELECT separately to confirm schemas and data types
  • COUNT rows before and after to confirm deduplication behavior
  • Test with tiny sample data to validate logic

Practical projects

  • Cohort overlap dashboard: Given monthly user cohorts, use INTERSECT (or joins) to compute overlaps and trend them
  • Catalog change report: Use EXCEPT to list added and dropped SKUs between snapshot tables
  • Campaign reach merger: Use UNION/UNION ALL to combine multi-channel contact lists and deduplicate appropriately

Learning path

  • Before: SELECT, WHERE, GROUP BY, basic joins
  • This subskill: UNION, UNION ALL, INTERSECT, EXCEPT/MINUS, emulations with joins
  • After: Subqueries, window functions, and CTEs for more complex analysis

Mini challenge

You have two tables, newsletter_may(email) and newsletter_june(email). Return three result sets:

  • All unique emails across both months
  • Emails present in both months
  • Emails in May but not in June

Write three queries using set operations. Then re-write the last two using joins only.

Next steps

  • Revisit your recent analyses and replace manual joins with set operations where appropriate
  • Create small sample tables and verify each operation’s behavior with duplicates and NULLs
  • Proceed to advanced querying with subqueries and CTEs

Quick Test

Take the quick test below to check your understanding. Everyone can take it for free; logged-in learners will have their progress saved automatically.

Practice Exercises

2 exercises to complete

Instructions

Tables and sample rows:

events_2024(event_name)
Spring Expo
Data Day
Retail Summit

events_2025(event_name)
Health Forum
Data Day
Retail Summit

Task: Return a single column event_name containing every unique event from both tables, sorted alphabetically. Use a set operation (not a join).

Expected Output
Data Day, Health Forum, Retail Summit, Spring Expo

Set Operations Union Intersect Except — Quick Test

Test your knowledge with 8 questions. Pass with 70% or higher.

8 questions70% to pass

Have questions about Set Operations Union Intersect Except?

AI Assistant

Ask questions about this tool