luvv to helpDiscover the Best Free Online Tools
Topic 16 of 31

Subqueries Correlated Subqueries Uncorrelated Subqueries

Learn Subqueries Correlated Subqueries Uncorrelated Subqueries for free with explanations, exercises, and a quick test (for Data Analyst).

Published: December 19, 2025 | Updated: December 19, 2025

Why this matters

As a Data Analyst, you will often need to filter results based on other results, compare a value to group statistics, or check if related records exist. Subqueries let you do this cleanly without creating many temporary tables. Knowing when to use uncorrelated vs correlated subqueries helps you write readable, efficient SQL for reports, dashboards, cohort analyses, anomaly checks, and quality assurance.

  • Find products priced above the overall average price
  • List customers who have no orders in the last 90 days
  • Flag orders where line items do not sum to the order total

Concept explained simply

A subquery is a query inside another query. It can return a single value, a list of values, a table, or a boolean (when used with EXISTS).

  • Uncorrelated subquery: Runs once, independent of the outer query. Think: a helper result you compute first, then use.
  • Correlated subquery: Runs for each row of the outer query, because it references that row. Think: ask a question about “this row’s related rows.”

Mental model

Picture the outer query as a list you scan. For each row:

  • Uncorrelated: you reuse the same precomputed answer (like a single average price).
  • Correlated: you ask a row-specific question (e.g., “does this customer have an order in 2024?”).
Where subqueries can appear
  • SELECT: scalar (single value) subqueries per outer row
  • FROM: derived table (a subquery that acts like a table)
  • WHERE / HAVING: predicates using IN, EXISTS, ANY/SOME, ALL, comparisons
Portable date extraction note

Use EXTRACT(YEAR FROM order_date)=2024 for ANSI SQL. Some systems also support YEAR(order_date)=2024.

Worked examples

1) Uncorrelated scalar subquery in WHERE

Goal: List products priced above the overall average price.

SELECT p.product_id, p.product_name, p.price
FROM products p
WHERE p.price > (
  SELECT AVG(price) FROM products
);

Why it works: The inner AVG runs once. The result is compared for each product.

2) Correlated subquery with EXISTS

Goal: Customers who placed at least one order in 2024.

SELECT c.customer_id, c.customer_name
FROM customers c
WHERE EXISTS (
  SELECT 1
  FROM orders o
  WHERE o.customer_id = c.customer_id
    AND EXTRACT(YEAR FROM o.order_date) = 2024
);

Why it works: The inner query references c.customer_id (outer row), so it runs per customer.

3) Anti-join with NOT EXISTS (customers with no 2024 orders)

SELECT c.customer_id, c.customer_name
FROM customers c
WHERE NOT EXISTS (
  SELECT 1
  FROM orders o
  WHERE o.customer_id = c.customer_id
    AND EXTRACT(YEAR FROM o.order_date) = 2024
);

Why it works: NOT EXISTS filters out customers who have a matching 2024 order.

4) Subquery in FROM (derived table)

Goal: Top 5 customers by 2024 spend using a derived table.

SELECT t.customer_id, t.total_spend_2024
FROM (
  SELECT o.customer_id, SUM(o.total_amount) AS total_spend_2024
  FROM orders o
  WHERE EXTRACT(YEAR FROM o.order_date) = 2024
  GROUP BY o.customer_id
) t
ORDER BY t.total_spend_2024 DESC
FETCH FIRST 5 ROWS ONLY;

Why it works: The inner query aggregates once; the outer query ranks and limits.

5) Scalar correlated subquery in SELECT

Goal: For each customer, show their most recent order date.

SELECT c.customer_id,
       c.customer_name,
       (
         SELECT MAX(o.order_date)
         FROM orders o
         WHERE o.customer_id = c.customer_id
       ) AS last_order_date
FROM customers c;

Why it works: The inner query returns a single value per customer row.

When to use what

  • Use EXISTS/NOT EXISTS for presence/absence checks. It short-circuits on first match and avoids duplicates.
  • Use IN when comparing a value to a small list from a subquery.
  • Use JOINs to bring columns from related tables. Use subqueries when you only need to filter or compute a single value.
  • Use derived tables (FROM) to stage complex aggregations for clarity.

Performance tips

  • Index join keys used in correlated subqueries (e.g., orders.customer_id).
  • Prefer EXISTS to IN when the inner set can be large or non-unique.
  • Ensure scalar subqueries really return one value; add LIMIT or aggregate if needed.
  • Consider rewriting heavy correlated subqueries as JOIN + GROUP BY if they run slowly.

Exercises (do these now)

These mirror the exercises provided below. Try to solve without looking at the solutions. Then check your work.

Exercise 1 — Orders above average total (uncorrelated)

Return orders where total_amount is greater than the average total_amount of all orders.

  • Expected columns: order_id, total_amount
  • Constraint: Use a subquery in WHERE
Show a small hint

Compare total_amount to a single AVG() computed once in a subquery.

Exercise 2 — Customers with 2+ orders in 2024 (correlated)

Return customers who placed at least two orders in 2024.

  • Expected columns: customer_id, customer_name
  • Constraint: Use a correlated subquery (scalar or EXISTS) in WHERE
Show a small hint

A scalar subquery can compute COUNT(*) per customer_id for 2024.

Exercise 3 — Products never ordered (anti-join with NOT EXISTS)

Return products that have no matching rows in order_items.

  • Expected columns: product_id, product_name
  • Constraint: Use NOT EXISTS
Show a small hint

Correlate order_items.product_id to products.product_id.

Self-check checklist

  • Your queries run without errors.
  • Exercise 1 uses one AVG() subquery, not a join.
  • Exercise 2 references the outer customer in the inner WHERE clause.
  • Exercise 3 uses NOT EXISTS and correlates on product_id.

Common mistakes and how to self-check

  • Mistake: Using a scalar subquery that returns multiple rows. Fix: Aggregate (e.g., MAX) or filter to one row.
  • Mistake: IN with a large, duplicate set causing slowdowns. Fix: Switch to EXISTS or SELECT DISTINCT in the subquery.
  • Mistake: Forgetting to correlate (missing join keys) so the subquery runs once and yields wrong logic. Fix: Confirm inner WHERE references outer columns.
  • Mistake: Selecting columns from the inner subquery in WHERE. Fix: WHERE subqueries cannot output columns to the SELECT list; join instead.
  • Mistake: Comparing NULLs incorrectly. Fix: Use IS NULL/IS NOT NULL or COALESCE where appropriate.
Quick self-review
  • Can I explain why my subquery is correlated or uncorrelated?
  • If the subquery runs per row, are the key columns indexed?
  • Does the filter logic match the business question exactly?

Practical projects

  • Churn detector: List customers with no orders in the last 90 days using NOT EXISTS; email them a win-back offer list.
  • Price positioner: Flag products priced above category average (scalar subquery) and summarize counts per category.
  • High-value cohort: Build a derived table for 2024 spend per customer, then rank and segment into tiers.

Who this is for

  • Junior to intermediate Data Analysts who know basic SELECT/JOIN/GROUP BY.
  • BI developers or data-savvy PMs who write SQL for analytics.

Prerequisites

  • Comfort with SELECT, WHERE, JOIN, GROUP BY
  • Basic understanding of indexes and primary/foreign keys

Learning path

  1. Refresh JOINs and GROUP BY
  2. Learn uncorrelated subqueries (scalar and IN)
  3. Learn correlated subqueries (EXISTS/NOT EXISTS)
  4. Practice derived tables in FROM
  5. Optimize and rewrite between subqueries and joins

Next steps

  • Practice with real datasets (orders, products, customers).
  • Try rewriting your subqueries as joins and compare results and performance.
  • Take the Quick Test below to check mastery. Everyone can take it; saving progress requires login.

Mini challenge

Return products whose price is greater than the average price within their own category (category_id). Use a correlated subquery.

Show one possible answer
SELECT p.product_id, p.product_name, p.category_id, p.price
FROM products p
WHERE p.price > (
  SELECT AVG(p2.price)
  FROM products p2
  WHERE p2.category_id = p.category_id
);

Check: The inner subquery references p.category_id (correlated), and AVG runs per category.

Ready to test yourself? Open the Quick Test section. Estimated time: 5–7 minutes. Everyone can take it; only logged-in users get saved progress.

Practice Exercises

3 exercises to complete

Instructions

Return orders where total_amount is greater than the overall average total_amount.

  • Tables: orders(order_id, customer_id, order_date, total_amount)
  • Output: order_id, total_amount
  • Use an uncorrelated subquery in WHERE.
Expected Output
Rows of orders where total_amount > AVG(total_amount) across all orders. Columns: order_id, total_amount.

Subqueries Correlated Subqueries Uncorrelated Subqueries — Quick Test

Test your knowledge with 8 questions. Pass with 70% or higher.

8 questions70% to pass

Have questions about Subqueries Correlated Subqueries Uncorrelated Subqueries?

AI Assistant

Ask questions about this tool