luvv to helpDiscover the Best Free Online Tools
Topic 23 of 31

Query Optimization Explain Analyze Execution Plans

Learn Query Optimization Explain Analyze Execution Plans for free with explanations, exercises, and a quick test (for Data Analyst).

Published: December 19, 2025 | Updated: December 19, 2025

Why this matters

As a Data Analyst, you often diagnose slow dashboards, build reports over large tables, and help product teams answer questions fast. Being able to read and act on EXPLAIN/EXPLAIN ANALYZE execution plans lets you find the real bottleneck (scan, join, sort, or I/O) and fix it with the right index or query rewrite.

  • Speed up a KPI query that times out before stand-up.
  • Make a daily report run in minutes instead of hours.
  • Prevent cost explosions by avoiding unnecessary full table scans.

Concept explained simply

An execution plan is the database's blueprint for running your query. EXPLAIN shows the plan it intends to use. EXPLAIN ANALYZE actually runs the query and adds real timings, rows processed, and I/O. You read the plan from top to bottom, focusing on the slowest or largest steps.

Key terms (quick reference)
  • Seq Scan: Reads the whole table row by row.
  • Index Scan/Seek: Uses an index to jump to relevant rows.
  • Bitmap Index Scan + Bitmap Heap: Efficient for many scattered matches.
  • Nested Loop Join: Good for small-to-large with selective inner access.
  • Hash Join: Good for large equality joins; builds a hash table.
  • Merge Join: Good when both inputs are pre-sorted on join keys.
  • Cost: Planner’s estimate; not time. Helps compare alternative plans.
  • Rows/Actual Rows: Estimated vs real row counts. Big mismatches = statistics or predicate issues.
  • Buffers/Shared Hit/Read: I/O indicators. High reads = disk work.

Mental model

Imagine a factory line. Each node in the plan is a station: scan, filter, join, sort, aggregate. Data flows up from the leaves (table scans) to the root (final result). If one station is slow or sends too much material, everything slows. You fix it by:

  • Reducing input rows earlier (selective WHERE, better indexes).
  • Choosing the right join method (hash vs nested loop vs merge).
  • Avoiding wide sorts or re-sorts (indexes aligned to ORDER BY).

Worked examples

Assume typical analytics tables:

Sample schema (for reference)
-- tables
authors(id PK, name, country)
customers(id PK, country)
orders(id PK, customer_id FK, status, created_at, total_amount)
order_items(id PK, order_id FK, product_id FK, qty)
products(id PK, category, price)

Example 1 — Filter turning a Seq Scan into an Index Scan

Query:

EXPLAIN (ANALYZE, BUFFERS)
SELECT id, total_amount
FROM orders
WHERE status = 'shipped' AND created_at >= DATE '2023-01-01';

Typical before-plan symptoms:

  • Seq Scan on orders
  • Actual rows processed is large
  • Buffers show many reads

Fix:

CREATE INDEX IF NOT EXISTS idx_orders_status_created
ON orders(status, created_at);

After re-run:

  • Index Scan using idx_orders_status_created
  • Far fewer rows read; time drops significantly
  • Less I/O in BUFFERS
Why this works

The composite index lets the database jump to only the 'shipped' range and then navigate by created_at, instead of reading the entire table.

Example 2 — Join choice and selective access

Query:

EXPLAIN (ANALYZE, BUFFERS)
SELECT c.country, COUNT(*) AS orders_count
FROM customers c
JOIN orders o ON o.customer_id = c.id
WHERE o.created_at >= DATE '2023-01-01'
GROUP BY c.country
ORDER BY orders_count DESC
LIMIT 5;

Before-plan:

  • Seq Scan on orders (filter on created_at not selective enough without index)
  • Hash Join (customers, orders)
  • Large HashAggregate and Sort

Fixes to test:

  • Create index to accelerate the filter and join: CREATE INDEX IF NOT EXISTS idx_orders_customer_created ON orders(customer_id, created_at);
  • Ensure customers(id) is the primary key (already indexed).

After-plan:

  • Bitmap/Index Scan on orders using idx_orders_customer_created
  • Hash Join remains but processes fewer rows
  • Less work in aggregate and sort, faster LIMIT
Why this works

The index pushes down filtering on created_at and speeds joining on customer_id, shrinking intermediate data.

Example 3 — Costly sort eliminated by index

Query:

EXPLAIN (ANALYZE, BUFFERS)
SELECT id, created_at
FROM orders
WHERE status = 'delivered'
ORDER BY created_at DESC
LIMIT 20;

Before-plan:

  • Index or Seq Scan + Sort (top-N sort still expensive on large match sets)

Fix:

CREATE INDEX IF NOT EXISTS idx_orders_status_created_desc
ON orders(status, created_at DESC);

After-plan:

  • Index Scan using idx_orders_status_created_desc
  • No explicit Sort node; returns top 20 directly
Why this works

An index that matches both the filter and the ORDER BY lets the database stream the first rows in order without sorting.

How to read an execution plan (step-by-step)

  1. Start at the top node: Identify the overall operation (Aggregate, Sort, Limit).
  2. Drill into the slowest child: Check its actual time and rows.
  3. Compare estimates vs actuals: Big gaps mean outdated statistics or non-selective predicates.
  4. Classify the bottleneck:
    • Scan issue: missing/ineffective index.
    • Join issue: poor join order/type or missing index on join keys.
    • Sort/Group issue: missing supporting index or huge intermediate set.
  5. Fix and re-measure: Add index, rewrite query, or adjust predicate. Re-run EXPLAIN (ANALYZE, BUFFERS).
Safety note

EXPLAIN ANALYZE executes the query. For write operations, use a transaction you can roll back or use EXPLAIN without ANALYZE to avoid changing data.

Exercises (hands-on)

These mirror the exercises below so you can track your progress.

Exercise 1 — Turn a full scan into a targeted index scan

  1. Run EXPLAIN (ANALYZE, BUFFERS) on:
    SELECT id, total_amount FROM orders
    WHERE status = 'shipped'
      AND created_at >= DATE '2023-01-01';
    
  2. Note the scan type, actual rows, and BUFFERS reads.
  3. Create an index: CREATE INDEX IF NOT EXISTS idx_orders_status_created ON orders(status, created_at);
  4. Re-run and compare plan shape and timing.
  • Checklist:
    • Before: Seq Scan confirmed
    • After: Index Scan/Bitmap Scan confirmed
    • Total time decreased
    • Buffers read decreased

Exercise 2 — Speed up a join + aggregation

  1. Run EXPLAIN (ANALYZE, BUFFERS):
    SELECT c.country, COUNT(*) AS orders_count
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
    WHERE o.created_at >= DATE '2023-01-01'
    GROUP BY c.country
    ORDER BY orders_count DESC
    LIMIT 5;
    
  2. Identify join type and the heaviest node.
  3. Create index: CREATE INDEX IF NOT EXISTS idx_orders_customer_created ON orders(customer_id, created_at);
  4. Re-run and compare rows, time, and join behavior.
  • Checklist:
    • Join processes fewer rows
    • Aggregate and sort are faster
    • Total execution time reduced

Common mistakes and self-check

  • Using EXPLAIN ANALYZE on INSERT/UPDATE/DELETE without a safe rollback plan.
  • Building many single-column indexes instead of a well-ordered composite index.
  • Ignoring ORDER BY when designing indexes, causing large sorts.
  • Not refreshing statistics; estimates drift and plans degrade.
  • Optimizing the wrong step; always fix the slowest or largest node first.
Self-check prompts
  • Did the slowest node change after your fix?
  • Did actual rows drop earlier in the plan?
  • Did BUFFERS read decrease?
  • Does the index match WHERE and ORDER BY?

Practical projects

  • Make a dashboard fast: pick 2–3 slow charts, capture current plans, add the minimum indexes, and document the before/after timings.
  • Top-N report accelerator: for a report with ORDER BY + LIMIT, design an index that removes the Sort node.
  • Plan diary: log weekly EXPLAIN ANALYZE snapshots of a critical query to catch regressions early.

Who this is for and prerequisites

  • Who: Data Analysts who run complex queries, build dashboards, or support product analytics.
  • Prerequisites: Comfortable with SQL SELECT/JOIN/GROUP BY; basic understanding of indexes; ability to run queries in your SQL environment.

Learning path

  1. Read plans with EXPLAIN (costs, rows, nodes).
  2. Measure reality with EXPLAIN ANALYZE (timing, BUFFERS).
  3. Fix scans: design composite and partial indexes.
  4. Fix joins: choose supporting indexes and join order.
  5. Fix sorts: align indexes with ORDER BY + LIMIT.
  6. Rinse and repeat: change one thing, re-measure.

Next steps

  • Apply Example 1–3 patterns to your top three slow queries.
  • Set a rule: every new report includes an EXPLAIN ANALYZE check.
  • Take the quick test below to validate your understanding. Everyone can take it; only logged-in users get saved progress.

Mini challenge

You see: Seq Scan on orders, WHERE status = 'processing' AND created_at >= DATE '2024-01-01', plus a Sort on created_at DESC with LIMIT 10. Propose a single index that addresses both the filter and the order. Then explain how you would verify the improvement using EXPLAIN (ANALYZE, BUFFERS).

Practice Exercises

2 exercises to complete

Instructions

  1. Run EXPLAIN (ANALYZE, BUFFERS) on:
    SELECT id, total_amount FROM orders
    WHERE status = 'shipped'
      AND created_at >= DATE '2023-01-01';
    
  2. Record the scan type, actual rows, total time, and BUFFERS.
  3. Create index:
    CREATE INDEX IF NOT EXISTS idx_orders_status_created
    ON orders(status, created_at);
    
  4. Re-run EXPLAIN (ANALYZE, BUFFERS). Compare plan shape and timing.
Expected Output
Before: Seq Scan on orders with high rows and buffer reads. After: Index Scan or Bitmap Heap Scan using idx_orders_status_created; actual rows and total time significantly reduced.

Query Optimization Explain Analyze Execution Plans — Quick Test

Test your knowledge with 10 questions. Pass with 70% or higher.

10 questions70% to pass

Have questions about Query Optimization Explain Analyze Execution Plans?

AI Assistant

Ask questions about this tool