luvv to helpDiscover the Best Free Online Tools
Topic 20 of 31

Indexes B Tree Hash Composite Index Covering Index

Learn Indexes B Tree Hash Composite Index Covering Index for free with explanations, exercises, and a quick test (for Data Analyst).

Published: December 19, 2025 | Updated: December 19, 2025

Why this matters

As a Data Analyst, you often run queries on large tables: finding recent orders, filtering by customer segments, building dashboards, and preparing A/B reports. The difference between a 50 ms query and a 50 s query is usually the right index. Knowing when to use B-tree, Hash, Composite, and Covering indexes helps you:

  • Speed up filters, joins, and ORDER BY/LIMIT queries.
  • Cut compute costs by avoiding full table scans.
  • Make dashboards responsive and ad-hoc analysis snappy.

Concept explained simply

An index is a compact structure that lets the database quickly find rows without scanning the whole table—like a book index. Different index types excel at different query patterns:

  • B-tree index: The default in many systems. Great for equality (=) and range (>, <, BETWEEN), and helps with ORDER BY on the indexed columns.
  • Hash index: Optimized for equality lookups only (e.g., WHERE token = ...). Not for range, sorting, or prefix searches. Availability and behavior vary by database engine.
  • Composite index: An index on multiple columns (e.g., (customer_id, order_date)). The leftmost (leading) column matters for how the index is used.
  • Covering index: An index that contains all columns needed by a query, so the database can answer from the index alone (index-only scan) without touching the table. In some systems you can add non-key columns to cover a query (e.g., INCLUDE).

Mental model

  • Seek vs Scan: Without an index, the database scans every row. With the right index, it seeks directly to matching entries, reads fewer pages, and returns results faster.
  • Selectivity: Indexes are most useful when a predicate filters to a small subset of rows (high selectivity).
  • Leading column rule: For composite indexes, the order of columns matters. The index is most effective when your WHERE and ORDER BY use the leftmost columns.
  • Trade-offs: Indexes accelerate reads but slow down writes (INSERT/UPDATE/DELETE) and use storage. Add only those that deliver real benefit.

Worked examples

Example 1 — B-tree index for range + sort

Task: Show the latest 20 orders for a given customer quickly.

-- Query
SELECT order_id, order_date, total_amount
FROM orders
WHERE customer_id = 123
  AND order_date >= DATE '2024-01-01'
ORDER BY order_date DESC
LIMIT 20;

-- Index (compliant with many RDBMS; DESC may be optional depending on engine)
CREATE INDEX idx_orders_customer_date ON orders (customer_id, order_date DESC);

Why this works: The composite B-tree lets the engine find the subset for customer_id, then read the date range in order, making ORDER BY + LIMIT efficient.

Example 2 — Hash index for equality-only lookups

Task: Fetch a user session by exact token.

-- Query
SELECT user_id, last_seen
FROM sessions
WHERE session_token = 'abc123';

-- Option A (engines that support hash indexes, e.g., PostgreSQL):
CREATE INDEX idx_sessions_token_hash ON sessions USING HASH (session_token);

-- Option B (portable fallback):
CREATE INDEX idx_sessions_token ON sessions (session_token);

Use a Hash index only for equality comparisons and if your engine benefits from it. Otherwise, a B-tree on the column is a safe choice.

Example 3 — Covering index for index-only scans

Task: A dashboard lists order_id and total_amount for a customer’s last 50 orders by date.

-- Query
SELECT order_id, total_amount
FROM orders
WHERE customer_id = 123
ORDER BY order_date DESC
LIMIT 50;

-- PostgreSQL (cover with INCLUDE)
CREATE INDEX idx_orders_cover ON orders (customer_id, order_date DESC) INCLUDE (total_amount);

-- MySQL/InnoDB (no INCLUDE keyword)
-- If the index has (customer_id, order_date, total_amount), the query may be covered
CREATE INDEX idx_orders_cover ON orders (customer_id, order_date, total_amount);

-- SQL Server (include non-key)
CREATE INDEX idx_orders_cover ON dbo.orders (customer_id, order_date DESC) INCLUDE (total_amount);

When fully covered, the engine can avoid visiting the base table, reducing I/O and latency.

Example 4 — Composite index order matters

Task: Two queries on events(user_id, occurred_at, event_type):

-- Q1: filter by user, then time range
SELECT * FROM events
WHERE user_id = 7 AND occurred_at BETWEEN NOW() - INTERVAL '7 days' AND NOW();

-- Q2: filter by time range only
SELECT count(*) FROM events
WHERE occurred_at >= NOW() - INTERVAL '1 day';

-- Index choice
CREATE INDEX idx_events_user_time ON events (user_id, occurred_at);

This index is great for Q1 (matches leading user_id then uses occurred_at). For Q2 (time-only filter), you may also want an index on occurred_at alone if it’s a frequent query.

How to choose the right index (quick steps)

1. Map the query pattern. Identify WHERE columns, join keys, and ORDER BY.
2. Pick index type. Range/sorting → B-tree. Equality-only hot lookups on a single column (and supported engine) → consider Hash.
3. Set column order. For composite indexes, put the most selective and most frequently filtered column first; preserve useful sort order if needed.
4. Consider covering. Add included columns (or append in engines without INCLUDE) so the index alone answers the query.
5. Validate. Use EXPLAIN/EXPLAIN ANALYZE to confirm index usage and estimate I/O reduction.

Exercises you can do now

These mirror the graded exercises below. Run on a realistic dataset if possible.

  1. Composite + covering. Speed up: SELECT order_id, total_amount FROM sales WHERE customer_id = ? AND order_date BETWEEN ? AND ? ORDER BY order_date DESC LIMIT 50; Create a composite B-tree and make it covering.
  2. Equality-heavy lookup. Speed up: SELECT user_id FROM api_keys WHERE api_key = ?; Try a Hash index if supported; otherwise a B-tree. Compare EXPLAIN before/after.
  • Checklist: Identify WHERE columns and sort order.
  • Checklist: Choose B-tree or Hash thoughtfully.
  • Checklist: Order composite columns by how the query filters/sorts.
  • Checklist: Cover the query to avoid table lookups when it helps.
  • Checklist: Verify with EXPLAIN and row/page reads.

Common mistakes and how to self-check

  • Indexing low-selectivity columns alone (e.g., status with 95% of rows the same). Self-check: Does the predicate filter to a small fraction?
  • Wrong composite order. Self-check: Do WHERE clauses use the leftmost columns? Does the ORDER BY align with the index order?
  • Overlapping/duplicate indexes. Self-check: Can one composite index serve multiple queries? Remove redundant ones.
  • Expecting Hash to help ranges or sorting. Self-check: Does the query use only equality? If not, prefer B-tree.
  • Ignoring write cost. Self-check: Are inserts/updates slowing down after adding indexes?
  • Not validating with EXPLAIN. Self-check: Confirm Index Seek/Index Scan and row estimates match expectations.
Self-check mini task

Pick 2 of your slowest queries. For each, list WHERE columns, joins, ORDER BY. Propose one index. Use EXPLAIN to confirm usage. If not used, ask: wrong order? low selectivity? already covered by another index?

Practical projects

  • Dashboard turbocharge: Take a BI dashboard with 3 slow widgets. Add/adjust indexes so all load under 500 ms. Document each index, expected benefit, and EXPLAIN before/after.
  • Query coverage audit: For a reporting schema, list top 10 queries and map them to existing indexes. Remove two redundant indexes safely after verifying no regressions.
  • A/B analysis accelerator: Create composite indexes that speed up filtering by experiment_id and event timestamps across two large tables; show join improvements.

Mini challenge

You have a query: SELECT order_id, order_date FROM orders WHERE store_id = 77 AND order_date >= CURRENT_DATE - INTERVAL '30 days' ORDER BY order_date DESC LIMIT 100;. Propose one index that helps both filtering and sorting, and say whether it could be covering if the SELECT also needs total_amount.

Suggested answer

CREATE INDEX idx_orders_store_date ON orders (store_id, order_date DESC); To cover total_amount, add INCLUDE (total_amount) where supported, or append total_amount in engines without INCLUDE.

Who this is for

  • Data Analysts who run frequent filters/aggregations on large tables.
  • Anyone tuning SQL for dashboards, reports, or ad-hoc analysis.

Prerequisites

  • Comfort with SELECT, WHERE, JOIN, GROUP BY, ORDER BY.
  • Know how to run EXPLAIN/EXPLAIN ANALYZE in your database.

Learning path

  1. Review slow queries and their predicates.
  2. Learn B-tree basics and the leading column rule.
  3. Practice composite indexes with ORDER BY + LIMIT.
  4. Explore covering indexes and measure index-only scans.
  5. Compare equality-only patterns with Hash vs B-tree.
  6. Clean up redundant indexes and measure write overhead.

Next steps

  • Do the exercises below and verify with EXPLAIN.
  • Take the quick test to check your understanding. Note: The quick test is available to everyone; only logged-in users get saved progress.
  • Apply indexes on a real dataset and record latency improvements.

Practice Exercises

2 exercises to complete

Instructions

Table: sales(order_id, customer_id, order_date, total_amount, status).

Goal: Speed up the query returning the latest 50 orders for a given customer in a date range.

SELECT order_id, total_amount
FROM sales
WHERE customer_id = :cid
  AND order_date BETWEEN :from AND :to
ORDER BY order_date DESC
LIMIT 50;

Tasks:

  • Create one index that supports the filter and ORDER BY.
  • Make the query covered so the engine can avoid touching the base table.
  • Verify with EXPLAIN that the index is used and that few pages are read.
Expected Output
EXPLAIN shows an index seek/scan on the composite index, using order_date ordering. The plan indicates an index-only scan (where supported) and returns up to 50 rows quickly.

Indexes B Tree Hash Composite Index Covering Index — Quick Test

Test your knowledge with 8 questions. Pass with 70% or higher.

8 questions70% to pass

Have questions about Indexes B Tree Hash Composite Index Covering Index?

AI Assistant

Ask questions about this tool