Why this matters
Efficient SQL keeps backends fast, cheap, and reliable. As a Backend Engineer, you will:
- Speed up slow endpoints by reducing rows scanned and avoiding full table scans.
- Build dashboards and reports that run in milliseconds, not minutes.
- Design indexes that match real queries.
- Implement pagination and filtering that scale to millions of rows.
- Eliminate N+1 query patterns that overload databases.
Real-world scenario
Your service lists recent orders with filters by status and date. The query is slow and timing out during peak traffic. You must rewrite it and add the right index without changing business behavior, and ship safely.
Who this is for
- Backend Engineers who write or review SQL in services, jobs, or reports.
- Developers moving from ORMs to hand-written SQL for performance work.
- Engineers preparing for system design or performance interviews.
Prerequisites
- Basic SQL: SELECT, WHERE, JOIN, GROUP BY, ORDER BY, LIMIT.
- Know your DB engine basics (e.g., PostgreSQL/MySQL/SQL Server). Syntax may vary; concepts transfer.
- Ability to read an execution plan at a high level (scan vs seek, join type).
Concept explained simply
Fast queries do less work. They touch fewer pages on disk, compare fewer rows, and return only what you need. The big levers:
- Sargable predicates: Write WHERE conditions that let the database use an index (no functions on indexed columns).
- Right indexes: Match your most selective filters and sort order. Consider covering indexes (include/select only needed columns).
- Join strategy: Join on indexed keys, filter early, select fewer columns.
- Pagination: Use keyset (seek) pagination instead of large OFFSETs.
- Aggregation: Pre-filter before GROUP BY; aggregate on smaller sets.
Mental model
Imagine your data as a big ordered phone book (B-Tree). You want to jump to the right page (index seek) instead of reading every page (full scan).
- Seek: WHERE last_name = 'Ng' uses the phone book index to jump directly.
- Scan: WHERE LOWER(last_name) = 'ng' forces reading all rows to compute LOWER().
- Covering index: The phone book already has the number you need; no extra lookup.
- Joins: Nested loop joins are fast when the inner side is indexed on the join key.
- Sorting: If data is already in the order you need (by index), ORDER BY is cheap.
Execution plan cues to watch
- Good signs: Index Seek, Index Only Scan, Rows=low, Filtered early.
- Bad signs: Seq/Full Table Scan, Filesort/Sort on large sets, Rows=very high, Function on column in WHERE.
Worked examples
Example 1 — Sargable filtering
Goal: Find orders created in the last 7 days by status.
Bad (non-sargable)
SELECT id, customer_id, status, created_at
FROM orders
WHERE DATE(created_at) >= CURRENT_DATE - INTERVAL '7 days'
AND status = 'paid';
Problem: DATE(created_at) forces a function on the column, blocking index usage.
Good (sargable)
SELECT id, customer_id, status, created_at
FROM orders
WHERE created_at >= NOW() - INTERVAL '7 days'
AND status = 'paid';
Index suggestion: (status, created_at) or (created_at, status) depending on cardinality and sort needs. If you often ORDER BY created_at DESC LIMIT 50, prefer (status, created_at DESC).
Why it works
The column is left as-is, enabling an index seek on created_at and status.
Example 2 — Join and covering index
Goal: List recent paid orders with customer name.
Potentially slow
SELECT o.id, c.name, o.total_cents
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.status = 'paid' AND o.created_at >= NOW() - INTERVAL '30 days'
ORDER BY o.created_at DESC
LIMIT 100;
Risk: If orders filters are not indexed, DB scans many rows, then sorts.
Improved
-- Narrow columns; add a supporting index
-- Suggested composite index on orders: (status, created_at DESC, customer_id) INCLUDE (total_cents)
SELECT o.id, c.name, o.total_cents, o.created_at
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.status = 'paid' AND o.created_at >= NOW() - INTERVAL '30 days'
ORDER BY o.created_at DESC
LIMIT 100;
Why: The index matches WHERE and ORDER BY, and may cover total_cents to avoid extra lookups. Join uses c.id which is typically primary-key indexed.
Example 3 — Fast pagination (keyset seek)
Goal: Page through the latest paid orders.
Slow for deep pages
SELECT id, created_at
FROM orders
WHERE status = 'paid'
ORDER BY created_at DESC, id DESC
OFFSET 50000 LIMIT 50;
Problem: The database still scans/steps over 50k rows to reach page 1001.
Fast keyset pagination
-- For the next page, pass the last seen (created_at, id)
SELECT id, created_at
FROM orders
WHERE status = 'paid'
AND (created_at, id) < (:last_created_at, :last_id)
ORDER BY created_at DESC, id DESC
LIMIT 50;
Index: (status, created_at DESC, id DESC).
Example 4 — Aggregation after pre-filtering
Goal: Daily revenue last 30 days for paid orders.
Efficient aggregation
SELECT DATE(created_at) AS day, SUM(total_cents) AS revenue
FROM orders
WHERE status = 'paid' AND created_at >= NOW() - INTERVAL '30 days'
GROUP BY DATE(created_at)
ORDER BY day;
Notes: The WHERE clause is sargable; GROUP BY happens on a much smaller set. If DATE() prevents index use for GROUP BY, that is fine; the seek happens in WHERE first. For very high volume, consider a materialized view or summary table updated incrementally.
Practical projects
- Optimize a slow listing endpoint: rewrite the query to be sargable, add a fitting index, measure before/after with EXPLAIN/EXPLAIN ANALYZE.
- Implement keyset pagination for a feed: replace OFFSET with seek using (timestamp, id).
- Create a covering index for your top 3 queries: ensure SELECTed columns are covered to avoid lookups.
- Build a monthly reporting query: pre-filter early and verify the execution plan avoids large sorts.
Exercises
Do these tasks, then compare with the solutions below. Track your progress with the checklist.
- [ ] Exercise 1: Make the date filter sargable.
- [ ] Exercise 2: Choose the best composite index and adjust the query.
- [ ] Exercise 3: Replace OFFSET pagination with keyset pagination.
Exercise 1 — Make the date filter sargable (ex1)
You have a products table: id, name, price_cents, updated_at. Rewrite this query to be sargable and propose an index:
SELECT id, name
FROM products
WHERE DATE(updated_at) = CURRENT_DATE - INTERVAL '1 day';
Exercise 2 — Composite index choice (ex2)
For orders(status, customer_id, created_at, total_cents), you often run:
SELECT id, customer_id, total_cents
FROM orders
WHERE status = 'paid' AND customer_id = :cid
ORDER BY created_at DESC
LIMIT 20;
Pick an index and adjust the query if needed to maximize index usage and avoid sorts.
Exercise 3 — Keyset pagination (ex3)
Current query:
SELECT id, title, created_at
FROM posts
ORDER BY created_at DESC
OFFSET :page * 20 LIMIT 20;
Rewrite it to use keyset pagination. Assume you receive last_created_at and last_id from the previous page.
Common mistakes and how to self-check
- Using functions on indexed columns in WHERE: move constants to the right side and compare raw columns.
- Selecting too many columns: only select what you need; consider covering indexes.
- Filtering after joining: filter early in the driving table.
- Relying on OFFSET for deep pages: switch to keyset pagination.
- Wrong index order: put the most selective and equality conditions first, then range, then order-by columns.
- Ignoring execution plans: always check if you get an index seek and reasonable row estimates.
Self-check steps
- Run EXPLAIN/EXPLAIN ANALYZE: confirm seeks over scans, and low actual rows.
- Time before/after (3–5 runs) and compare medians.
- Check that the index supports WHERE and ORDER BY together.
- Validate identical results count and correctness.
Mini challenge
You need the latest 100 shipped orders for a given customer, newest first, with total and created_at. Design the query and one composite index. Hint: equality columns first, then sort columns.
One possible approach
-- Index: (customer_id, status, created_at DESC, id DESC) INCLUDE (total_cents)
SELECT id, total_cents, created_at
FROM orders
WHERE customer_id = :cid AND status = 'shipped'
ORDER BY created_at DESC, id DESC
LIMIT 100;
Learning path
- Write sargable WHERE clauses and confirm via plans.
- Design composite and covering indexes to match queries.
- Master joins: join order, join keys, and filtering.
- Implement keyset pagination.
- Optimize aggregations and reporting queries.
- Measure, iterate, and document before/after impact.
Next steps
- Collect your top 5 slow queries and apply the techniques here.
- Create migration scripts to add the most impactful indexes.
- Share a short perf report with timings and plan changes.
Quick Test and progress saving
The quick test is available to everyone. If you are logged in, your progress will be saved automatically.