Why this matters
Analysts constantly combine tables: customers with orders, ads with clicks, products with inventory. Choosing the right JOIN controls which rows appear and how NULLs emerge. This affects KPIs, conversion rates, churn counts, and revenue totals. Use the correct join to avoid overcounting or losing critical data.
- Report revenue by customer including those with zero orders (LEFT JOIN).
- Show only pairs of matching records for quality checks (INNER JOIN).
- Audit mismatches between systems (FULL OUTER JOIN).
- Create option matrices (CROSS JOIN) for planning or testing.
Concept explained simply
Think of two lists of people: one with "who registered" and one with "who purchased". A JOIN answers: who appears in which list, and how do we align them?
SELECT ...
FROM A
[INNER|LEFT|RIGHT|FULL] JOIN B
ON A.key = B.key;
-- CROSS JOIN has no ON: SELECT ... FROM A CROSS JOIN B;
Quick definitions
- INNER JOIN: only rows where keys match in both tables.
- LEFT JOIN: all rows from the left table, matching from right; unmatched right becomes NULL.
- RIGHT JOIN: symmetric of LEFT; keep all from right.
- FULL OUTER JOIN: all rows from both sides; unmatched pieces become NULL.
- CROSS JOIN: every row of A combined with every row of B (Cartesian product).
Mental model
- INNER = intersection.
- LEFT = A plus matches; unmatched A rows carry NULLs from B.
- RIGHT = B plus matches; unmatched B rows carry NULLs from A.
- FULL = union of left-only, right-only, and matches; use COALESCE to pick non-NULL keys.
- CROSS = grid (rowsA × rowsB) — no ON condition.
Sample data used in examples
-- Customers
-- id | name | country
-- 1 | Alice | US
-- 2 | Bob | UK
-- 3 | Chen | CN
-- 4 | Dana | US
-- Orders
-- id | customer_id | amount
-- 101 | 1 | 120
-- 102 | 1 | 80
-- 103 | 3 | 50
Worked examples
1) INNER JOIN: customers who placed orders
SELECT c.name, o.id AS order_id, o.amount
FROM customers c
INNER JOIN orders o ON o.customer_id = c.id
ORDER BY c.name, o.id;
Output:
name | order_id | amount
Alice | 101 | 120
Alice | 102 | 80
Chen | 103 | 50
Note: Bob and Dana are excluded because they have no orders.
2) LEFT JOIN: include customers without orders
SELECT c.name, COUNT(o.id) AS order_count
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.id
GROUP BY c.name
ORDER BY c.name;
Output:
name | order_count
Alice | 2
Bob | 0
Chen | 1
Dana | 0
Tip: GROUP BY with LEFT JOIN commonly yields zeros for no-match groups.
3) FULL OUTER JOIN: audit mismatches across systems
SELECT COALESCE(c.id, o.customer_id) AS cust_key,
c.name,
o.id AS order_id
FROM customers c
FULL OUTER JOIN orders o
ON o.customer_id = c.id
ORDER BY cust_key, order_id;
Output:
cust_key | name | order_id
1 | Alice | 101
1 | Alice | 102
2 | Bob | NULL -- customer with no orders
3 | Chen | 103
4 | Dana | NULL -- customer with no orders
If a system had orders for a non-existent customer, you'd see NULL on the customer columns and a real order_id.
4) RIGHT JOIN: same as left join but keeping right
SELECT o.id AS order_id, c.name
FROM customers c
RIGHT JOIN orders o ON o.customer_id = c.id
ORDER BY order_id;
Practical tip: prefer LEFT JOIN by flipping table order for readability.
5) CROSS JOIN: create a combinations grid
-- Suppose sizes and colors
-- sizes(s) = ['S','M','L']
-- colors(c) = ['Red','Black']
SELECT s.size, c.color
FROM sizes s
CROSS JOIN colors c
ORDER BY s.size, c.color;
Output has 3 Ă— 2 = 6 rows.
When INNER and LEFT diverge
They differ whenever the right table has gaps. If every left row matches at least once, INNER and LEFT produce the same count. Verify by checking COUNT(*) vs COUNT(right.id) in a LEFT JOIN.
Exercises
Use the sample data provided above.
Exercise 1 (ex1): Compare INNER JOIN vs LEFT JOIN totals
Task: Write two queries:
- INNER JOIN customers and orders, returning name and order_count per customer.
- LEFT JOIN customers and orders, returning name and order_count per customer, including zero-order customers.
Record both result sets. What differences do you see and why?
- [ ] Query 1 uses INNER JOIN.
- [ ] Query 2 uses LEFT JOIN.
- [ ] Both queries GROUP BY name.
- [ ] You can explain the missing rows in INNER JOIN.
Exercise 2 (ex2): List customers with no orders
Task: Return customers who have never placed an order using LEFT JOIN and a NULL filter.
- [ ] Uses LEFT JOIN.
- [ ] Filters in WHERE with o.id IS NULL.
- [ ] Returns Bob and Dana only.
Need a hint?
- ex1: COUNT(o.id) counts only matched rows when o.id is NULL for non-matches.
- ex2: Left-side preservation + WHERE right.id IS NULL isolates non-matches.
Common mistakes and self-check
- Mistake: Using WHERE after a LEFT JOIN that filters on right table columns without IS NULL handling, unintentionally turning it into an INNER JOIN. Self-check: Compare row counts before and after the filter.
- Mistake: Duplicates due to one-to-many or many-to-many joins. Self-check: Inspect counts before/after join; use GROUP BY or pre-aggregate.
- Mistake: Joining on non-keys (e.g., names). Self-check: Ensure join keys are unique or appropriately deduplicated.
- Mistake: CROSS JOIN by accident (missing ON). Self-check: Always verify you wrote ON for non-cross joins.
- Mistake: Using RIGHT JOIN inconsistently. Self-check: Prefer LEFT JOIN and reorder tables for clarity.
Self-audit checklist
- Did I intentionally pick INNER/LEFT/RIGHT/FULL based on the question?
- Did I validate row counts and NULLs?
- Are join keys correct and indexed in production?
- Did I handle duplicates (GROUP BY, DISTINCT, or pre-aggregation) where needed?
Mini challenge
Write a single query that returns each customer and their total revenue, including zero for those without orders. Sort by total revenue desc, then name. Bonus: include a column has_orders (YES/NO).
Show a possible approach (no full solution)
SELECT c.name,
COALESCE(SUM(o.amount), 0) AS total_revenue,
CASE WHEN COUNT(o.id) > 0 THEN 'YES' ELSE 'NO' END AS has_orders
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.id
GROUP BY c.name
ORDER BY total_revenue DESC, c.name;
Who this is for
- Aspiring and current Data Analysts who join tables daily.
- Anyone building metrics from multiple sources.
Prerequisites
- Basic SELECT, WHERE, GROUP BY, ORDER BY.
- Familiarity with primary/foreign keys.
Learning path
- Start: Relational concepts and keys.
- Then: INNER and LEFT JOIN fundamentals.
- Next: FULL OUTER and RIGHT JOIN nuances.
- Finally: CROSS JOIN and safe use cases.
- Practice: Aggregations with joins, handling duplicates, NULL logic.
Practical projects
- Customer funnel: join signups, activations, and purchases; report conversion by country.
- Revenue completeness audit: FULL JOIN billing vs. orders; list mismatches.
- Size-color inventory matrix with CROSS JOIN for planning.
Next steps
- Master ON vs WHERE interactions with outer joins.
- Learn window functions for post-join analysis.
- Create reusable views for common joins.
Test & progress
Take the quick test below to check your understanding. Available to everyone. Logged-in learners get saved progress automatically.