Why SQL matters for Business Analysts
SQL lets you pull reliable answers straight from source data. As a Business Analyst, youβll use SQL to explore customer behavior, validate KPIs, build datasets for dashboards, and provide stakeholders with trustworthy numbers. Mastering the basics unlocks faster analysis, cleaner reports, and more confidence in decisions.
Who this is for
- Aspiring or current Business Analysts who want hands-on data access
- Analysts moving from spreadsheets to databases
- PMs/Operations folks who need quick, accurate data pulls
Prerequisites
- Comfort with basic math and spreadsheets
- Familiarity with rows/columns and simple charts
- No prior coding required
Learning path
Milestone 1 β Read data (SELECT, WHERE, ORDER BY)
- Understand tables, schemas, and primary keys
- Practice selecting columns, filtering rows, sorting results
Milestone 2 β Summarize data (COUNT, SUM, AVG, GROUP BY, HAVING)
- Compute totals, averages, and rates
- Group results and filter aggregated groups
Milestone 3 β Combine data (JOINs, NULLs)
- Use INNER JOIN for matching records; LEFT JOIN to keep all from the left table
- Handle missing values with COALESCE and be mindful of NULL logic
Milestone 4 β Make results useful (date/text functions, readability, validation, export)
- Extract month/year, clean text, and standardize formats
- Write readable queries with aliases and consistent style
- Run sanity checks and export clean datasets for reporting
Worked examples (with mini datasets)
Assume two tables:
customers(id, name, segment)
orders(id, customer_id, order_date, amount)
-- Sample rows
-- customers: (1, 'Ava', 'SMB'), (2, 'Ben', 'Enterprise'), (3, 'Chen', 'SMB')
-- orders: (101,1,'2024-01-10',120.00), (102,2,'2024-01-15',300.00),
-- (103,2,'2024-02-01',NULL), (104,3,'2024-02-10',80.00)
1) Select and filter
-- All Enterprise customers
SELECT id, name
FROM customers
WHERE segment = 'Enterprise';
-- Orders in Jan 2024 only
SELECT id, customer_id, amount
FROM orders
WHERE order_date >= '2024-01-01' AND order_date < '2024-02-01';
Tip: Use >= start and < next month for safe month boundaries.
2) Sort results
-- Highest amounts first; tie-breaker by order_date ascending
SELECT id, customer_id, amount, order_date
FROM orders
ORDER BY amount DESC NULLS LAST, order_date ASC;
NULLS LAST (if supported) ensures blanks donβt appear at the top when sorting descending.
3) Aggregations and groups
-- Total and average order amount per customer
SELECT customer_id,
COUNT(*) AS order_count,
SUM(amount) AS total_amount,
AVG(amount) AS avg_amount
FROM orders
GROUP BY customer_id;
-- Only customers with total >= 200
SELECT customer_id,
SUM(amount) AS total_amount
FROM orders
GROUP BY customer_id
HAVING SUM(amount) >= 200;
HAVING filters after aggregation; WHERE filters rows before aggregation.
4) Join customers and orders
-- Orders with customer names (only matches)
SELECT o.id AS order_id, c.name, o.amount
FROM orders o
INNER JOIN customers c ON o.customer_id = c.id;
-- All customers, even those without orders
SELECT c.id, c.name, o.amount
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.id;
Use LEFT JOIN to include customers with no orders (amount will be NULL).
5) Handle NULLs
-- Replace missing order amounts with 0 for reporting totals
SELECT COALESCE(amount, 0) AS amount_clean
FROM orders;
-- Count non-null amounts vs. all rows
SELECT COUNT(amount) AS non_null_amounts,
COUNT(*) AS total_rows
FROM orders;
COUNT(column) ignores NULLs; COUNT(*) counts rows.
6) Date and text functions
-- Monthly orders
SELECT EXTRACT(YEAR FROM order_date) AS yr,
EXTRACT(MONTH FROM order_date) AS mo,
COUNT(*) AS orders
FROM orders
GROUP BY yr, mo
ORDER BY yr, mo;
-- Customers whose names start with 'A' (case-insensitive approach)
SELECT *
FROM customers
WHERE UPPER(name) LIKE 'A%';
Drills and quick exercises
- Return the top 5 highest order amounts with their customer names.
- List each segment and total revenue from that segment.
- Show customers with zero orders (one row per customer).
- Compute average order amount per month in 2024.
- Replace NULL amounts with 0 and recompute totals.
- Reformat a query with clear aliases and aligned keywords.
- Add a sanity check that total revenue equals the sum of segment revenues.
Common mistakes and debugging tips
- Confusing WHERE vs HAVING: Use WHERE for row filters, HAVING for group filters after aggregation.
- Dropping unmatched rows unintentionally: INNER JOIN removes non-matching rows; use LEFT JOIN to keep the left table complete.
- Sorting with NULLs: Decide whether NULLS FIRST/LAST is needed to avoid odd rankings.
- Miscounting NULLs: COUNT(column) ignores NULL; COUNT(*) counts all rows.
- Ambiguous columns after JOINs: Always qualify columns (e.g., c.id vs o.id) and use aliases.
- Date boundaries: Use half-open intervals (>= start and < next period) to avoid missing the last day.
- Unvalidated outputs: Cross-check totals two ways (detail vs summary) and spot-check a few raw rows.
Mini project: Monthly Sales Snapshot
Goal: Build a monthly sales table by customer segment with counts and revenue, ready for export.
- Create a base orders set for the last 12 months using a WHERE date range.
- JOIN customers to orders for segment and name.
- Compute metrics per segment and month: order_count, unique_customers, total_amount, avg_amount.
- Handle NULL amounts with COALESCE(amount, 0).
- Validate: sum of segment totals equals overall monthly total.
- Order results by year, month, then segment; alias columns clearly.
- Export to CSV/Excel from your SQL tool.
Starter query
WITH base AS (
SELECT o.*, c.segment
FROM orders o
INNER JOIN customers c ON c.id = o.customer_id
WHERE o.order_date >= DATE '2024-01-01' AND o.order_date < DATE '2025-01-01'
)
SELECT EXTRACT(YEAR FROM order_date) AS yr,
EXTRACT(MONTH FROM order_date) AS mo,
segment,
COUNT(*) AS order_count,
COUNT(DISTINCT customer_id) AS unique_customers,
SUM(COALESCE(amount, 0)) AS total_amount,
AVG(COALESCE(amount, 0)) AS avg_amount
FROM base
GROUP BY yr, mo, segment
ORDER BY yr, mo, segment;
Practical projects
- Churn signals: List customers with no orders in the last 90 days and their last order date.
- Product mix: Calculate revenue share by product category per quarter.
- Sales pipeline hygiene: Identify orders with NULL or zero amounts and flag them.
- Marketing impact: Compare average order amount before vs after a campaign window.
Subskills
- Understanding Tables And Schemas β Know how data is organized; identify keys, data types, and relationships.
- Selecting Data With Select β Choose specific columns, alias them, and limit results for inspection.
- Filtering With Where β Use comparisons, AND/OR, IN, BETWEEN, and date ranges.
- Sorting With Order By β Apply ascending/descending order and handle NULLS FIRST/LAST where available.
- Aggregations Count Sum Avg Min Max β Summarize metrics for quick KPIs.
- Group By And Having β Aggregate by dimensions; filter aggregated groups safely.
- Joining Tables Inner Join Left Join β Combine datasets and avoid unintended row loss.
- Working With Null Values β Understand NULL logic; use COALESCE and safe comparisons.
- Basic Date And Text Functions β Extract periods, clean strings, and standardize formats.
- Writing Readable Queries β Use aliases, consistent casing, and neat formatting.
- Validating Results With Sanity Checks β Reconcile totals and inspect sample rows.
- Exporting Results For Reporting β Prepare clean, labeled, and sorted datasets for delivery.
Next steps
- Repeat the mini project on a different dataset (e.g., marketing leads, support tickets).
- Practice writing the same metric three ways and ensure the numbers match.
- Move on to joins across more tables, window functions, and CTE patterns.