luvv to helpDiscover the Best Free Online Tools
Topic 3 of 8

Null Handling

Learn Null Handling for free with explanations, exercises, and a quick test (for Data Visualization Engineer).

Published: December 28, 2025 | Updated: December 28, 2025

Why this matters

As a Data Visualization Engineer, your charts must be accurate and readable. NULLs (missing/unknown values) can break totals, distort rates (division by zero), hide categories, and produce confusing labels like blank strings in legends. Knowing how to detect, fill, and safely compute with NULLs keeps dashboards trustworthy.

  • Remove blank labels by replacing NULLs with friendly text.
  • Prevent division errors when a denominator is NULL or zero.
  • Find unmapped items after joins to fix data quality issues.
  • Compute correct counts and averages when data is incomplete.

Who this is for

  • Data Visualization Engineers building BI dashboards and reports.
  • Analysts preparing datasets for Looker, Tableau, Power BI, or custom charts.
  • Analytics Engineers ensuring semantic layer correctness.

Prerequisites

  • Basic SELECT, WHERE, GROUP BY, ORDER BY.
  • JOIN basics (especially LEFT JOIN).
  • Simple CASE expressions.

Concept explained simply

NULL means "unknown" or "not applicable". It is not zero, not an empty string, and not false. Comparisons with NULL return unknown, not true or false. This affects filtering and calculations.

Mental model

  • Think of NULL as a covered box: you don’t know what’s inside. Any comparison with a covered box stays unknown.
  • Aggregates like SUM/AVG skip covered boxes. COUNT(col) counts only opened boxes; COUNT(*) counts all rows.
  • To open the box or provide a fallback, use COALESCE.
Key facts in one glance
  • Check NULL: IS NULL, IS NOT NULL
  • Fallback: COALESCE(x, y, z)
  • Avoid divide-by-zero: denominator -> NULLIF(den, 0)
  • Anti-join check after LEFT JOIN: WHERE right.id IS NULL
  • Filters with NULLs: status <> 'active' won’t match NULL; add OR status IS NULL if needed

Core patterns and syntax

Detect NULLs

-- Correct
WHERE city IS NULL
WHERE city IS NOT NULL

-- Incorrect: this never matches NULL
WHERE city = NULL

Fill or label NULLs

SELECT COALESCE(city, 'Unknown') AS city_label FROM customers;
-- COALESCE returns the first non-NULL argument

Safe math (avoid errors)

-- Safe rates: if denominator is 0, make it NULL (rate becomes NULL)
SELECT revenue / NULLIF(orders, 0) AS avg_order_value FROM daily_stats;

Counts and averages

-- COUNT(*) counts rows (including NULL values in columns)
-- COUNT(col) counts non-NULL values only
-- SUM/AVG ignore NULL inputs

Conditional logic

CASE WHEN status IS NULL THEN 'Unknown' ELSE status END AS status_label

Joins and missing mappings

-- Find products without a category mapping
SELECT p.id
FROM products p
LEFT JOIN product_categories pc ON p.id = pc.product_id
WHERE pc.product_id IS NULL;

Worked examples

Example 1: Friendly labels for charts

Goal: show a market share donut by city, labeling missing city as "Unknown".

SELECT COALESCE(city, 'Unknown') AS city_label, COUNT(*) AS users
FROM users
GROUP BY COALESCE(city, 'Unknown')
ORDER BY users DESC;

Why it works: COALESCE groups NULLs under a readable label, avoiding empty legend entries.

Example 2: Safe conversion rate

Goal: conversion_rate = conversions / visits, but visits can be zero or NULL.

SELECT date,
       conversions,
       visits,
       conversions / NULLIF(visits, 0) AS conversion_rate
FROM daily_web;

Why it works: NULLIF(visits, 0) returns NULL when visits = 0, preventing divide-by-zero. The rate becomes NULL for those days.

Example 3: Find unmapped items after a LEFT JOIN

Goal: list products that lack a category mapping.

SELECT p.product_id, p.product_name
FROM products p
LEFT JOIN product_category_map m ON p.product_id = m.product_id
WHERE m.category_id IS NULL;

Why it works: Unmatched right-side rows yield NULLs; filtering on IS NULL isolates missing mappings.

Example 4: COUNT differences

Goal: count emails vs. total users.

SELECT COUNT(*) AS total_users, COUNT(email) AS users_with_email
FROM users;

Why it works: COUNT(email) excludes NULL emails, while COUNT(*) includes all rows.

Exercises

Try these hands-on tasks. Then verify with the solutions. A checklist is included to self-review.

  1. ex1 — Label NULLs for dimension clarity: In customers(id, name, city), return id, name, and city_label where city_label = COALESCE(city, 'Unknown').
  2. ex2 — Safe rate computation: In sales_by_day(d, total_revenue, total_orders), compute avg_order_value = total_revenue / NULLIF(total_orders, 0).
  3. ex3 — Missing mapping audit: Using products(p_id, brand) and product_categories(p_id, category_id), list p_id, brand where no category exists, and return a total count.
Checklist: Did you cover these?
  • Used IS NULL / IS NOT NULL instead of = NULL.
  • Used COALESCE to create readable labels.
  • Protected divisions with NULLIF.
  • Used LEFT JOIN + WHERE right.col IS NULL for unmapped records.
  • Verified COUNT(col) vs COUNT(*).

Common mistakes and self-check

  • Using = NULL instead of IS NULL. Self-check: Search your query for "= NULL" — replace it.
  • Forgetting to handle zero denominators. Self-check: Every division has a NULLIF? If not, add it.
  • Empty strings vs NULL. Self-check: If you see '' values, consider NULLIF(field, '') before COALESCE. Note: Behavior can vary by database; treat this as a design choice in your pipeline.
  • Incorrect anti-join filter. Self-check: After LEFT JOIN, your WHERE must check the right-side column IS NULL.
  • Unexpected filters with NULL. Self-check: If status <> 'active' is used, add OR status IS NULL if you intend to include unknowns.

Mini challenge

Dataset:

support_tickets(id, created_at, resolved_at, channel, agent_id)
-- resolved_at can be NULL if not resolved
  • Task A: resolution_rate per channel = resolved_tickets / total_tickets (safe division).
  • Task B: bucket resolution_time_days into 'Under 1 day', '1-3 days', '3+ days', 'Unresolved'.
  • Task C: list channels with zero resolved tickets.
Hints
  • Resolved flag: CASE WHEN resolved_at IS NOT NULL THEN 1 END, or COUNT(resolved_at)
  • Rate: SUM(resolved_flag) / NULLIF(COUNT(*), 0)
  • Buckets: CASE on DATE_DIFF/EXTRACT function appropriate to your SQL dialect

Practical projects

  • Build a "Data Quality Watch" dashboard showing counts of NULLs across key dimensions and trends over time.
  • Create a KPI dataset where all rates are computed with NULLIF protection and all labels use COALESCE; feed it to a chart tool.
  • Design a join-audit report listing unmapped products or users after dimension joins, with counts by source system.

Learning path

  • Before: SELECT and WHERE basics; JOIN fundamentals.
  • Now: NULL handling patterns (this lesson).
  • Next: Window functions with NULLs (e.g., COALESCE in partitions), date handling, and data type casting safety.

Next steps

Practice the exercises, then take the Quick Test below to confirm understanding. The test is available to everyone; only logged-in users get saved progress.

Practice Exercises

3 exercises to complete

Instructions

Table: customers(id, name, city)

Return: id, name, city_label where city_label = COALESCE(city, 'Unknown'). Group optional: also show count by city_label.

Example input
id | name   | city
---+--------+-----------
1  | Alex   | NULL
2  | Priya  | Berlin
3  | Omar   | NULL
4  | Sofia  | Madrid

Expected: rows for Alex and Omar have city_label = 'Unknown'.

Expected Output
Columns: id, name, city_label (city_label uses 'Unknown' when city IS NULL). Optional aggregate: city_label, users_count.

Null Handling — Quick Test

Test your knowledge with 6 questions. Pass with 70% or higher.

6 questions70% to pass

Have questions about Null Handling?

AI Assistant

Ask questions about this tool