How to learn dbt Generic Tests Not Null Unique Relationships for Testing Data Pipelines in Analytics Engineer for free

Why this matters

As an Analytics Engineer, you are responsible for reliable, trustworthy datasets. dbt’s built-in generic tests—not_null, unique, and relationships—are the fastest way to enforce key data constraints directly in your transformation code. Real tasks you’ll do include:

Preventing null IDs and broken joins before dashboards break.
Guaranteeing primary key uniqueness in dimension models.
Enforcing foreign key integrity between fact and dimension tables.
Catching upstream data quality issues early in CI.

Quick test is available to everyone; log in to save your progress.

Concept explained simply

not_null: A column must have a value in every row.
unique: No two rows share the same value in this column.
relationships: Every value in a foreign key column exists in the referenced primary key column.

Mental model

Think of these three tests as locks on a door:

not_null: the key exists.
unique: the key is one-of-a-kind.
relationships: the key actually opens the lock (matches a real parent record).

Core syntax and patterns

Define tests in your YAML (often schema.yml) alongside models:

version: 2

models:
  - name: customers
    columns:
      - name: customer_id
        tests:
          - not_null
          - unique

  - name: orders
    columns:
      - name: customer_id
        tests:
          - relationships:
              to: ref('customers')
              field: customer_id

Useful configuration

# Filter rows under test (great for soft-deletes or active-only)
- not_null:
    where: "is_active = true"

# Downgrade to warnings during rollout (still visible in CI)
- unique:
    severity: warn

# Store failing rows for easy debugging
- relationships:
    to: ref('customers')
    field: customer_id
    store_failures: true

Notes:

unique does not enforce not-null. Add not_null as a separate test.
Place relationships on the child model’s foreign key column.
Failed-row tables are created in your target schema (prefixed with dbt_test__) when store_failures: true is set.

Worked examples

Example 1 — Primary key constraints with not_null + unique

version: 2

models:
  - name: stg_customers
    columns:
      - name: customer_id
        description: Primary key for customers
        tests:
          - not_null
          - unique

Why: If customer_id is null or duplicated, downstream facts and dimensions become unreliable.

Example 2 — Foreign key integrity with relationships

version: 2

models:
  - name: fct_orders
    columns:
      - name: customer_id
        description: FK to customers
        tests:
          - relationships:
              to: ref('dim_customers')
              field: customer_id

Why: Ensures every order references a real customer in dim_customers.

Example 3 — Targeted testing with where + severity

version: 2

models:
  - name: dim_customers
    columns:
      - name: email
        tests:
          - not_null:
              where: "is_active = true"   # only active customers must have email
              severity: warn              # start as a warning in rollout

Why: You can enforce stricter rules for a subset of rows, then tighten later by switching to error.

Tip: Composite uniqueness

If your natural key spans multiple columns (e.g., shop_id + sku), built-in unique works per single column. For multi-column uniqueness, use a surrogate key column (e.g., a hashed key) and test unique on that single column.

How to run and debug

Run all tests: dbt test
Run tests for a model: dbt test -s dim_customers
Run only schema tests: dbt test --select test_type:schema

Debugging tips:

Set store_failures: true to write failing rows to your warehouse.
Use where: to reproduce issues for a specific slice (e.g., order_date >= current_date - 7).
Review the compiled SQL in target/ or your warehouse query history to learn how dbt built the test query.

Common mistakes and self-check

Mistake: Relying on unique to prevent nulls. Self-check: Pair unique with not_null on the same column.
Mistake: Putting relationships on the parent table. Self-check: It belongs on the child’s FK column.
Mistake: Testing inactive/soft-deleted rows unintentionally. Self-check: Add where: to match business rules.
Mistake: Missing keys in staging models. Self-check: Add not_null on IDs early (in staging) to catch upstream issues.
Mistake: No failing-row capture. Self-check: Use store_failures: true while investigating.

Practical projects

Customer 360: Enforce primary key and FK tests across dim_customers and fct_orders; roll out gradually with severity: warn.
Subscription health: Ensure unique subscription_id and relationships from fct_invoices.subscription_id -> dim_subscriptions.subscription_id.
Marketing attribution: Require not_null campaign_id on fact tables and relationships to dim_campaigns; use where to exclude test campaigns.

Exercises

Complete these in a sample dbt project. They mirror the exercises below.

Exercise 1 — Core constraints for customers and orders

Add not_null and unique to customers.customer_id.
Add relationships to orders.customer_id referencing customers.customer_id.
Run dbt test and inspect failures (enable store_failures: true if needed).

Exercise 2 — Targeted tests with where + severity

For customers.email, enforce not_null only where is_active = true.
Set severity: warn for the first run.
Flip to severity: error after reviewing failures.

Self-check checklist

Did you apply not_null and unique to the same key column?
Is relationships defined on the child model’s FK column?
Do your where: conditions reflect your business rules?
Can you locate failing rows (either in logs or stored tables)?

Mini challenge

You notice occasional orders with customer_id = null and some orders pointing to deleted customers. Propose and implement tests to prevent both issues, then make sure new PRs can’t be merged when these issues reappear. As a stretch goal, apply a where: so only orders in the last 30 days block merges.

Who this is for

Analytics Engineers and BI Developers who maintain modeled layers for analytics.
Data Analysts who want reliable tables for dashboards and experiments.

Prerequisites

Basic SQL (SELECT, JOIN, WHERE, GROUP BY).
Intro-level familiarity with dbt project structure (models, YAML).

Learning path

Understand generic tests: not_null, unique, relationships.
Add tests to staging and core models, then run dbt test.
Refine with where, severity, and store_failures.
Automate in CI and enforce for new PRs.

Next steps

Broaden coverage to all primary and foreign keys.
Create surrogate keys for composite uniqueness and test them.
Adopt a rollout plan: start with warnings, then move to enforced errors.

Menu

dbt Generic Tests Not Null Unique Relationships

Table of Contents