Why this matters
Macros and Jinja power most of dbt’s flexibility. As an Analytics Engineer, you will:
- Remove repetition by generating SQL (e.g., dynamic column lists, reusable cleaning expressions).
- Build safer, parameterized models (e.g., environment-aware filters and switches).
- Create consistent logic across hundreds of models with one macro change.
- Speed up incremental pipelines with context-aware conditions.
Concept explained simply
dbt uses Jinja, a templating language, to turn templates into final SQL. You write Jinja + SQL; dbt renders it into pure SQL, then your warehouse runs that SQL.
Mental model
- Think of Jinja as a pre-processor. It fills in blanks, loops, and conditions before SQL runs.
- Macros are like functions that return SQL snippets.
- Rendering happens at run/compile time; the warehouse only sees rendered SQL.
Key building blocks you’ll use
- Output value:
{{ ... }}(e.g.,{{ var('schema_name', 'public') }}) - Logic/block:
{% ... %}(e.g.,{% if is_incremental() %} ... {% endif %}) - Comment:
{# ... #} - Variables:
{% set cols = ['id','email'] %} - If/else:
{% if var('use_limit', false) %} limit 100 {% else %} -- no limit {% endif %} - Loops:
{% for c in ['id','email','created_at'] %} {{ ', ' if not loop.first }}{{ c }} {% endfor %} - Define a macro:
{% macro clean_email(col) %} lower(trim({{ col }})) {% endmacro %} - Call a macro:
{{ clean_email('email') }}
Useful dbt context helpers
ref('model_name')andsource('src','table')— reference models/sourcesvar('name', default)— project variablesthis— the relation for the current modelis_incremental()— true when building an incremental model
Tip: Use run_query only inside macros during execution, not in code that runs during parsing.
Worked examples
Example 1 — Reusable cleaner for emails
{% macro clean_email(col) %}
lower(trim({{ col }}))
{% endmacro %}
-- Use in a model
select
{{ clean_email('email') }} as email_cleaned,
id
from {{ ref('stg_users') }}
Result: Every model that needs email cleaning calls the same macro—consistency and easy maintenance.
Example 2 — Generate CASE counts from a list
{% set statuses = ['new','active','churned'] %}
select
user_id,
{% for s in statuses %}
sum(case when status = '{{ s }}' then 1 else 0 end) as status_{{ s }}{{ ',' if not loop.last }}
{% endfor %}
from {{ ref('user_status_events') }}
group by 1
Result: Add/remove a status in one place without rewriting SQL.
Example 3 — Incremental filter
select *
from {{ ref('stg_orders') }}
{% if is_incremental() %}
where updated_at >= (select max(updated_at) from {{ this }})
{% endif %}
Result: On full refresh there’s no filter; on incremental runs, only new/changed rows are processed.
How to write and use a macro (quick steps)
- Create a file under
macros/(e.g.,macros/cleaning.sql). - Define the macro:
{% macro normalize_name(col, fill='unknown') %} coalesce(nullif(trim({{ col }}), ''), '{{ fill }}') {% endmacro %} - Call it in any model:
select {{ normalize_name('customer_name') }} as customer_name from {{ ref('stg_customers') }} - Parameterize behavior with
var:{% set fallback = var('name_fallback', 'unknown') %} select {{ normalize_name('customer_name', fallback) }} as customer_name from {{ ref('stg_customers') }} - Compile or run to see rendered SQL in your target/compiled files.
Tips that prevent headaches
- Return plain SQL from macros. The warehouse must be able to run the final string.
- Quote string literals inside macros (e.g.,
'unknown'), not identifiers. - Avoid heavy logic in templates. Keep macros small and composable.
Exercises
Do these hands-on tasks. You can check solutions below each task. The quick test is available to everyone; log in to save your progress.
Exercise 1 — Build a slugify macro
Goal: Create a macro that turns a name column into a slug: lowercase, trimmed, spaces to underscores, remove dashes.
- File:
macros/text.sql - Macro name:
slugify_name(col) - Use in a model to produce
name_slug.
Expected result
The macro should render a SQL expression equivalent to:
lower(replace(replace(trim(name), ' ', '_'), '-', ''))
Show solution
{% macro slugify_name(col) %}
lower(
replace(
replace(trim({{ col }}), ' ', '_'),
'-', ''
)
)
{% endmacro %}
-- Usage
select {{ slugify_name('name') }} as name_slug from {{ ref('stg_people') }}
Exercise 2 — Dynamic status counters
Goal: Using a Jinja loop, generate columns for statuses new, active, churned as separate counters in the SELECT list.
Expected result
Rendered SQL should include three columns like:
sum(case when status = 'new' then 1 else 0 end) as status_new,
sum(case when status = 'active' then 1 else 0 end) as status_active,
sum(case when status = 'churned' then 1 else 0 end) as status_churned
Hints
- Define a list:
{% set statuses = [...] %} - Use
{{ ',' if not loop.last }}to separate columns.
Show solution
{% set statuses = ['new','active','churned'] %}
select
account_id,
{% for s in statuses %}
sum(case when status = '{{ s }}' then 1 else 0 end) as status_{{ s }}{{ ',' if not loop.last }}
{% endfor %}
from {{ ref('account_status_events') }}
group by 1
Exercise 3 — Add incremental safety
Goal: In a model that reads stg_payments, add an incremental filter on updated_at using is_incremental().
Expected result
On incremental runs, compiled SQL includes:
where updated_at >= (select max(updated_at) from <this model>)
On full refresh, no such filter is present.
Show solution
select *
from {{ ref('stg_payments') }}
{% if is_incremental() %}
where updated_at >= (select max(updated_at) from {{ this }})
{% endif %}
Self-check checklist
- I can explain the difference between
{{ }}and{% %}. - I can define and call a macro with parameters.
- I used a loop to generate repeated SQL without copy-paste.
- I guarded logic with
is_incremental()where appropriate. - My macros return valid SQL strings.
Common mistakes and how to self-check
- Mixing identifiers and strings. Fix: Use quotes for string literals. Keep column/table names unquoted unless your warehouse requires quoting identifiers.
- Trailing commas in loops. Fix: Use
{{ ',' if not loop.last }}in SELECT lists. - Heavy logic inside templates. Fix: Keep macros focused; prefer simple, composable helpers.
- Using
run_queryduring parsing. Fix: Only call it inside macros that execute during a run (not in configs that parse early). - Not testing macros in isolation. Fix: Create a tiny throwaway model to render and inspect output SQL.
Practical projects
- Staging column standardizer: a macro that outputs a cleaned list of columns (trim strings, lower emails, nullify blanks) for any staging table.
- Status metrics generator: produce N status counters from a config variable list, used across 3+ models.
- Incremental framework: a small pattern macro that injects incremental WHERE clauses based on timestamp or surrogate key.
Who this is for
- Analytics Engineers building repeatable transformations.
- Data Analysts who want DRY, consistent SQL.
- Engineers maintaining many similar models across domains.
Prerequisites
- Comfortable writing basic SQL (SELECT, CASE, WHERE, GROUP BY).
- Basic dbt model structure (models, refs, sources).
Learning path
- Start with Jinja basics in dbt (this lesson).
- Use macros in one staging model, then expand to shared patterns.
- Add environment variables and project variables for configurability.
- Introduce incremental logic where it measurably reduces cost/time.
Next steps
- Extract any repeated SQL into a macro today.
- Parameterize it with a variable list for flexibility.
- Run a small backfill and verify compiled SQL looks as expected.
Mini challenge
Build a macro safe_divide(numer, denom, zero_value=0) that renders case when denom = 0 or denom is null then zero_value else numer / denom end. Use it to create a conversion_rate metric in a model.
Progress & test info
The Quick Test below is available to everyone. If you log in, your progress and scores are saved to your account.
Quick Test
Ready? Take the quick test below to check your understanding.