Why dbt Concepts matter for Analytics Engineers
dbt is the backbone of modern analytics engineering. It lets you version, test, document, and deploy SQL transformations as code. Mastering its core concepts lets you standardize warehouse modeling, enforce data quality, trace lineage all the way to BI, and ship reliable analytics faster.
- Turn raw data into trusted, modeled datasets.
- Automate tests and documentation so teams trust dashboards.
- Use environments and deployments to promote changes safely.
Who this is for
- Analytics Engineers building warehouse models for BI and data apps.
- Data/BI Analysts moving from ad-hoc SQL to governed pipelines.
- Data Engineers who want standardized SQL transformations and lineage.
Prerequisites
- Comfortable with SQL (joins, window functions, CTEs).
- Basic Git workflows (branch, commit, merge).
- Access to a SQL data warehouse (e.g., Postgres, Snowflake, BigQuery, Redshift) and the dbt CLI.
Quick environment checklist
- dbt installed and profiles.yml configured for your warehouse (see the sketch after this list).
- A project scaffolded with dbt init and a passing connection check (dbt debug).
- At least one raw table available to model.
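New to profiles? Below is a minimal profiles.yml sketch for a Postgres warehouse with dev and prod targets; the project name, hosts, and credentials are placeholders to adapt, and the password is read from an environment variable.
profiles.yml
my_dbt_project:
  target: dev
  outputs:
    dev:
      type: postgres
      host: localhost
      port: 5432
      user: analytics
      password: "{{ env_var('DBT_PASSWORD') }}"
      dbname: analytics
      schema: dbt_dev
      threads: 4
    prod:
      type: postgres
      host: prod-db.example.com
      port: 5432
      user: analytics
      password: "{{ env_var('DBT_PASSWORD') }}"
      dbname: analytics
      schema: analytics
      threads: 8
Verify each target with dbt debug --target dev and dbt debug --target prod.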
Learning path (practical roadmap)
- Model layers – organize models into staging, intermediate, and marts. Establish naming conventions and folder structure.
- Sources and freshness – declare sources in YAML, reference them with source(), and configure freshness checks.
- Materializations – choose view or table; implement incremental for large facts.
- Tests and packages – add generic tests in YAML; use a common utility package (see the packages sketch after this list).
- Seeds and snapshots – load small CSVs as tables; capture version changes with snapshots.
- Macros and Jinja – DRY up SQL with macros; use conditionals and loops safely.
- Refs and dependency graph – link models with ref(), run by selection, explore the DAG.
- Documentation – write model and column descriptions; generate the docs site.
- Exposures – register dashboards/reports for lineage to BI.
- Environments and deployments – use targets (dev/stage/prod), run/build, and CI-minded checks.
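For the tests-and-packages step, a minimal sketch using dbt_utils; the version range is an assumption, so pin whatever matches your dbt version.
packages.yml
packages:
  - package: dbt-labs/dbt_utils
    version: [">=1.0.0", "<2.0.0"]
Install with dbt deps, then use a package-provided generic test in your schema YAML, for example:
models/schema.yml (excerpt)
      - name: total_amount
        tests:
          - dbt_utils.expression_is_true:
              expression: ">= 0"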
Milestone checks
- You can run dbt build without errors.
- You added at least 3 generic tests that pass.
- The docs site shows sources → models → exposures lineage.
Worked examples
1) Declare a source with freshness and build a staging model
sources.yml
version: 2
sources:
  - name: raw
    database: analytics
    schema: raw
    freshness:
      warn_after: {count: 24, period: hour}
      error_after: {count: 48, period: hour}
    tables:
      - name: orders
        loaded_at_field: _ingested_at
models/staging/stg_orders.sql
with src as (
    select * from {{ source('raw', 'orders') }}
)
select
    id as order_id,
    customer_id,
    status,
    total_amount::numeric as total_amount,
    order_timestamp
from src
Run freshness checks: dbt source freshness. Then build: dbt build -s models/staging/stg_orders.sql.
2) Incremental model with merge strategy
models/marts/fct_orders.sql
{{ config(
    materialized='incremental',
    unique_key='order_id',
    incremental_strategy='merge'
) }}

with base as (
    select * from {{ ref('stg_orders') }}
)
select
    order_id,
    customer_id,
    status,
    total_amount,
    order_timestamp
from base
{% if is_incremental() %}
where order_timestamp > (select coalesce(max(order_timestamp), '1900-01-01') from {{ this }})
{% endif %}
Build only this model and its parents: dbt build -s +fct_orders.
3) Snapshot to track slowly changing dimensions
snapshots/customer_snapshot.sql
{% snapshot customer_snapshot %}

{{ config(
    target_schema='snapshots',
    unique_key='customer_id',
    strategy='timestamp',
    updated_at='updated_at'
) }}

select * from {{ ref('stg_customers') }}

{% endsnapshot %}
Create/update snapshots: dbt snapshot.
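To confirm a snapshot is capturing history, query the metadata columns dbt adds (dbt_valid_from, dbt_valid_to); rows with a null dbt_valid_to are the current versions. A quick check against the snapshots schema configured above:
select
    customer_id,
    dbt_valid_from,
    dbt_valid_to
from snapshots.customer_snapshot
where dbt_valid_to is null -- current version of each customer
order by customer_id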
4) Macro to standardize country names
macros/clean_country.sql
{% macro clean_country(country_col) %}
case
    when lower({{ country_col }}) in ('us', 'usa', 'united states') then 'United States'
    when lower({{ country_col }}) in ('uk', 'united kingdom', 'gb', 'great britain') then 'United Kingdom'
    else {{ country_col }}
end
{% endmacro %}
models/intermediate/int_customers.sql
select
    customer_id,
    {{ clean_country('country') }} as country_standardized
from {{ ref('stg_customers') }}
5) Exposure to document a BI dashboard
exposures.yml
version: 2
exposures:
  - name: revenue_dashboard
    type: dashboard
    maturity: high
    owner:
      name: Analytics Team
      email: analytics@example.com
    depends_on:
      - ref('fct_orders')
    url: ''
    description: Key revenue KPIs for leadership
Rebuild docs so lineage includes the dashboard: dbt docs generate.
6) Tests in YAML (generic + relationships)
models/schema.yml
version: 2
models:
  - name: fct_orders
    description: Order fact table
    columns:
      - name: order_id
        tests:
          - not_null
          - unique
      - name: customer_id
        tests:
          - not_null
          - relationships:
              to: ref('dim_customers')
              field: customer_id
Run tests: dbt test -s fct_orders.
Skill drills (checklist)
- Create a sources.yml with freshness and run dbt source freshness.
- Build a staging model that casts types and renames columns predictably.
- Convert a heavy model to materialized='incremental' with a unique_key.
- Add at least 3 generic tests to a fact or dimension model and get them passing.
- Create one snapshot and confirm multiple records per business key over time.
- Write a small macro and use it in two different models.
- Generate docs and confirm lineage chains include your exposure.
- Run only a subtree with dbt build -s +your_model.
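A few selection patterns worth knowing for that last drill (model names, paths, and tags here are illustrative):
dbt build -s +fct_orders     # fct_orders plus all upstream parents
dbt build -s stg_orders+     # stg_orders plus everything downstream
dbt build -s models/staging  # every model under the staging folder
dbt ls -s tag:nightly        # list nodes carrying a hypothetical nightly tag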
Common mistakes and debugging tips
- Missing or incorrect unique_key on incremental merge. Tip: ensure unique_key exists and is stable; validate with dbt run -s your_model --full-refresh if needed.
- Freshness not working. Tip: set loaded_at_field on the source table, and confirm the column is a timestamp whose values update as expected.
- Tests never run. Tip: generic tests go in YAML under the model/columns. Verify model names match and run dbt test -s model_name.
- Macros not recognized. Tip: macros must live in the macros/ directory. Re-run after creating new macro files.
- Broken DAG or circular refs. Tip: a model may only ref() models upstream of it. Inspect dependencies via the docs site or by simplifying references.
- Docs site missing descriptions. Tip: add description: at both model and column levels in YAML, then re-run dbt docs generate.
Debug run strategies
- Use dbt compile to view rendered SQL in target/compiled (see the sketch after this list).
- Limit scope: dbt build -s +model_name.
- Force a clean state with --full-refresh for incremental models.
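A typical compile-and-inspect loop looks like this; the folder under target/compiled matches your dbt project name (my_dbt_project is a placeholder):
dbt compile -s fct_orders
cat target/compiled/my_dbt_project/models/marts/fct_orders.sql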
Mini project: Sales Analytics Warehouse in dbt
Goal: Build a small, production-like pipeline with sources → staging → intermediate → marts, tests, docs, snapshots, and an exposure.
- Declare sources: raw.orders and raw.customers with freshness and loaded_at_field.
- Staging models: clean types and standardize names (stg_orders, stg_customers).
- Intermediate: join customers to orders for enriched facts (int_orders_enriched).
- Marts: create fct_orders (incremental) and dim_customers (table).
- Tests: unique/not_null on keys; relationships from facts to dims.
- Snapshot: track changes to stg_customers with a timestamp strategy.
- Macro: a small transformation (e.g., country standardization) used in two models.
- Docs: add descriptions for all models/columns and generate docs.
- Exposure: register your BI dashboard reading from fct_orders.
- Deploy: run in a dev target, then promote to prod by switching the target profile.
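The deploy step comes down to switching targets at run time, assuming your profiles.yml defines both (as in the sketch near the top of this page):
dbt build --target dev   # develop and test against your dev schema
dbt build --target prod  # promote: same code, prod target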
Success criteria
- dbt build completes with 0 errors and all tests passing.
- Docs lineage shows source → staging → mart → exposure.
- A re-run after adding one new row updates the incremental model with only that row.
Practical projects (portfolio ready)
- Customer 360 mart: Unify customers, orders, tickets; incremental fact + dimensional model with >10 tests.
- Marketing attribution mart: Build sessionization and campaign attribution with macros and exposures.
- Finance marts: Monthly P&L model from journal entries with seeds for exchange rates and snapshots for chart of accounts.
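The finance mart leans on seeds; a minimal sketch with placeholder rates (illustrative values, not real FX data):
seeds/exchange_rates.csv
currency,rate_to_usd,effective_month
EUR,1.08,2024-01
GBP,1.27,2024-01
Load it with dbt seed, then join it like any model via {{ ref('exchange_rates') }}.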
Subskills
- Model Layers: Staging, Intermediate, Marts – Organize models into clear layers with naming standards and predictable ownership. Estimated: 45–90 min.
- Materializations: View, Table, Incremental – Choose the right materialization for performance and cost; implement incremental safely. Estimated: 60–120 min.
- Sources and Freshness – Declare sources in YAML, set loaded_at_field, and monitor freshness. Estimated: 45–90 min.
- Seeds and Snapshots – Load small CSVs as managed tables; track changing records over time. Estimated: 45–90 min.
- Macros and Jinja Basics – Reuse logic, reduce duplication, and render dynamic SQL safely. Estimated: 60–120 min.
- Refs and Dependency Graph – Use ref() and selection syntax to control runs and understand lineage. Estimated: 45–90 min.
- Exposures for BI Lineage – Register dashboards/reports to close the loop from data to decisions. Estimated: 30–60 min.
- Environments and Deployments – Work with dev/stage/prod targets and promote changes confidently. Estimated: 45–90 min.
- Documentation Generation in dbt – Add descriptions and build a docs site your stakeholders can trust. Estimated: 30–60 min.
- dbt Tests and Packages Basics – Add generic tests and extend dbt with community packages. Estimated: 60–120 min.
Next steps
- Refine incremental strategies: filters, partitioning, post-hooks (see the sketch after this list).
- Add more relationship tests and data contracts to stabilize interfaces.
- Scale to a multi-environment deployment and automate dbt build on each change.
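As one example of refining incremental models, a post-hook can apply grants or maintenance after each build. A sketch, assuming a hypothetical reporting_role; grant syntax varies by warehouse:
models/marts/fct_orders.sql (config excerpt)
{{ config(
    materialized='incremental',
    unique_key='order_id',
    post_hook="grant select on {{ this }} to reporting_role"
) }}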
Skill exam
Ready to check your understanding? Take the exam below. Everyone can take it for free. Logged-in users have their progress saved.