Who this is for
This lesson is for Analytics Engineers, BI Developers, and Data Analysts who use dbt and want clear, trustworthy project documentation that auto-updates with the codebase.
Prerequisites
- Basic dbt project set up (profiles configured, you can run or compile models)
- Comfort with YAML and simple Jinja
- Ability to run terminal commands
Why this matters
In real teams, you will:
- Onboard teammates quickly with a browsable catalog of models, sources, and lineage
- Reduce tribal knowledge by embedding definitions and owners next to the code
- Support governance and quality by documenting tests, PII tags, and maturity
- Perform impact analysis using lineage before you change or deprecate a model
Concept explained simply
dbt can generate a static documentation site from your project. It reads your project graph (models, sources, tests, macros, exposures), plus database metadata (like columns and types), and renders a searchable site with lineage.
There are two main documentation inputs you control:
- Descriptions in YAML: Add descriptions to models, columns, sources, tests.
- Docs blocks: Reusable long-form text snippets you reference from YAML using {{ doc('block_name') }}.
Plus, you can document exposures (dashboards, ML models, reports) and set owners, maturity, tags, and meta to enrich the docs.
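As a rough illustration of the tags and meta mentioned above, a model entry might look like the sketch below. The model name comes from this lesson's first example. The tag value and every key under meta (owner, maturity) are hypothetical team conventions, not names dbt defines, since meta is a free-form dictionary. Older projects often set tags and meta at the top level of the model entry instead of under config.

```yaml
version: 2

models:
  - name: orders
    description: 'Orders at the order_id grain. Includes financial fields.'
    config:
      tags: ['finance']                         # hypothetical tag
      meta:
        owner: 'finance-analytics@example.com'  # hypothetical key; meta is free-form
        maturity: 'high'                        # hypothetical key
```

Whatever keys you choose, keeping them identical across models makes them easier to scan and search in the docs site.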
Mental model
- Think of each dbt node as a "recipe card". Descriptions explain what it produces and how.
- Docs blocks are "reusable paragraphs" you can reference across multiple models.
- Exposures are "windows to the outside world" that show how your models power dashboards and apps.
- Generate = "bake the site". Serve = "open the site in a local viewer".
Core pieces you will use
1) Describe models and columns (schema.yml)
version: 2
models:
  - name: orders
    description: 'Orders at the order_id grain. Includes financial fields.'
    columns:
      - name: order_id
        description: 'Primary key'
      - name: total_amount
        description: 'Gross order amount in USD before discounts'
2) Create a docs block and reference it
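Create a Markdown file such as docs/model_guidelines.md. The folder must be one dbt scans for docs blocks: by default that means your resource paths (for example models/), so a top-level docs/ folder needs to be listed under docs-paths in dbt_project.yml. Put this in the file: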
{% docs financial_fields %}
Use these rules for financial fields:
- Values are in USD
- Negative values indicate refunds or chargebacks
- Nulls mean 'not applicable'
{% enddocs %}
Then reference it in YAML:
description: "{{ doc('financial_fields') }}"
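If the Markdown file lives in a top-level docs/ folder, as in the example path above, dbt needs to be told to look there. A minimal sketch of the relevant dbt_project.yml fragment follows. It assumes your models live in the default models/ directory. Setting docs-paths overrides the default search locations, so models is listed explicitly to keep any docs blocks stored next to the models discoverable.

```yaml
# dbt_project.yml (fragment)
# Assumption: models are in models/ and long-form docs are in a top-level docs/ folder.
docs-paths: ["models", "docs"]
```

If you would rather avoid the extra setting, placing the .md file somewhere under models/ should work without any configuration.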
3) Document sources
version: 2
sources:
  - name: stripe
    schema: raw
    description: 'Raw Stripe tables'
    tables:
      - name: charges
        description: 'Stripe charge-level records'
        columns:
          - name: id
            description: 'Stripe charge id (primary key)'
4) Document exposures (dashboards, reports, ML)
version: 2
exposures:
  - name: finance_kpis
    type: dashboard
    maturity: high
    owner:
      name: Finance Analytics
      email: finance-analytics@example.com
    depends_on:
      - ref('mart_finance_kpis')
    description: 'Executive KPIs powered by mart_finance_kpis'
5) Generate and preview the docs site
# Build static files (manifest.json, catalog.json, index.html)
dbt docs generate
# Preview locally (prints a local URL)
dbt docs serve
Worked examples
Example 1 — Add model and column descriptions
Goal: Document a model and key columns so teammates know what to trust.
version: 2
models:
  - name: stg_orders
    description: 'Staging model cleaning raw orders. One row per order.'
    columns:
      - name: order_id
        description: 'Surrogate key of the order'
      - name: customer_id
        description: 'Business key referencing customers'
      - name: order_status
        description: 'Created/Paid/Refunded etc.'
Then:
dbt docs generate
dbt docs serve
Result: In the docs site, stg_orders shows your descriptions, its columns, and lineage.
Example 2 — Reuse a docs block across models
Goal: Avoid duplicating a long definition for 'payment_status'.
{% docs payment_status_def %}
Payment status logic:
- 'paid' = settled and captured
- 'pending' = authorized but not captured
- 'refunded' = refunded after capture
{% enddocs %}
Reference it from YAML in two models:
version: 2
models:
  - name: fct_payments
    columns:
      - name: payment_status
        description: "{{ doc('payment_status_def') }}"
  - name: fct_orders
    columns:
      - name: payment_status
        description: "{{ doc('payment_status_def') }}"
Result: Both models display the same authoritative definition.
Example 3 — Sources and an exposure with owner
Goal: Document where data comes from and who owns a BI asset.
version: 2
sources:
  - name: app
    schema: raw
    description: 'Application DB replica'
    tables:
      - name: users
        description: 'App users table'
        columns:
          - name: id
            description: 'User id'
          - name: email
            description: 'User email (sensitive)'
exposures:
  - name: user_growth_dashboard
    type: dashboard
    maturity: medium
    owner:
      name: Growth Team
      email: growth@example.com
    depends_on:
      - ref('mart_user_growth_daily')
    description: 'Tracks signups, activations, and retention.'
Result: The docs site shows the source tables, the dashboard exposure, lineage, and owner contact.
Step-by-step: generate docs locally
1) Create a docs block in a .md file inside your project:
{% docs dimension_table %}
A dimension table contains descriptive attributes used for filtering and grouping facts.
{% enddocs %}
2) Reference the block from a model or column description in YAML:
description: "{{ doc('dimension_table') }}"
3) Generate the site:
dbt docs generate
This builds the static site files (including catalog and manifest artifacts).
4) Serve the site locally:
dbt docs serve
Open the printed local address to browse the docs, search, and inspect lineage.
Exercises
You can complete these locally.
- [ ] Exercise 1: Add model and column descriptions, generate docs, and verify they appear.
- [ ] Exercise 2: Create and reuse a docs block across two models.
Common mistakes and self-check
Mistake: Using doc() without defining a docs block
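Fix: Create a docs block with exactly the same name (names are case-sensitive) in a .md file under a path dbt scans for docs: your resource paths such as models/ by default, or any folder listed under docs-paths.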
Mistake: YAML indentation or quoting errors
Fix: Use 2-space indentation. Quote Jinja strings correctly, e.g., description: "{{ doc('name') }}".
Mistake: Expecting docs to show fresh schema after database changes without regenerating
Fix: Run dbt docs generate again to refresh catalog and manifest.
Self-check tips
- Search the docs site for a term you added and confirm it appears as many times as you expect (once per model or column that uses it).
- Click a model and verify column descriptions and data types.
- Open the lineage view to confirm upstream sources and downstream exposures.
Practical projects
- Document a small e-commerce project: sources (payments, orders), staging, marts, and one finance dashboard exposure.
- Create a reusable glossary with docs blocks for key business terms (customer, churn, active) and reference them in multiple models.
- Add owners, maturity, and PII notes via descriptions/meta for sensitive columns; ensure they are visible in the docs site.
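For the third project, one possible shape for column-level PII notes is sketched below. The model name stg_users and the meta keys contains_pii and pii_type are hypothetical; dbt does not interpret them, it only carries them through to the docs. Recent dbt releases also accept column meta nested under config and may warn about the top-level form, so check which form your version prefers.

```yaml
version: 2

models:
  - name: stg_users                  # hypothetical model name
    description: 'Staging model for app users. One row per user.'
    columns:
      - name: email
        description: 'User email (sensitive)'
        meta:
          contains_pii: true         # hypothetical key; meta is free-form
          pii_type: 'email_address'  # hypothetical key
```

After regenerating, open the model in the docs site and check that the meta values are shown alongside the column.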
Learning path
- Before: dbt project structure, models, and tests
- Now: Documentation generation (this subskill)
- Next: Exposures, metadata conventions, and CI to publish docs automatically
Mini challenge
In 30 minutes, pick two models that reference the same business concept. Create one docs block defining that concept and reference it from both models. Add an exposure that depends on one of the models and set its owner. Regenerate and verify the docs.
Next steps
- Create a docs style guide for your team (tone, tense, required sections).
- Adopt a convention for owners and maturity levels.
- Automate docs generation in CI so your site updates on each merge to main.
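As one possible starting point for the CI step, here is a sketch of a GitHub Actions workflow. The lesson does not name a CI system, so everything here is an assumption: the workflow file name, the Postgres adapter, a profiles.yml committed at the repository root that reads credentials from environment variables, and the secret name DBT_PASSWORD. Adapt each of these to your own setup.

```yaml
# .github/workflows/dbt_docs.yml (hypothetical file name)
name: dbt docs

on:
  push:
    branches: [main]

jobs:
  build-docs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      # Assumption: a Postgres warehouse; swap in the adapter package you use.
      - run: pip install dbt-core dbt-postgres
      - run: dbt deps
      # Assumption: profiles.yml sits in the repo root and reads env vars.
      - run: dbt docs generate --profiles-dir .
        env:
          DBT_PASSWORD: ${{ secrets.DBT_PASSWORD }}  # hypothetical secret name
      # Keep the generated site files so a later step or job can publish them.
      - uses: actions/upload-artifact@v4
        with:
          name: dbt-docs
          path: |
            target/index.html
            target/manifest.json
            target/catalog.json
```

This only builds and stores the site files. Publishing them to a static host is a separate step that depends on where your team serves internal sites.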