Documentation Generation In dbt

Learn how documentation generation works in dbt, with explanations, worked examples, and exercises (for Analytics Engineers).

Published: December 23, 2025 | Updated: December 23, 2025

Who this is for

This lesson is for Analytics Engineers, BI Developers, and Data Analysts who use dbt and want clear, trustworthy project documentation that auto-updates with the codebase.

Prerequisites

  • Basic dbt project set up (profiles configured, you can run or compile models)
  • Comfort with YAML and simple Jinja
  • Ability to run terminal commands

Why this matters

In real teams, you will:

  • Onboard teammates quickly with a browsable catalog of models, sources, and lineage
  • Reduce tribal knowledge by embedding definitions and owners next to the code
  • Support governance and quality by documenting tests, PII tags, and maturity
  • Perform impact analysis using lineage before you change or deprecate a model

Concept explained simply

dbt can generate a static documentation site from your project. It reads your project graph (models, sources, tests, macros, exposures), plus database metadata (like columns and types), and renders a searchable site with lineage.
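To make the two inputs concrete: the descriptions you write end up in manifest.json, while column types come from the warehouse and land in catalog.json (both are written to target/ by default). The sketch below shows one way to join them for a single model. It is an illustration only, not how the docs site itself is implemented; the helper name describe_columns and the project name shop are made up for this example.

```python
def describe_columns(manifest: dict, catalog: dict, unique_id: str) -> list:
    """Join YAML descriptions (manifest) with warehouse types (catalog) for one node."""
    documented = manifest["nodes"][unique_id].get("columns", {})
    # The catalog only has an entry if the relation already exists in the warehouse.
    in_warehouse = catalog.get("nodes", {}).get(unique_id, {}).get("columns", {})

    # Warehouses may report column names in a different case, so compare lowercased.
    types = {name.lower(): col.get("type") for name, col in in_warehouse.items()}
    descriptions = {
        name.lower(): col.get("description") or "" for name, col in documented.items()
    }

    rows = []
    for name in sorted(set(types) | set(descriptions)):
        rows.append(
            {
                "column": name,
                "type": types.get(name),  # None if the column is not in the warehouse
                "description": descriptions.get(name, ""),  # "" if undocumented
            }
        )
    return rows


# Typical loading, assuming the default target/ directory:
#   import json
#   from pathlib import Path
#   manifest = json.loads(Path("target/manifest.json").read_text(encoding="utf-8"))
#   catalog = json.loads(Path("target/catalog.json").read_text(encoding="utf-8"))
#   describe_columns(manifest, catalog, "model.shop.orders")
```

A column with a type but an empty description exists in the warehouse but is undocumented; a column with a description but no type is documented in YAML but missing from the built relation.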

There are two main documentation inputs you control:

  • Descriptions in YAML: Add descriptions to models, columns, sources, and tests.
  • Docs blocks: Reusable long-form text snippets you reference from YAML using {{ doc('block_name') }}.

Plus, you can document exposures (dashboards, ML models, reports) and set owners, maturity, tags, and meta to enrich the docs.

Mental model

  • Think of each dbt node as a "recipe card". Descriptions explain what it produces and how.
  • Docs blocks are "reusable paragraphs" you can reference across multiple models.
  • Exposures are "windows to the outside world" that show how your models power dashboards and apps.
  • Generate = "bake the site". Serve = "open the site in a local viewer".

Core pieces you will use

1) Describe models and columns (schema.yml)

version: 2
models:
  - name: orders
    description: 'Orders at the order_id grain. Includes financial fields.'
    columns:
      - name: order_id
        description: 'Primary key'
      - name: total_amount
        description: 'Gross order amount in USD before discounts'

2) Create a docs block and reference it

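Create a Markdown file in a directory dbt scans for docs blocks, such as models/docs/model_guidelines.md (a top-level docs/ folder works only if you list it under docs-paths in dbt_project.yml), with: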

{% docs financial_fields %}
Use these rules for financial fields:
- Values are in USD
- Negative values indicate refunds or chargebacks
- Nulls mean 'not applicable'
{% enddocs %}

Then reference it in YAML:

description: "{{ doc('financial_fields') }}"

3) Document sources

version: 2
sources:
  - name: stripe
    schema: raw
    description: 'Raw Stripe tables'
    tables:
      - name: charges
        description: 'Stripe charge-level records'
        columns:
          - name: id
            description: 'Stripe charge id (primary key)'

4) Document exposures (dashboards, reports, ML)

version: 2
exposures:
  - name: finance_kpis
    type: dashboard
    maturity: high
    owner:
      name: Finance Analytics
      email: finance-analytics@example.com
    depends_on:
      - ref('mart_finance_kpis')
    description: 'Executive KPIs powered by mart_finance_kpis'

5) Generate and preview the docs site

# Build static files (manifest.json, catalog.json, index.html)
dbt docs generate

# Preview locally (prints a local URL)
dbt docs serve

Worked examples

Example 1 — Add model and column descriptions

Goal: Document a model and key columns so teammates know what to trust.

version: 2
models:
  - name: stg_orders
    description: 'Staging model cleaning raw orders. One row per order.'
    columns:
      - name: order_id
        description: 'Surrogate key of the order'
      - name: customer_id
        description: 'Business key referencing customers'
      - name: order_status
        description: 'Created/Paid/Refunded etc.'

Then:

dbt docs generate
dbt docs serve

Result: In the docs site, stg_orders shows your descriptions, its columns, and lineage.

Example 2 — Reuse a docs block across models

Goal: Avoid duplicating a long definition for 'payment_status'.

{% docs payment_status_def %}
Payment status logic:
- 'paid' = settled and captured
- 'pending' = authorized but not captured
- 'refunded' = refunded after capture
{% enddocs %}

Reference it from YAML in two models:

version: 2
models:
  - name: fct_payments
    columns:
      - name: payment_status
        description: "{{ doc('payment_status_def') }}"
  - name: fct_orders
    columns:
      - name: payment_status
        description: "{{ doc('payment_status_def') }}"

Result: Both models display the same authoritative definition.
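To see why both models end up with identical text, it can help to picture how a doc() reference is filled in. The sketch below is a simplified illustration in Python, not dbt's actual implementation: it handles only the single-argument doc('name') form, and the helper names collect_docs_blocks and resolve_description are made up for this example.

```python
import re

# Matches {% docs name %} ... {% enddocs %}, capturing the name and the body.
DOCS_BLOCK = re.compile(
    r"\{%-?\s*docs\s+(\w+)\s*-?%\}(.*?)\{%-?\s*enddocs\s*-?%\}", re.DOTALL
)
# Matches {{ doc('name') }} or {{ doc("name") }}.
DOC_CALL = re.compile(r"\{\{\s*doc\(\s*['\"](\w+)['\"]\s*\)\s*\}\}")


def collect_docs_blocks(markdown_text: str) -> dict:
    """Return {block_name: block_body} for every docs block in the text."""
    return {name: body.strip() for name, body in DOCS_BLOCK.findall(markdown_text)}


def resolve_description(description: str, blocks: dict) -> str:
    """Replace each doc('name') call with the body of the matching block."""

    def substitute(match):
        name = match.group(1)
        if name not in blocks:
            raise KeyError(f"docs block not found: {name}")
        return blocks[name]

    return DOC_CALL.sub(substitute, description)


if __name__ == "__main__":
    markdown = """{% docs payment_status_def %}
Payment status logic:
- 'paid' = settled and captured
{% enddocs %}"""
    blocks = collect_docs_blocks(markdown)
    # Two columns referencing the same block resolve to the same text.
    for model in ("fct_payments", "fct_orders"):
        print(model, "->", resolve_description("{{ doc('payment_status_def') }}", blocks))
```

Because the text lives in one block, editing it once changes the description everywhere it is referenced.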

Example 3 — Sources and an exposure with owner

Goal: Document where data comes from and who owns a BI asset.

version: 2
sources:
  - name: app
    schema: raw
    description: 'Application DB replica'
    tables:
      - name: users
        description: 'App users table'
        columns:
          - name: id
            description: 'User id'
          - name: email
            description: 'User email (sensitive)'

exposures:
  - name: user_growth_dashboard
    type: dashboard
    maturity: medium
    owner:
      name: Growth Team
      email: growth@example.com
    depends_on:
      - ref('mart_user_growth_daily')
    description: 'Tracks signups, activations, and retention.'

Result: The docs site shows the source tables, the dashboard exposure, lineage, and owner contact.
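The lineage shown in the site is also available as data: manifest.json contains a child_map that lists, for each node, the nodes that depend on it directly. The sketch below walks that map to list everything downstream of a node, which is the same question you ask during impact analysis. The project name shop and the intermediate model stg_users in the usage comment are hypothetical, and the function names are made up for this example.

```python
from collections import deque


def downstream(manifest: dict, unique_id: str) -> list:
    """Return every node reachable from unique_id via child_map, sorted by id."""
    child_map = manifest.get("child_map", {})
    seen = set()
    queue = deque([unique_id])
    while queue:
        current = queue.popleft()
        for child in child_map.get(current, []):
            if child not in seen:  # guard against visiting a node twice
                seen.add(child)
                queue.append(child)
    return sorted(seen)


def downstream_exposures(manifest: dict, unique_id: str) -> list:
    """Only the exposures (dashboards, reports, ML) affected by a change."""
    return [uid for uid in downstream(manifest, unique_id) if uid.startswith("exposure.")]


# Usage, assuming manifest was loaded from target/manifest.json:
#   downstream_exposures(manifest, "source.shop.app.users")
```

Unique ids follow the pattern resource_type.project.name (sources add the source name before the table name), so filtering on the "exposure." prefix is enough to pick out BI assets.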

Step-by-step: generate docs locally

Step 1: Add or improve descriptions in your schema.yml files (models, sources, columns).

Step 2: Optionally create docs blocks for reusable definitions in a path dbt scans, e.g., models/docs/definitions.md.

{% docs dimension_table %}
A dimension table contains descriptive attributes used for filtering and grouping facts.
{% enddocs %}

Step 3: Reference docs blocks in YAML where needed.

description: "{{ doc('dimension_table') }}"

Step 4: Generate the docs. dbt docs generate compiles the project itself, so a separate compile step is not required, but column types only appear for relations that already exist in the warehouse, so build your models first if you want them.

dbt docs generate

This builds the static site files (including catalog and manifest artifacts).

Step 5: Preview the site locally.

dbt docs serve

Open the printed local address to browse the docs, search, and inspect lineage.

Exercises

You can complete these exercises locally.

  • [ ] Exercise 1: Add model and column descriptions, generate docs, and verify they appear.
  • [ ] Exercise 2: Create and reuse a docs block across two models.

Common mistakes and self-check

Mistake: Using doc() without defining a docs block

Fix: Create a docs block with exactly the same name in a .md file that dbt scans (by default, any resource path such as models/).
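A quick way to catch this before running dbt is to compare the names used in doc() calls with the names of the docs blocks you have defined. The sketch below is a rough pre-check under simplifying assumptions: it scans one directory tree, handles only the single-argument doc('name') form, and uses regular expressions rather than a real Jinja or YAML parser. The function name is made up for this example.

```python
import re
from pathlib import Path

DOCS_BLOCK_NAME = re.compile(r"\{%-?\s*docs\s+(\w+)\s*-?%\}")
DOC_CALL_NAME = re.compile(r"\bdoc\(\s*['\"](\w+)['\"]\s*\)")


def undefined_doc_references(root: str) -> dict:
    """Map each doc() name with no matching docs block to the YAML files using it."""
    root_path = Path(root)

    # Collect every block name defined in Markdown files under root.
    defined = set()
    for md_file in root_path.rglob("*.md"):
        defined.update(DOCS_BLOCK_NAME.findall(md_file.read_text(encoding="utf-8")))

    # Collect every doc('name') call in YAML files and keep the unknown ones.
    missing = {}
    for pattern in ("*.yml", "*.yaml"):
        for yaml_file in sorted(root_path.rglob(pattern)):
            text = yaml_file.read_text(encoding="utf-8")
            for name in DOC_CALL_NAME.findall(text):
                if name not in defined:
                    relative = yaml_file.relative_to(root_path).as_posix()
                    missing.setdefault(name, []).append(relative)
    return missing


if __name__ == "__main__":
    # Assumes the script is run from the project root and models/ is the resource path.
    for name, files in undefined_doc_references("models").items():
        print(f"doc('{name}') has no docs block; referenced in {', '.join(files)}")
```

An empty result means every reference found has a block with a matching name; a typo in either the block name or the reference shows up as a missing entry.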

Mistake: YAML indentation or quoting errors

Fix: Indent with spaces, never tabs, and keep the depth consistent (2 spaces is the usual convention). Wrap Jinja in quotes, e.g., description: "{{ doc('name') }}".
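Both problems are easy to spot mechanically. YAML does not allow tabs for indentation, and it reads a leading { as the start of an inline mapping, so an unquoted Jinja value is not read as a string. The sketch below flags those two cases line by line; it is a rough check under those two assumptions only, not a YAML validator, and the function name is made up for this example.

```python
import re

# A key (optionally after a list dash) whose value starts with an unquoted {{.
UNQUOTED_JINJA = re.compile(r"^\s*(?:-\s*)?[\w-]+:\s*\{\{")


def yaml_problems(text: str) -> list:
    """Return (line_number, message) pairs for tab indents and unquoted Jinja values."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        indent = line[: len(line) - len(line.lstrip())]  # leading whitespace only
        if "\t" in indent:
            problems.append((number, "tab used for indentation"))
        if UNQUOTED_JINJA.match(line):
            problems.append((number, "unquoted Jinja value; wrap it in quotes"))
    return problems


if __name__ == "__main__":
    sample = "models:\n  - name: orders\n    description: {{ doc('orders') }}\n"
    for number, message in yaml_problems(sample):
        print(f"line {number}: {message}")
```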

Mistake: Expecting docs to show fresh schema after database changes without regenerating

Fix: Run dbt docs generate again to refresh catalog and manifest.
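If you are unsure whether the site you are looking at is current, manifest.json and catalog.json each record when they were written under metadata.generated_at. The sketch below reads that timestamp and reports whether an artifact is older than a limit you choose. The function names and the hour-based threshold are assumptions made for this example.

```python
from datetime import datetime, timezone


def generated_at(artifact: dict) -> datetime:
    """Parse metadata.generated_at from a loaded manifest or catalog."""
    raw = artifact["metadata"]["generated_at"]
    # Older Python versions do not accept a trailing "Z", so spell out UTC.
    parsed = datetime.fromisoformat(raw.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assume UTC if unspecified
    return parsed


def is_stale(artifact: dict, max_age_hours: float, now=None) -> bool:
    """True if the artifact was generated more than max_age_hours before now."""
    now = now or datetime.now(timezone.utc)
    age = now - generated_at(artifact)
    return age.total_seconds() > max_age_hours * 3600


# Usage, assuming catalog was loaded from target/catalog.json:
#   if is_stale(catalog, max_age_hours=24):
#       print("catalog.json is more than a day old; run dbt docs generate")
```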

Self-check tips
  • Search the docs site for a term you added; confirm it appears as many times as you expect.
  • Click a model and verify column descriptions and data types (types come from the warehouse, so the model must already be built).
  • Open the lineage view to confirm upstream sources and downstream exposures.
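Clicking through the site works for a handful of models; for a larger project you may prefer a list of what is still undocumented. The sketch below reads a loaded manifest.json and reports models with an empty description and declared columns with an empty description. It only sees columns you have listed in YAML, so columns that exist in the warehouse but were never declared are not flagged. The function name and the <model description> marker are made up for this example.

```python
def documentation_gaps(manifest: dict) -> dict:
    """Map each model's unique id to the list of things missing a description."""
    gaps = {}
    for unique_id, node in manifest.get("nodes", {}).items():
        if node.get("resource_type") != "model":
            continue  # skip tests, seeds, snapshots, and so on

        missing = []
        if not (node.get("description") or "").strip():
            missing.append("<model description>")
        for name, column in node.get("columns", {}).items():
            if not (column.get("description") or "").strip():
                missing.append(name)

        if missing:
            gaps[unique_id] = missing
    return gaps


# Usage, assuming manifest was loaded from target/manifest.json:
#   for unique_id, missing in documentation_gaps(manifest).items():
#       print(unique_id, "is missing:", ", ".join(missing))
```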

Practical projects

  • Document a small e-commerce project: sources (payments, orders), staging, marts, and one finance dashboard exposure.
  • Create a reusable glossary with docs blocks for key business terms (customer, churn, active) and reference them in multiple models.
  • Add owners, maturity, and PII notes via descriptions/meta for sensitive columns; ensure they are visible in the docs site.

Learning path

  • Before: dbt project structure, models, and tests
  • Now: Documentation generation (this subskill)
  • Next: Exposures, metadata conventions, and CI to publish docs automatically

Mini challenge

In 30 minutes, pick two models that reference the same business concept. Create one docs block defining that concept and reference it from both models. Add an exposure that depends on one of the models and set its owner. Regenerate and verify the docs.

Next steps

  • Create a docs style guide for your team (tone, tense, required sections).
  • Adopt a convention for owners and maturity levels.
  • Automate docs generation in CI so your site updates on each merge to main.

Practice Exercises

Instructions

  1. Create or open a schema.yml beside one of your models (e.g., models/staging/schema.yml).
  2. Add a model entry with a clear description and at least three column descriptions.
  3. Run dbt docs generate and then dbt docs serve.
  4. Open the local docs and verify your text appears for the model and each column.

version: 2
models:
  - name: stg_customers
    description: 'Cleaned customers at one row per customer_id.'
    columns:
      - name: customer_id
        description: 'Primary key for customers'
      - name: email
        description: 'Customer email (lowercased)'
      - name: created_at
        description: 'UTC timestamp when the customer record was created'

Expected Output

Docs site shows model 'stg_customers' with your model description and three column descriptions visible on the model page.
