Why this matters
As a Data Visualization Engineer, you ship dashboards, metrics, and data-driven apps that evolve constantly. In-repo documentation keeps the truth next to the code, versioned together. It shortens reviews, prevents broken dashboards after changes, and helps anyone reproduce or extend your work.
- Onboard teammates quickly with a clear README and project map.
- Reduce dashboard outages: document metric definitions and owners.
- Speed up PR reviews: every change explains the why (ADR) and the impact (CHANGELOG).
- Keep builds reproducible: setup and run instructions live in the repo.
Concept explained simply
In-repo documentation means your docs live inside the same Git repository as your code and assets. They are plain text (usually Markdown), so they are versioned, reviewed, and deployed together with your changes.
Mental model
Think of documentation as the API for humans:
- Repository-level: What this project is and how to run it (README).
- Decision log: Why you chose this approach (ADRs).
- Change surface: What changed and how it affects consumers (CHANGELOG).
- Contribution rules: How to propose changes (CONTRIBUTING, PR template).
- Asset contracts: What a metric/dataset/dashboard means and who owns it (catalog files or README in asset folders).
Minimal doc set for a BI/Visualization repo
- README.md (purpose, stack, setup, run, structure)
- docs/ (architecture notes, diagrams, runbooks)
- adr/ (lightweight Architecture Decision Records)
- CHANGELOG.md (notable user-facing changes)
- CONTRIBUTING.md + a PR checklist (docs updated?)
- dashboards/README.md (naming, ownership, refresh cadence)
- catalog/metrics.yml (metric definitions, owners, data sources)
Reliable patterns
- Keep docs close: place a README in each important folder (dashboards/, datasets/, transforms/).
- Make docs small and specific: one page per asset or topic.
- Update docs in the same PR as the code change.
- Prefer text over screenshots; for diagrams, add the source (e.g., Mermaid code) in the repo.
- Use checklists in PR templates: “Docs updated?”
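The "update docs in the same PR" pattern can also be checked mechanically. The following is a minimal sketch, assuming the list of changed paths comes from your CI system or from `git diff --name-only`. The folder names follow the layout described in this lesson; the rule for what counts as a doc and the function names are assumptions made for illustration.

```python
from pathlib import PurePosixPath

# Assumed conventions (not a standard): adjust to your repository layout.
ASSET_DIRS = ("dashboards", "datasets", "transforms")
DOC_DIRS = ("docs", "adr")
DOC_FILES = ("catalog/metrics.yml",)


def is_doc(path: str) -> bool:
    """Return True if the path counts as documentation under the assumed rules."""
    p = PurePosixPath(path)
    top = p.parts[0] if p.parts else ""
    return p.suffix.lower() == ".md" or top in DOC_DIRS or path in DOC_FILES


def asset_dirs_touched(changed_files: list[str]) -> set[str]:
    """Collect asset folders that contain non-doc changes."""
    touched = set()
    for path in changed_files:
        parts = PurePosixPath(path).parts
        if parts and parts[0] in ASSET_DIRS and not is_doc(path):
            touched.add(parts[0])
    return touched


def docs_updated(changed_files: list[str]) -> bool:
    """Pass if no asset changed, or if at least one doc changed alongside it."""
    if not asset_dirs_touched(changed_files):
        return True
    return any(is_doc(p) for p in changed_files)


print(docs_updated(["dashboards/sales.yml", "dashboards/README.md"]))
print(docs_updated(["dashboards/sales.yml"]))
```

A check like this is deliberately coarse: it only confirms that some doc changed, not that the right one did, so it complements the PR checklist rather than replacing review.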
Example: Mermaid diagram source in Markdown
mermaid
flowchart TD
    A[Source] -- ETL --> B[Model]
    B -- Metric calc --> C[Dashboard]
Worked examples
Example 1 — Metric catalog entry
You add a new metric “active_users_7d” and document it in the repo.
# catalog/metrics.yml
- name: active_users_7d
  definition: Unique users who had any qualifying event in the last 7 days.
  time_grain: day
  owner: analytics@company
  source_model: models/user_activity
  filters:
    - event_type in ["login", "purchase", "view"]
  caveats:
    - Excludes users flagged as test accounts.
  last_reviewed: 2025-06-10
Value: anyone can read the exact logic and contact the owner.
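A catalog entry is only useful if its key fields are actually filled in. The sketch below shows one way to check that, assuming each entry has already been loaded into a Python dict (for example with a YAML parser such as PyYAML's `yaml.safe_load`). The list of required fields and the owner rule are assumptions based on the example above, not a fixed schema.

```python
# Assumed required fields, taken from the example catalog entry.
REQUIRED_FIELDS = ("name", "definition", "time_grain", "owner", "source_model")


def validate_metric(entry: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the entry passes."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = entry.get(field)
        if value is None or str(value).strip() == "":
            problems.append(f"missing field: {field}")
    owner = str(entry.get("owner") or "")
    # Assumption: owners are written as a contact address, as in the example.
    if owner and "@" not in owner:
        problems.append("owner should be a contact address")
    return problems


metric = {
    "name": "active_users_7d",
    "definition": "Unique users who had any qualifying event in the last 7 days.",
    "time_grain": "day",
    "owner": "analytics@company",
    "source_model": "models/user_activity",
}

print(validate_metric(metric))
print(validate_metric({"name": "orders"}))
```

Returning a list of problems instead of raising on the first one lets a reviewer see everything that is missing in a single run.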
Example 2 — Lightweight ADR
# adr/0003-store-dashboard-configs-as-code.md
Title: Store dashboard configs as code
Status: Accepted
Date: 2025-03-18
Context:
We currently edit dashboards in the UI only; changes are hard to review and revert.
Decision:
Represent dashboard layouts and queries as YAML in dashboards/ with code review.
Options considered:
- Keep editing in UI (easy, but not reviewable)
- Export JSON dumps (opaque diffs)
- YAML with linting (chosen)
Consequences:
+ Reviewable changes, versioning, reproducibility
- Need linting and migration script
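ADR files are easier to browse when they share a numbering scheme like the `0003-...` name in the example. The following is a small sketch of how the next filename could be derived, assuming a four-digit prefix and a lowercase, hyphenated slug. The helper names are hypothetical, and the existing filenames in the usage lines are placeholders.

```python
import re

# Assumed naming scheme: four digits, a hyphen, a lowercase slug, ".md".
ADR_NAME = re.compile(r"^(\d{4})-[a-z0-9-]+\.md$")


def slugify(title: str) -> str:
    """Lowercase the title and join its alphanumeric runs with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def next_adr_filename(existing: list[str], title: str) -> str:
    """Pick the number after the highest existing ADR and append the slug."""
    numbers = [int(m.group(1)) for name in existing if (m := ADR_NAME.match(name))]
    next_number = max(numbers, default=0) + 1
    return f"{next_number:04d}-{slugify(title)}.md"


# Placeholder filenames standing in for the contents of adr/.
existing = ["0001-first-decision.md", "0002-second-decision.md", "README.md"]
print(next_adr_filename(existing, "Store dashboard configs as code"))
# → 0003-store-dashboard-configs-as-code.md
```

Files that do not match the pattern, such as a README inside adr/, are ignored when numbering.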
Example 3 — README update after folder change
# README.md (excerpt)
## Project structure
- dashboards/ — YAML specs for dashboards (owners in dashboards/README.md)
- catalog/ — metrics.yml and dataset descriptions
- scripts/ — CLI tools: `scripts/build_dashboards.sh`
- adr/ — decision records
## Run
- Install: `pip install -r requirements.txt`
- Build dashboards: `bash scripts/build_dashboards.sh`
How to implement (step-by-step)
- Add or refresh README.md with purpose, setup, run, structure.
- Create folders: docs/, adr/, catalog/, dashboards/.
- Add a PR template with a “Docs updated” checkbox.
- Start a CHANGELOG.md and add an “Unreleased” section.
- Document each metric/dashboard when you create or modify it.
Reusable templates
---
# README template
## What is this?
## Why it exists
## Stack
## Setup
## How to run
## Structure
## Troubleshooting
---
# ADR template
Title:
Status: Proposed | Accepted | Deprecated
Date:
Context:
Decision:
Options considered:
Consequences:
---
# CHANGELOG headers
## [Unreleased]
### Added
### Changed
### Deprecated
### Removed
### Fixed
### Security
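Because the CHANGELOG headers are regular, pending changes can be pulled out with a few lines of code. This is a rough sketch, assuming the `## [Unreleased]` and `### Section` headings from the template above and one `- ` bullet per entry. The sample text is made up for illustration.

```python
def unreleased_entries(changelog_text: str) -> dict[str, list[str]]:
    """Group the bullets under '## [Unreleased]' by their '### ' section name."""
    sections: dict[str, list[str]] = {}
    in_unreleased = False
    current = None
    for raw in changelog_text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            # Any level-2 heading starts a new release block.
            in_unreleased = line[3:].strip().lower() == "[unreleased]"
            current = None
        elif in_unreleased and line.startswith("### "):
            current = line[4:].strip()
            sections[current] = []
        elif in_unreleased and current and line.startswith("- "):
            sections[current].append(line[2:].strip())
    return sections


# Hypothetical changelog content, for illustration only.
SAMPLE = """\
# Changelog
## [Unreleased]
### Added
- Metric active_users_7d in catalog/metrics.yml.
### Changed
- dashboards/ now holds YAML specs.
## [0.1.0]
### Added
- Initial README.
"""

for section, items in unreleased_entries(SAMPLE).items():
    print(section, items)
```

The same loop could be pointed at a released version instead of `[Unreleased]` to list what shipped in that release.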
Exercises
Do these in a scratch repo or a new branch. Keep each answer as a Markdown file.
- [ ] Exercise 1: Write a repo-level README skeleton.
- [ ] Exercise 2: Create an ADR for a documentation-impacting decision.
- [ ] Exercise 3: Add a CHANGELOG entry for a breaking metric rename and a PR checklist.
Exercise 1 — Repo README skeleton
Create README.md that covers: What/Why, Stack, Setup, Run, Structure, Troubleshooting. Keep it under 200 lines.
Hints
- Imagine a teammate with no context cloning your repo.
- Include one example command to run a build or export.
Exercise 2 — ADR for decision
Write an ADR proposing to store dashboard definitions as YAML (or another decision relevant to your project). Include Context, Decision, Options, Consequences.
Hints
- One paragraph per section is enough.
- Call out at least one drawback.
Exercise 3 — CHANGELOG + PR checklist
Write a CHANGELOG entry for renaming the metric “orders” to “orders_completed” and deprecating the old name in the next release. Draft a PR checklist with a “Docs updated” item.
Hints
- Use a section like [Unreleased] with Added/Changed/Deprecated.
- State impact on dashboards.
Common mistakes and self-check
- Huge README with history. Fix: keep the why in ADRs, the what/how in README.
- Docs separate from code (wikis only). Fix: place authoritative docs in-repo; wikis can link to or mirror them, but the repo is the source of truth.
- Updating code without docs. Fix: add a PR checklist and a code review rule: “no doc, no merge.”
- Binary-only diagrams. Fix: keep the editable source (Mermaid, draw.io XML) in the repo.
- Ambiguous metric definitions. Fix: include owner, filters, time grain, caveats.
Self-check prompts
- Can a new teammate run the project in under 10 minutes using the README?
- Is every new metric documented next to its model or in the catalog?
- Does each non-trivial change have an ADR or a rationale in the PR?
- Can you list breaking changes from the CHANGELOG in 1 minute?
Mini challenge
Open a docs-only PR that improves clarity without changing code. Examples: add owners to all dashboards, add a Troubleshooting section, or split a bloated README into smaller folder-level READMEs.
Who this is for
- Data Visualization Engineers shipping dashboards and visual analytics.
- Analytics Engineers supporting BI repos and data products.
- Anyone who maintains metrics used by stakeholders.
Prerequisites
- Basic Git: branches, commits, pull requests.
- Markdown basics (headings, lists, code blocks).
- Familiarity with your project’s stack (BI tool, scripting language).
Learning path
- Create minimal repo docs (README, CHANGELOG).
- Adopt ADRs for meaningful decisions.
- Document assets (metrics.yml, dashboards/README).
- Add PR checklist to enforce doc updates.
- Automate simple checks (presence of docs files) later if needed.
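The last step, checking that the doc files exist, needs very little code. Here is a minimal sketch, assuming the minimal doc set listed earlier in this lesson. The function name and the exact list of required paths are illustrative and should be trimmed or extended to match your repo.

```python
from pathlib import Path

# Assumed required files, based on the minimal doc set in this lesson.
REQUIRED_DOCS = (
    "README.md",
    "CHANGELOG.md",
    "CONTRIBUTING.md",
    "dashboards/README.md",
    "catalog/metrics.yml",
)


def missing_docs(repo_root: str) -> list[str]:
    """Return the required doc paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in REQUIRED_DOCS if not (root / rel).is_file()]


# Run from the repository root; prints whichever required files are absent.
print(missing_docs("."))
```

In CI, a non-empty result could fail the job, but starting with a warning is usually enough while the team builds the habit.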
Practical projects
- Turn an existing dashboard into a code-tracked asset with a YAML spec and owner data.
- Write ADRs for two past decisions you recall but never recorded.
- Refactor docs: split one large README into smaller, folder-level READMEs.
Next steps
- Make “docs updated” a standard PR checkbox.
- Schedule a quarterly 1-hour “doc hygiene” review.
- Introduce a simple template for new metrics and dashboards.