Why this matters
Clear model descriptions and annotations make your BI models usable, trustworthy, and maintainable. As a BI Developer, you will:
- Explain what each dataset, table, column, and measure means in business terms.
- Reduce report confusion by defining units, time ranges, and calculation logic.
- Speed up onboarding for analysts and stakeholders with self-serve documentation.
- Support governance: ownership, sensitivity, data sources, refresh cadence.
Real tasks you will handle
- Add business-friendly descriptions to a sales model so sales, finance, and product teams use the same definitions.
- Annotate sensitive fields (e.g., emails) with classification and masking guidance.
- Record assumptions when changing a metric’s calculation (e.g., excluding refunds).
- Tag deprecated fields to steer consumers to the correct ones.
Concept explained simply
Descriptions are clear sentences that explain what a thing is. Annotations are structured labels (key-value metadata) about that thing.
- Description example: "Gross revenue before discounts in USD, daily grain."
- Annotation examples: owner=jane@org, sensitivity=confidential, currency=USD, refresh=hourly, status=deprecated.
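To make the distinction concrete, here is a minimal Python sketch. It assumes a hypothetical `ModelObject` class, a hypothetical `parse_annotations` helper, and an illustrative `gross_revenue` measure name; none of these come from a specific BI tool. The description stays free text for people, while the annotation pairs become a dictionary that code can query.

```python
from dataclasses import dataclass, field


def parse_annotations(pairs: str) -> dict[str, str]:
    """Turn 'key=value, key=value' text into a dictionary."""
    result = {}
    for pair in pairs.split(","):
        key, _, value = pair.strip().partition("=")
        result[key] = value
    return result


@dataclass
class ModelObject:
    """Hypothetical container for one documented object (column, measure, ...)."""
    name: str
    description: str = ""                                        # free text for humans
    annotations: dict[str, str] = field(default_factory=dict)    # structured metadata


gross_revenue = ModelObject(
    name="gross_revenue",
    description="Gross revenue before discounts in USD, daily grain.",
    annotations=parse_annotations(
        "owner=jane@org, sensitivity=confidential, currency=USD, refresh=hourly"
    ),
)

print(gross_revenue.annotations["currency"])  # USD
```

Because annotations are key-value pairs, a tool can compare or filter on them without parsing sentences.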
Mental model
Think of your semantic model as a blueprint. Descriptions are sticky notes that explain each room’s purpose. Annotations are the labels on those notes that make them searchable, comparable, and enforceable by standards.
What to document
- Dataset: purpose, audience, data sources, refresh schedule, owner/support, SLA, known caveats.
- Table: business entity, keys, grain, filters applied, lineage.
- Column: business meaning, unit/type, allowed values, null handling, sensitivity, examples.
- Measure: formula intent, inclusion/exclusion rules, time grain, currency, rounding, filters.
- Relationships: join keys, cardinality, filter direction (single or bi-directional), and why.
- Lifecycle: status (active/deprecated), version, change log date.
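One way to turn this list into something checkable is a per-object-type table of required annotation keys. The sketch below is an assumption, not a standard: the key names in `REQUIRED_KEYS` (for example `sources` and `keys`) are hypothetical, and your team would pick its own minimum set.

```python
# Hypothetical minimum annotation keys per object type, loosely following the list above.
REQUIRED_KEYS = {
    "dataset": {"owner", "refresh_cadence", "sources"},
    "table": {"grain", "keys"},
    "column": {"unit", "sensitivity"},
    "measure": {"time_grain", "currency", "rounding"},
    "relationship": {"cardinality", "filter_direction"},
}


def missing_keys(kind: str, annotations: dict) -> list[str]:
    """Return the required annotation keys that are absent, sorted for stable output."""
    return sorted(REQUIRED_KEYS[kind] - set(annotations))


measure_annotations = {"currency": "USD", "time_grain": "day"}
print(missing_keys("measure", measure_annotations))  # ['rounding']
```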
Worked examples
Example 1 — Semantic model (YAML-like)
model: sales_mart
description: |
  Core sales dataset for executive dashboards and self-serve analysis.
  Daily grain. Includes orders from the e-commerce platform only.
annotations:
  owner: data.bi@company.com
  refresh_cadence: daily_6am
  domain: revenue
  status: active
tables:
  - name: orders
    description: One row per order_id captured from the checkout service.
    annotations:
      grain: order
    columns:
      - name: order_id
        description: Primary key of the order.
        annotations:
          pii: false
          example: 8123912
      - name: customer_email
        description: Email of the purchasing customer.
        annotations:
          pii: true
          sensitivity: confidential
      - name: order_total_usd
        description: Order total after discounts, before tax and shipping (USD).
        annotations:
          unit: USD
          rounding: cents
  - name: calendar
    description: Date dimension for standard time intelligence.
    columns:
      - name: date
        description: Calendar date (YYYY-MM-DD).
Example 2 — Measure with annotations
measure: Revenue
description: Sum of order_total_usd excluding refunds; reported in USD; daily grain.
formula: SUM(orders.order_total_usd) - SUM(refunds.refund_amount_usd)
annotations:
  currency: USD
  includes_refunds: false
  time_grain: day
  rounding: dollars
  owner: finance.analytics@company.com
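To see what the formula and the `rounding: dollars` annotation mean in practice, here is a small worked calculation in Python. The sample rows are made up, and half-up rounding is an assumption; the annotation itself does not say which rounding mode applies, which is exactly the kind of detail worth recording.

```python
from decimal import Decimal, ROUND_HALF_UP

# Made-up sample rows for one day; the amounts are illustrative only.
orders = [
    {"order_id": 1, "order_total_usd": Decimal("120.50")},
    {"order_id": 2, "order_total_usd": Decimal("79.99")},
    {"order_id": 3, "order_total_usd": Decimal("45.25")},
]
refunds = [
    {"order_id": 2, "refund_amount_usd": Decimal("79.99")},
]


def revenue(orders, refunds) -> Decimal:
    """SUM(orders.order_total_usd) - SUM(refunds.refund_amount_usd), in whole dollars."""
    gross = sum((o["order_total_usd"] for o in orders), Decimal("0"))
    refunded = sum((r["refund_amount_usd"] for r in refunds), Decimal("0"))
    # Assumed rounding mode: half-up to the nearest dollar.
    return (gross - refunded).quantize(Decimal("1"), rounding=ROUND_HALF_UP)


print(revenue(orders, refunds))  # 166  (245.74 gross - 79.99 refunded = 165.75)
```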
Example 3 — Relationship and governance signals
relationship:
  from: orders.customer_id
  to: customers.customer_id
  description: Links orders to customers for LTV and cohort analysis.
  annotations:
    cardinality: many_to_one
    filter_direction: single
    join_quality: high

column: customers.email
description: Customer contact email; used for notifications and receipts.
annotations:
  sensitivity: confidential
  retention_policy: 24_months
  masking: hash_in_nonprod
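An annotation such as `masking: hash_in_nonprod` is only guidance until something enforces it. The sketch below shows one possible reading, assuming a hypothetical `mask_value` helper and an `environment` string where `"prod"` means production. It uses an unsalted SHA-256 hash for brevity; a real deployment would likely need a salted or keyed hash, since plain hashes of emails can be guessed.

```python
import hashlib


def mask_value(value: str, annotations: dict, environment: str) -> str:
    """Hash the value outside production when the column is annotated masking=hash_in_nonprod."""
    if annotations.get("masking") == "hash_in_nonprod" and environment != "prod":
        return hashlib.sha256(value.encode("utf-8")).hexdigest()
    return value


email_annotations = {"sensitivity": "confidential", "masking": "hash_in_nonprod"}

masked = mask_value("ana@example.com", email_annotations, environment="dev")
print(len(masked))  # 64 hexadecimal characters
print(mask_value("ana@example.com", email_annotations, environment="prod"))  # ana@example.com
```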
How to add descriptions quickly
- Inventory: Export a list of datasets, tables, columns, and measures. Mark blanks and outdated descriptions.
- Use templates: Apply short, consistent templates (see below). Keep each description to 1–3 sentences.
- Batch edit: Start with high-impact entities (top-used measures and columns). Add owners and refresh details first.
- Annotate sensitivity: Tag PII and confidential fields. Note masking/retention.
- Review with SMEs: Validate business meaning and caveats with Finance/Sales owners.
- Publish & version: Save in the model metadata so it travels with the code and can be version-controlled.
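The inventory step can be sketched in a few lines of Python. This assumes the model metadata has already been loaded into a dictionary shaped like the YAML-like examples in this lesson; the `find_missing_descriptions` helper is hypothetical, and how you export metadata depends on your BI tool.

```python
def find_missing_descriptions(model: dict) -> list[str]:
    """Return dotted paths of the model, tables, and columns whose description is blank."""
    missing = []
    if not (model.get("description") or "").strip():
        missing.append(model["model"])
    for table in model.get("tables", []):
        table_path = f'{model["model"]}.{table["name"]}'
        if not (table.get("description") or "").strip():
            missing.append(table_path)
        for column in table.get("columns", []):
            if not (column.get("description") or "").strip():
                missing.append(f'{table_path}.{column["name"]}')
    return missing


sales_core = {
    "model": "sales_core",
    "description": "",
    "tables": [
        {
            "name": "orders",
            "description": "One row per order.",
            "columns": [
                {"name": "order_id", "description": "Primary key of the order."},
                {"name": "order_total", "description": ""},
            ],
        }
    ],
}

print(find_missing_descriptions(sales_core))
# ['sales_core', 'sales_core.orders.order_total']
```

The resulting list gives you the blanks to batch-edit, starting with the most-used objects.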
Templates and checklists
Templates
Dataset description template:
Purpose: [who/what/why]
Grain: [row grain]
Sources: [systems]
Refresh: [cadence/timezone]
Owner: [team/email]
Caveats: [known gaps]
Table description template:
Entity: [what the row represents]
Keys: [primary keys]
Filters: [included/excluded]
Usage: [typical analyses]
Column description template:
Meaning: [business definition]
Unit/Type: [USD, %, enum]
Nulls: [allowed/handling]
Sensitivity: [public/internal/confidential]
Example: [sample value]
Measure description template:
Definition: [inclusion/exclusion]
Time grain: [day/week/month]
Currency/Unit: [USD, %]
Assumptions: [important caveats]
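If you fill these templates for many objects, a small helper keeps the field order consistent and makes gaps visible. The following is a sketch under assumptions: `render_description` is a hypothetical helper, the `TODO` marker is an arbitrary choice, and the sample values (including the `internal` sensitivity) are illustrative.

```python
COLUMN_TEMPLATE_FIELDS = ["Meaning", "Unit/Type", "Nulls", "Sensitivity", "Example"]


def render_description(fields: list[str], values: dict[str, str]) -> str:
    """Build a 'Label: value' block in template order; unfilled fields are marked TODO."""
    return "\n".join(f"{label}: {values.get(label, 'TODO')}" for label in fields)


text = render_description(
    COLUMN_TEMPLATE_FIELDS,
    {
        "Meaning": "Order total after discounts, before tax and shipping",
        "Unit/Type": "USD",
        "Sensitivity": "internal",
    },
)
print(text)
# Meaning: Order total after discounts, before tax and shipping
# Unit/Type: USD
# Nulls: TODO
# Sensitivity: internal
# Example: TODO
```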
Pre-publish checklist
- Plain language, no jargon or SQL-only terms.
- Units and time grain specified.
- Owner and refresh cadence filled.
- Sensitivity tags applied where needed.
- Deprecated fields clearly tagged.
- Examples included for tricky fields.
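Two of these checks lend themselves to a rough automated pass: spotting SQL terms in descriptions and catching deprecated fields that are not tagged as such. The sketch below is a heuristic only. The `SQL_TERMS` list, the `checklist_issues` helper, and the `order_total_old` column are hypothetical, and a human review is still needed for plain language.

```python
import re

# Hypothetical list of SQL-only terms to keep out of business-facing descriptions.
SQL_TERMS = ["left join", "group by", "coalesce", "varchar", "select"]


def find_jargon(description: str) -> list[str]:
    """Return the SQL terms that appear as whole words or phrases in the description."""
    lowered = description.lower()
    return [t for t in SQL_TERMS if re.search(rf"\b{re.escape(t)}\b", lowered)]


def checklist_issues(obj: dict) -> list[str]:
    """Flag SQL jargon and deprecations that are mentioned in text but not tagged."""
    issues = []
    description = obj.get("description", "")
    jargon = find_jargon(description)
    if jargon:
        issues.append("SQL terms in description: " + ", ".join(jargon))
    status = obj.get("annotations", {}).get("status")
    if "deprecated" in description.lower() and status != "deprecated":
        issues.append("description mentions deprecation but status is not 'deprecated'")
    return issues


column = {
    "name": "order_total_old",
    "description": "Deprecated. COALESCE of the legacy total and the new total.",
    "annotations": {"status": "active"},
}
for issue in checklist_issues(column):
    print(issue)
# SQL terms in description: coalesce
# description mentions deprecation but status is not 'deprecated'
```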
Exercises
Complete these before the quick test. Progress is saved for logged-in users; the test is available to everyone.
Exercise 1 — Describe a core sales model
Goal: Fill in descriptions and annotations so a non-technical stakeholder can use the model confidently.
Do this:
- Using the snippet below, add missing descriptions and annotations for the dataset, table, and fields.
- Include owner, refresh cadence, currency units, and any caveats.
model: sales_core
description: |
  [ADD PURPOSE]
annotations:
  owner: [TEAM@EMAIL]
  refresh_cadence: [CADENCE]
tables:
  - name: orders
    description: [ADD]
    columns:
      - name: order_id
        description: [ADD]
      - name: order_total
        description: [ADD]
        annotations:
          unit: [ADD]
          includes_tax: [true/false]
      - name: customer_email
        description: [ADD]
        annotations:
          sensitivity: [ADD]
Checklist:
- Purpose, grain, and audience stated.
- Units and tax/discount assumptions clarified.
- Owner email and refresh cadence included.
- Sensitivity marked on PII.
Exercise 2 — Annotate a metric
Goal: Remove ambiguity from a revenue measure.
Do this:
- Create a measure "Revenue" with description and annotations.
- Declare: currency=USD, includes_refunds=false, time_grain=month.
- Add a one-sentence business description.
measure: Revenue
description: [ADD]
formula: [PSEUDO: SUM(orders.order_total) - SUM(refunds.amount)]
annotations:
  currency: [ADD]
  includes_refunds: [ADD]
  time_grain: [ADD]
Checklist:
- Clear inclusion/exclusion rules.
- Currency and time grain set.
- Plain language for business users.
Common mistakes and self-check
- Missing units or time grain — Add unit (USD, %) and grain (day/week/month) in the description or annotations.
- Technical jargon — Replace SQL terminology with business language; give an example value.
- Hidden assumptions — Write explicit inclusion/exclusion rules (tax, refunds, discounts).
- No owner/refresh info — Add owner email and refresh cadence to help support and trust.
- Unlabeled sensitive fields — Tag sensitivity and masking/retention guidance.
- Outdated docs — Update description and version/date when calculations change.
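To keep documentation from drifting when a calculation changes, the description, version, and change date can be updated in one step. This is a minimal sketch: the `record_change` helper, the integer `version`, the `change_log` annotation layout, and the sample date are all assumptions made for illustration.

```python
from datetime import date


def record_change(measure: dict, new_description: str, note: str, changed_on: date) -> None:
    """Update the description, bump an integer version, and append a change-log entry."""
    annotations = measure.setdefault("annotations", {})
    annotations["version"] = annotations.get("version", 1) + 1
    annotations.setdefault("change_log", []).append(
        {"date": changed_on.isoformat(), "version": annotations["version"], "note": note}
    )
    measure["description"] = new_description


revenue_measure = {
    "measure": "Revenue",
    "description": "Sum of order totals in USD.",
    "annotations": {"currency": "USD", "includes_refunds": True, "version": 1},
}

record_change(
    revenue_measure,
    new_description="Sum of order totals excluding refunds, in USD.",
    note="Refunds are now excluded.",
    changed_on=date(2024, 1, 15),  # illustrative date
)
revenue_measure["annotations"]["includes_refunds"] = False

print(revenue_measure["annotations"]["version"])     # 2
print(revenue_measure["annotations"]["change_log"])
```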
Self-check prompts
- Could a new analyst pick the correct measure without asking you?
- Would an exec understand the caveats in one read?
- Can you search/filter by tags (e.g., domain, sensitivity)?
Practical projects
- Project A: Document the top 20 most-used measures in your main BI model with owners, units, time grain, and assumptions.
- Project B: Tag all PII columns and add masking/retention notes. Create a "sensitive=true" filterable view.
- Project C: Add a change log annotation to 5 critical metrics and perform a peer review with Finance.
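For Project B, a first version of the filterable view can be a flat list built from the annotations. The sketch below assumes the model is available as a dictionary shaped like the examples in this lesson; the `sensitive_columns` helper and the `MISSING` marker are hypothetical.

```python
def sensitive_columns(model: dict) -> list[dict]:
    """Flatten the model into one row per column tagged as PII or confidential."""
    rows = []
    for table in model.get("tables", []):
        for column in table.get("columns", []):
            annotations = column.get("annotations", {})
            if annotations.get("pii") is True or annotations.get("sensitivity") == "confidential":
                rows.append({
                    "table": table["name"],
                    "column": column["name"],
                    "sensitive": True,
                    # Gaps are surfaced explicitly so they can be fixed.
                    "masking": annotations.get("masking", "MISSING"),
                    "retention_policy": annotations.get("retention_policy", "MISSING"),
                })
    return rows


model = {
    "model": "sales_mart",
    "tables": [
        {"name": "orders", "columns": [
            {"name": "order_id", "annotations": {"pii": False}},
            {"name": "customer_email", "annotations": {"pii": True, "sensitivity": "confidential"}},
        ]},
        {"name": "customers", "columns": [
            {"name": "email", "annotations": {
                "sensitivity": "confidential",
                "retention_policy": "24_months",
                "masking": "hash_in_nonprod",
            }},
        ]},
    ],
}

for row in sensitive_columns(model):
    print(row["table"], row["column"], row["masking"])
# orders customer_email MISSING
# customers email hash_in_nonprod
```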
Mini challenge
In 3 sentences, write a description for a "Net Revenue" measure that excludes refunds and discounts, uses USD, and is reported monthly. Add annotations for currency, includes_refunds=false, includes_discounts=false, and time_grain=month.
Who this is for
- BI Developers and Analytics Engineers who own semantic models and dashboards.
- Analysts who maintain business definitions and metrics.
Prerequisites
- Basic understanding of your BI/semantic modeling tool.
- Comfort reading model objects: datasets, tables, columns, measures, relationships.
Learning path
- Start: Model Descriptions and Annotations (this lesson).
- Next: Data lineage and ownership tagging.
- Then: Documentation automation and style guides.
Next steps
- Finish the exercises and run the quick test below.
- Apply templates to one of your active models this week.
- Schedule a 15-minute review with a domain SME to validate definitions.