Why this matters
As a BI Analyst, you build dashboards, reports, and models that drive decisions. If your audience doesn’t understand known data limits (missing fields, delays, sampling, or excluded sources), they can misread trends, overreact, or distrust analytics. Documenting known data limitations protects decisions and your credibility.
- Real tasks: add limitation notes on dashboards, maintain a data dictionary, flag temporary outages, describe exclusion rules (e.g., test orders).
- Typical cases: reporting delays (e.g., data lands D+1), sampling in web analytics, partial coverage during a migration, deduplication uncertainty, and optional fields that introduce bias.
Who this is for
- BI Analysts and Analytics Engineers who publish metrics to stakeholders.
- Product/Data Analysts who explain KPIs to non-technical teams.
- Anyone maintaining shared datasets or dashboards.
Prerequisites
- Basic understanding of your data sources and refresh schedules.
- Familiarity with metric definitions (how KPIs are calculated).
- Ability to read pipeline/job runs or report refresh logs.
Concept explained simply
A known data limitation is a documented, current constraint that could change how a metric is interpreted. You may not fix it today, but you make it visible, quantify its impact, and guide how to use affected data.
- Documenting vs. fixing: documentation is an immediate safeguard; fixing is a project.
- Where it lives: dashboard notes, data dictionary entries, dataset README, or a central "limitations register."
- Goal: ensure every data consumer understands scope, impact, and safe use.
Mental model
Think of limitations as product warning labels. They don’t stop use; they prevent misuse. Use a consistent label format so anyone can quickly parse: what, where, impact, severity, and workarounds.
Simple limitation template
- ID: L-YYYY-NN
- Title: Short, specific (e.g., "Revenue excludes App Store iOS data")
- Scope: Dataset(s)/Dashboard(s) impacted
- Impact: What changes in interpretation (quantify if possible)
- Severity: High | Medium | Low
- Timeframe: Since/Until (or ongoing); refresh cadence
- Mitigation/Workaround: How to use data safely
- Owner & Review date: Who keeps this current and when it will be checked
Why each field matters
- ID: lets you reference the same issue across assets.
- Scope: avoids over- or under-applying the warning.
- Impact: translates tech details into decision risk.
- Severity: sets urgency and visibility.
- Timeframe: lets stakeholders know if/when it may change.
- Mitigation: keeps analytics useful despite limits.
- Owner/Review: prevents stale warnings.
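If you keep the register in a shared repo or notebook, the template above can also be stored in a machine-readable form so the same entry can feed dashboard notes and review reminders. Below is a minimal sketch using a Python dataclass; the class and field names simply mirror the template and are illustrative, not a required schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class Limitation:
    """One entry in a limitations register; fields mirror the template above."""
    id: str                   # e.g. "L-2025-01"
    title: str                # short, specific description
    scope: list[str]          # datasets/dashboards impacted
    impact: str               # decision risk, quantified where possible
    severity: str             # "High" | "Medium" | "Low"
    since: date               # when the limitation started
    until: Optional[date]     # expected end, or None if ongoing
    mitigation: str           # how to use the data safely
    owner: str                # team or person keeping this current
    review_date: date         # when the entry will be checked again

# Hypothetical entry showing how the fields fit together
example = Limitation(
    id="L-2025-02",
    title="Revenue excludes iOS in-app purchases",
    scope=["Revenue dashboard", "Total Revenue KPI"],
    impact="Understates total revenue by ~12-15% in mobile-heavy segments",
    severity="High",
    since=date(2024, 12, 10),
    until=date(2025, 2, 15),
    mitigation="Use web-only revenue for channel goals until resolved",
    owner="Finance Data",
    review_date=date(2025, 1, 20),
)
```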
Worked examples
Example 1 — Session sampling in web analytics
ID: L-2025-01
Title: GA4 sessions sampled above 10M events
Scope: Web Traffic Dashboard, GA4 Sessions metric
Impact: High-traffic days may undercount sessions by ~3–7%. Trend direction still reliable; day-to-day deltas are noisy.
Severity: Medium
Timeframe: Ongoing for ad-heavy campaigns; review quarterly
Mitigation: Use weekly rollups for decisions; avoid single-day comparisons. For precise counts, request an unsampled extract from the BigQuery export.
Owner/Review: Web Analytics, next review 2025-03-31
Example 2 — Revenue excludes a channel
ID: L-2025-02
Title: Revenue excludes iOS in-app purchases
Scope: Revenue dashboard, Total Revenue KPI
Impact: Understates total revenue by ~12–15% in mobile-heavy segments; web-only growth rates remain valid.
Severity: High
Timeframe: Since 2024-12-10; fix expected by 2025-02-15
Mitigation: Use Web-only revenue split for channel goals. Avoid cross-platform totals until fix.
Owner/Review: Finance Data, review weekly until resolved
Example 3 — Optional customer attributes
ID: L-2025-03
Title: Missing customer industry on free-plan signups
Scope: B2B Segmentation dashboard
Impact: Industry-based conversion rates skew toward enterprises; SMBs underrepresented by ~20% among free-plan users.
Severity: Medium
Timeframe: Ongoing
Mitigation: Use only paid-plan data for industry comparisons; set industry to "Unknown" for free-plan signups instead of dropping rows.
Owner/Review: Growth Analytics, review quarterly
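One way to keep the three entries above discoverable is a small central register that any asset can query, so a single issue (one ID) can surface on every dashboard it touches. A minimal sketch with the entries condensed to a few fields; the structure is illustrative, and any table, sheet, or wiki page with the same fields works equally well.

```python
# Condensed register built from the three worked examples above.
REGISTER = [
    {"id": "L-2025-01", "title": "GA4 sessions sampled above 10M events",
     "scope": ["Web Traffic Dashboard"], "severity": "Medium"},
    {"id": "L-2025-02", "title": "Revenue excludes iOS in-app purchases",
     "scope": ["Revenue dashboard"], "severity": "High"},
    {"id": "L-2025-03", "title": "Missing customer industry on free-plan signups",
     "scope": ["B2B Segmentation dashboard"], "severity": "Medium"},
]

def limitations_for(asset: str) -> list:
    """Return every register entry whose scope mentions the given dashboard or dataset."""
    return [entry for entry in REGISTER if asset in entry["scope"]]

print(limitations_for("Revenue dashboard"))  # -> [{'id': 'L-2025-02', ...}]
```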
How to write clear limitation notes
- Inventory sources: list datasets, refresh times, known gaps.
- Classify: map each issue to High/Medium/Low based on decision impact.
- Quantify: estimate effect size (%, days of delay, segments excluded).
- Place notes: add to dashboard sections and the central register (see the sketch after this list).
- Set ownership: who updates, and a review date.
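For the "place notes" and "set ownership" steps, it can help to generate the visible note and the review reminder from the same register entry. A hedged sketch, assuming a dict-per-entry register like the one above; the banner wording and field names are assumptions to adapt to your BI tool.

```python
from datetime import date
from typing import Optional

def banner_text(entry: dict) -> str:
    """Render a one-line warning suitable for a dashboard text tile next to the metric."""
    return (f"LIMITATION {entry['id']} ({entry['severity']}): {entry['title']}. "
            f"{entry['impact']} Mitigation: {entry['mitigation']}")

def overdue_reviews(register: list, today: Optional[date] = None) -> list:
    """Return entries whose review date has passed, so owners can refresh or close them."""
    today = today or date.today()
    return [e for e in register if e["review_date"] < today]

entry = {
    "id": "L-2025-02", "severity": "High",
    "title": "Revenue excludes iOS in-app purchases",
    "impact": "Total revenue understated by ~12-15% in mobile-heavy segments.",
    "mitigation": "Use web-only revenue for channel goals until the fix lands.",
    "review_date": date(2025, 1, 20),
}
print(banner_text(entry))
print(overdue_reviews([entry], today=date(2025, 2, 1)))  # entry is past its review date
```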
Mini task — Classify three issues
- Daily sales refresh at 06:00 UTC (decisions made at 10:00) — Medium
- Marketing UTM parsing drops unknown mediums — High if budget allocation relies on it
- Minor spelling variants in product name field — Low if only used for search
Checklist before you publish
- Title is specific and short.
- Impact explains decision risk in plain language.
- Numbers are estimated where possible.
- Severity is assigned and justified.
- Scope lists exact dashboards/datasets affected.
- Owner and review date are set.
- Note appears where users see the metric (not just in a separate doc).
Common mistakes and self-check
- Vague language ("may be inaccurate"). Instead: specify how and by how much.
- No timeframe. Add since/until or review date.
- No owner. Assign a team/person to keep it current.
- Burying notes. Display near the metric and keep a central register.
- Mixing hypotheses with facts. Label assumptions as "hypothesis" until validated.
- Not updating after fixes. Close the note with a resolution date and version.
Exercises
Do these in order. The quick test is available to everyone; if you log in, your progress is saved.
Exercise 1 (ex1) — Write a limitation entry
Scenario: Orders placed after 22:00 local time appear in reports the next day due to ETL timing. Sales leaders view daily dashboards at 09:00.
Task: Use the template to write a clear limitation note. Include an impact estimate and mitigation.
Exercise 2 (ex2) — Improve the language
Draft note: "Leads data is unreliable sometimes due to CRM issues."
Task: Rewrite this to be decision-ready. Add scope, quantified impact, timeframe, and owner.
Practical projects
- Create a limitations register for your top 5 dashboards. Assign IDs, owners, and review dates.
- Add inline limitation banners to two KPIs (title, impact, mitigation). Ask two stakeholders if the note changed their interpretation.
- Quantify one limitation using a backfill or side-by-side comparison (e.g., sampled vs. unsampled). Update the impact field with numbers.
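For the last project, the side-by-side comparison can be as simple as joining the two daily series and measuring the relative gap. A minimal pandas sketch with made-up numbers; in practice the two frames would come from your sampled report and an unsampled export, and the column names here are hypothetical.

```python
import pandas as pd

# Hypothetical daily session counts from a sampled report and an unsampled export.
sampled = pd.DataFrame({
    "day": pd.date_range("2025-01-01", periods=3, freq="D"),
    "sessions": [98_000, 121_500, 87_200],
})
unsampled = pd.DataFrame({
    "day": pd.date_range("2025-01-01", periods=3, freq="D"),
    "sessions": [101_300, 128_900, 88_000],
})

# Join on day and compute the relative undercount per day.
compare = sampled.merge(unsampled, on="day", suffixes=("_sampled", "_unsampled"))
compare["pct_gap"] = (
    (compare["sessions_unsampled"] - compare["sessions_sampled"])
    / compare["sessions_unsampled"] * 100
)
print(compare[["day", "pct_gap"]].round(1))
print(f"Undercount range: {compare['pct_gap'].min():.1f}-{compare['pct_gap'].max():.1f}%")
```

The resulting range is the number that goes into the Impact field, replacing a vague "may undercount sessions".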
Learning path
- Before this: Data profiling and freshness checks.
- Now: Documenting Known Data Limitations (this lesson).
- Next: Data lineage basics, anomaly detection alerts, and validation tests to prevent regressions.
Mini challenge
Pick one high-visibility KPI. Add a limitation note that includes scope, impact with a number, and a mitigation. Share with a stakeholder and ask: "What would you do differently knowing this?" Revise the note based on their feedback.
Next steps
- Publish at least one limitation note today.
- Set calendar reminders for review dates.
- Take the quick test below to check your understanding.