Why this matters
As a BI Analyst, you turn questions into reliable dashboards and insights. Mapping data sources and ownership ensures you know where each metric comes from, who is responsible for its quality, and how changes will be managed. This prevents broken dashboards, wrong KPIs, and surprise outages.
- Real tasks you will do: identify systems feeding a report, confirm the official owner for each dataset, capture refresh schedules and SLAs, document lineage from raw to report, and escalate issues to the right steward.
- Outcome: a clear register that shows what data is used, who owns it, how trustworthy it is, and what to do when something changes.
Concept explained simply
Mapping data sources and ownership means keeping a living inventory of every input to your report, plus the people accountable for each piece. You record the system name, table or field, refresh timing, quality notes, and the person/role responsible for accuracy and availability.
Mental model
Think of a dinner menu. Each dish (metric) depends on ingredients (data sources) delivered by specific suppliers (owners). If an ingredient is late or wrong, you know exactly which supplier to call and what to substitute. Your data source map is the kitchen board that tracks all of this.
Core steps to map data sources and ownership
- List business questions and metrics. What are we answering? e.g., Weekly Sales, Active Users, Churn Rate.
- Identify candidate sources. Systems, feeds, tables, APIs, files. Capture system name and data type.
- Trace lineage. From source to staging to transformations to final report fields. Note ETL/ELT jobs and schedules.
- Assign ownership. Business owner (accountable for definition), technical owner (system/data pipeline), and data steward (quality). When unclear, propose a RACI and get sign-off.
- Record refresh and SLAs. Frequency, latency, expected delivery window, and what happens if late.
- Classify sensitivity and compliance. PII/PHI, contractual data, masking rules, retention.
- Validate with stakeholders. Walk through the register, confirm definitions, and agree on escalation paths.
Minimal template you can copy
Use these fields for each source/field:
- Use case / Metric
- System / Source name
- Object (table/view/file) and key fields
- Lineage (source → staging → model → report)
- Refresh cadence & window (e.g., hourly, 15:00–15:15 UTC)
- Data quality notes (known gaps, null rules, duplicates)
- Business owner (role + contact)
- Technical owner (role + contact)
- Data steward (role + contact)
- Sensitivity (Public/Internal/Confidential/Restricted)
- SLA and escalation path
- Change control (who approves schema/logic changes)
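If you prefer to keep the register in code rather than a spreadsheet, the template fields can be sketched as a small Python dataclass. This is only an illustration: the class name `SourceRecord`, the field names, and the helper `missing_fields` are assumptions made for this sketch, not a standard. The sample values come from Example 1 (Daily Orders) in this lesson, and fields the example does not state are left empty on purpose.

```python
from dataclasses import dataclass, fields


@dataclass
class SourceRecord:
    """One row of the register. Field names are illustrative, not a standard."""
    metric: str
    system: str
    source_object: str
    lineage: list[str]
    refresh: str
    quality_notes: str
    business_owner: str
    technical_owner: str
    data_steward: str
    sensitivity: str = ""
    sla_escalation: str = ""
    change_control: str = ""


def missing_fields(record: SourceRecord) -> list[str]:
    """Return the names of template fields that are still empty."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


# Values taken from the Daily Orders example; sensitivity and change
# control are not stated there, so they stay empty.
daily_orders = SourceRecord(
    metric="Total Orders Yesterday",
    system="eCommerce DB",
    source_object="orders table",
    lineage=["ecommerce.orders", "DWH.stg_orders", "DWH.fact_orders", "BI metric"],
    refresh="Daily at 03:00 UTC",
    quality_notes="Orders cancelled within 24h excluded from metric",
    business_owner="Head of Sales Ops",
    technical_owner="Data Engineering Team",
    data_steward="Data Governance",
    sla_escalation="SLA by 03:30 UTC",
)

print(missing_fields(daily_orders))  # ['sensitivity', 'change_control']
```

Listing the empty fields gives you a ready-made agenda for the next conversation with the owners.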
Worked examples
Example 1: Sales Dashboard (Daily Orders)
- Metric: Total Orders Yesterday
- Sources: eCommerce DB (orders table), Payments API (settlements)
- Lineage: ecommerce.orders → DWH.stg_orders → DWH.fact_orders → BI metric
- Ownership: Business owner: Head of Sales Ops; Technical owner: Data Engineering Team; Steward: Data Governance
- Refresh: Daily at 03:00 UTC, SLA by 03:30 UTC
- Notes: Orders cancelled within 24h excluded from metric
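To make the SLA in this example checkable, a minimal Python sketch could compare a load's completion time against the 03:30 UTC deadline. The function name `met_sla` and the sample timestamps are hypothetical. In practice the completion time would come from your pipeline's run log, which this sketch does not model.

```python
from datetime import date, datetime, time, timedelta, timezone

SLA_DEADLINE_UTC = time(3, 30)  # from the example: "SLA by 03:30 UTC"


def met_sla(run_date: date, loaded_at: datetime) -> bool:
    """True if the load for run_date finished at or before the UTC deadline."""
    if loaded_at.tzinfo is None:
        # Naive timestamps are ambiguous, so refuse to guess the timezone.
        raise ValueError("loaded_at must be timezone-aware")
    deadline = datetime.combine(run_date, SLA_DEADLINE_UTC, tzinfo=timezone.utc)
    return loaded_at <= deadline


# Hypothetical completion times for one run date.
run = date(2024, 1, 2)
on_time = datetime(2024, 1, 2, 3, 12, tzinfo=timezone.utc)
late = datetime(2024, 1, 2, 3, 45, tzinfo=timezone.utc)
# 05:20 at UTC+02:00 is 03:20 UTC, so this load is still on time.
local = datetime(2024, 1, 2, 5, 20, tzinfo=timezone(timedelta(hours=2)))

print(met_sla(run, on_time), met_sla(run, late), met_sla(run, local))
# True False True
```

Requiring timezone-aware timestamps is a deliberate choice here: it forces the same "time plus timezone" precision the register asks for.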
Example 2: Customer 360 (Active Subscribers)
- Metric: Active Subscribers (last 30 days)
- Sources: CRM (accounts), Subscription Billing (subscriptions), Product Events (logins)
- Lineage: CRM.accounts + Billing.subscriptions + Events.login → stg_ tables → dim_customer, fact_subscription → BI metric
- Ownership: Business owner: Director of Customer Success; Technical owner: Data Platform Team; Steward: Subscription Ops
- Refresh: CRM hourly; Billing daily; Events near real-time
- Notes: Free trials excluded; definition approved in data contract v1.2
Example 3: Finance Close (Monthly Revenue by Region)
- Metric: GAAP Revenue
- Sources: ERP (GL & AR), Data warehouse model (fact_revenue)
- Lineage: ERP.GL, ERP.AR → stg_gl, stg_ar → fact_revenue → BI P&L
- Ownership: Business owner: Controller; Technical owner: ERP Admin + DE Finance Squad; Steward: Finance Data Steward
- Refresh: Daily, with month-end freeze window
- Notes: Adjustments posted after close appear in next period’s restatement
Quality and risk checks
- Every metric traces to a source object and transformation step
- A named business owner and technical owner exist for each source
- Refresh cadence and SLA are explicit and realistic
- Sensitivity is classified and masking rules (if any) are noted
- Known data quality issues and null handling rules are documented
- Escalation path is clear (who to ping when data is late/wrong)
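The checklist above can be partly automated. The sketch below assumes each register entry is a plain dictionary. The key names, the `REQUIRED` list, and the function `check_entry` are illustrative choices, and the sample entry uses only what Example 3 (Finance Close) states. It checks for presence only; whether an SLA is realistic still needs a human.

```python
REQUIRED = [
    "lineage",
    "business_owner",
    "technical_owner",
    "refresh",
    "sla",
    "sensitivity",
    "quality_notes",
    "escalation",
]
SENSITIVITY_LEVELS = {"Public", "Internal", "Confidential", "Restricted"}


def check_entry(entry: dict) -> list[str]:
    """Return a list of problems found in one register entry."""
    problems = [f"missing {key}" for key in REQUIRED if not entry.get(key)]
    level = entry.get("sensitivity")
    if level and level not in SENSITIVITY_LEVELS:
        problems.append(f"unknown sensitivity: {level}")
    return problems


# Entry built from the Finance Close example; SLA, sensitivity, and
# escalation are not stated there, so they are absent here.
finance_close = {
    "metric": "GAAP Revenue",
    "lineage": ["ERP.GL, ERP.AR", "stg_gl, stg_ar", "fact_revenue", "BI P&L"],
    "business_owner": "Controller",
    "technical_owner": "ERP Admin + DE Finance Squad",
    "refresh": "Daily, with month-end freeze window",
    "quality_notes": "Adjustments posted after close appear in next period's restatement",
}

print(check_entry(finance_close))
# ['missing sla', 'missing sensitivity', 'missing escalation']
```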
How to self-check your map
- Pick one metric and follow it backwards to raw data. If you hit a gap, fix the register.
- Ask the named owner to confirm: “Is this your data? Are you accountable for accuracy?”
- Trigger a hypothetical incident: “If today’s load fails, what do we do?”
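The first self-check, following a metric backwards to raw data, can be sketched as a walk over an upstream map. The node names reuse the Finance Close lineage from Example 3. The `UPSTREAM` dictionary, the `RAW_SOURCES` set, and the function `trace_back` are hypothetical, and the entry for `stg_ar` is left out deliberately to show how a gap surfaces.

```python
def trace_back(node, upstream, raw_sources):
    """Walk upstream from node; return (raw sources reached, gaps found)."""
    reached, gaps, seen = set(), set(), set()
    stack = [node]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if current in raw_sources:
            reached.add(current)
        elif upstream.get(current):
            stack.extend(upstream[current])
        else:
            # Not a raw source and nothing documented upstream: a gap.
            gaps.add(current)
    return reached, gaps


# Register lineage as "object -> what it is built from".
# The stg_ar entry is intentionally missing.
UPSTREAM = {
    "BI P&L": ["fact_revenue"],
    "fact_revenue": ["stg_gl", "stg_ar"],
    "stg_gl": ["ERP.GL"],
}
RAW_SOURCES = {"ERP.GL", "ERP.AR"}

reached, gaps = trace_back("BI P&L", UPSTREAM, RAW_SOURCES)
print(sorted(reached), sorted(gaps))  # ['ERP.GL'] ['stg_ar']
```

Any name that lands in `gaps` is a row to add or fix in the register before you rely on the metric.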
Exercises
Exercise 1: Build a source-ownership register for Weekly Sales Snapshot
- Identify sources: CRM (accounts), eCommerce (orders), Payments (settlements)
- Define owners: business, technical, steward
- Add lineage, refresh, SLA, sensitivity
- Describe escalation path for late data
Need a nudge?
Start by listing the metric: “Weekly Sales.” Determine which systems produce orders and payment confirmations. Who defines “sale”? Who runs the pipeline?
Exercise 2: Map lineage for Monthly Revenue by Region
- List end metric and target BI tiles
- Map: ERP → staging → transformations → fact table → BI
- Assign owners and stewards
- State close window rules and SLA
Tip
Finance processes often freeze data. Note when changes are allowed and who approves them.
Common mistakes and how to avoid them
- No named owner: A team name is not enough. Record a role with contact. If unclear, propose interim ownership and get sign-off.
- Ignoring refresh windows: “Daily” is vague. Specify time and timezone.
- Skipping sensitivity: Mark PII/Restricted data and note masking.
- Vague definitions: Write the metric rule (e.g., exclude cancellations within 24h) to avoid disputes.
- Not updating the register: Treat it as a living artifact. Reconfirm on changes or incidents.
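The "ignoring refresh windows" mistake is easy to catch mechanically. As a rough sketch, the check below flags refresh descriptions that lack a clock time or a timezone. It assumes the register writes times as `HH:MM` and timezones as the literal `UTC`, which is a convention chosen for this illustration. The function name `refresh_is_explicit` is hypothetical.

```python
import re

# Assumed convention: 24-hour HH:MM times and the literal token "UTC".
TIME_RE = re.compile(r"\b(?:[01]\d|2[0-3]):[0-5]\d\b")
ZONE_RE = re.compile(r"\bUTC\b")


def refresh_is_explicit(text: str) -> bool:
    """True if the refresh description names both a time and a timezone."""
    return bool(TIME_RE.search(text)) and bool(ZONE_RE.search(text))


for spec in ["Daily", "Daily at 03:00", "Daily at 03:00 UTC"]:
    print(spec, "->", refresh_is_explicit(spec))
# Daily -> False
# Daily at 03:00 -> False
# Daily at 03:00 UTC -> True
```

Cadences such as "near real-time" will also be flagged, so treat a `False` result as a prompt to review the entry, not as proof that it is wrong.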
Self-check prompt
For each metric, can you answer: Where does it come from? Who owns it? When is it refreshed? What happens if it breaks? If any answer is missing, your map isn’t ready.
Practical projects
- Source Register MVP: Build a simple spreadsheet or doc for one dashboard with 5–10 metrics. Include owners, refresh, and SLAs.
- Lineage Diagram: Draw a left-to-right flow from sources to BI for two metrics. Annotate refresh times and responsibilities.
- Data Contract Draft: For one critical table, define fields, allowed changes, refresh SLA, and approval roles.
Who this is for and prerequisites
- Who this is for: Aspiring or current BI Analysts, Analytics Engineers, and Product Analysts building or maintaining dashboards and reports.
- Prerequisites: Basic SQL, understanding of ETL/ELT concepts, familiarity with metrics and dimensions.
Learning path
- Clarify business questions and metric definitions.
- Map data sources and owners (this lesson).
- Define SLAs and escalation paths.
- Document lineage and quality checks.
- Validate with stakeholders and iterate.
Next steps
- Complete the exercises and compare with the solutions.
- Take the Quick Test to confirm your understanding.
- Apply the template to one live report in your environment.
Mini challenge
Pick a KPI you report today. In 10 minutes, write its source systems, one business owner, one technical owner, refresh time, and a single known data caveat. If you can’t list these confidently, you’ve found your first improvement.