Why this matters
As a BI Developer, you translate business questions into reliable dashboards and data products. Getting the refresh cadence (how often data updates) and latency (how quickly new data appears) right prevents wrong decisions, broken trust, and wasted compute. Real tasks include setting daily 7:00 AM executive refreshes, configuring 15-minute intraday sales updates, and defining end-of-month financial lock windows.
Concept explained simply
Refresh cadence is the schedule for when data updates (every 5 minutes, hourly at :15, daily at 06:00, etc.). Latency is the time between a real-world event and when it’s visible in your report (e.g., a sale made at 10:03 shows up by 10:10, so ~7 minutes latency).
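The latency arithmetic in the example can be sketched with Python's standard `datetime` module (the date itself is hypothetical, only the clock times come from the example):

```python
from datetime import datetime

# A sale occurs at 10:03 and becomes visible in the report at 10:10.
event_ts = datetime(2024, 1, 15, 10, 3)     # hypothetical date, for illustration
visible_at = datetime(2024, 1, 15, 10, 10)

# Latency = time from real-world event to visibility in the BI layer.
latency_minutes = (visible_at - event_ts).total_seconds() / 60
print(latency_minutes)  # 7.0
```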
Mental model
Think in two clocks:
- Business clock: When do people make decisions? (shift start, market open, board meeting, nightly ops)
- Data clock: When does source data arrive and stabilize? (streaming events, hourly batches, vendor APIs with delays)
Your job is to align these clocks with a clear Service Level Objective (SLO): “By 07:00 daily, dashboard is refreshed with data complete through 23:59 prior day; max latency 60 minutes.”
Key definitions and metrics
- Refresh cadence: The update schedule (cron or event-driven).
- Latency: Time from event occurrence to availability in the BI layer.
- Freshness (max age): Now minus the latest data timestamp shown.
- Completeness: Expected coverage (e.g., 99% of orders within 2 hours).
- Window: Time period covered (e.g., “complete through previous day”).
- Blackout/lock window: Periods when data should not refresh (e.g., finance close).
- SLA/SLO: What you commit to stakeholders (SLO is your target; SLA is a formal promise if used).
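The completeness metric above ("99% of orders within 2 hours") can be computed from (event, load) timestamp pairs. A minimal sketch with made-up sample data:

```python
from datetime import datetime, timedelta

# Hypothetical (event_ts, loaded_at) pairs for a batch of orders.
orders = [
    (datetime(2024, 1, 15, 9, 0),  datetime(2024, 1, 15, 9, 45)),
    (datetime(2024, 1, 15, 9, 10), datetime(2024, 1, 15, 10, 5)),
    (datetime(2024, 1, 15, 9, 20), datetime(2024, 1, 15, 12, 0)),  # late arrival
]

# Completeness = share of events loaded within the agreed window.
window = timedelta(hours=2)
on_time = sum(1 for event_ts, loaded_at in orders if loaded_at - event_ts <= window)
completeness_pct = 100 * on_time / len(orders)
print(round(completeness_pct, 1))  # 66.7
```

In practice you would compute this over a real fact table rather than an in-memory list; the shape of the calculation is the same.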
Decision framework (5 steps)
- Identify decision moments: When exactly do users act? Capture deadlines (e.g., 06:30 warehouse staffing).
- Define tolerances: Maximum acceptable latency and data staleness. Clarify if approximate early data is OK.
- Map source realities: Ingestion type (stream, micro-batch, nightly), vendor API delays/restatements, dependencies.
- Choose cadence and safeguards: Cron schedule or event trigger, retry rules, freshness checks, alerts, and blackout windows.
- Write a clear SLO: One or two sentences anyone can read and verify.
Example SLO template
“The {Report} refreshes {Cadence} at {Times}. Data is complete through {Window}. New events appear within {Latency}. We alert if freshness exceeds {Threshold}. No refresh between {Blackout}.”
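The template is just a fill-in-the-blanks string, so it can live in code alongside the dashboard's config. A sketch using Python's `str.format`, filled with the executive-KPI values from the first worked example below:

```python
# The SLO template from the text, expressed as a format string.
slo_template = (
    "The {report} refreshes {cadence} at {times}. Data is complete through {window}. "
    "New events appear within {latency}. We alert if freshness exceeds {threshold}. "
    "No refresh between {blackout}."
)

slo = slo_template.format(
    report="Executive KPIs",
    cadence="daily",
    times="06:45",
    window="23:59 prior day",
    latency="8 hours",
    threshold="10 hours",
    blackout="07:15 and 08:00",
)
print(slo)
```

Keeping the SLO as data makes it easy to render the same sentence in the dashboard description and in alert messages.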
Worked examples
1) Executive KPIs (daily)
Need: CEO reviews at 07:30. Prior day must be complete; partial data is harmful.
- Source: ERP closes prior day by 02:00.
- Cadence: Daily at 06:45.
- Latency: Up to 4–5 hours overnight is fine; accuracy over speed.
- SLO: “By 06:45 daily, KPIs include data through 23:59 prior day; max allowed data age 8h. Alert if freshness > 10h.”
- Blackout: No refresh 07:15–08:00 to avoid mid-meeting flips.
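A blackout window like the 07:15–08:00 one above is simple to enforce as a guard in the refresh job. A minimal sketch (the function name is hypothetical, and this version assumes the window does not cross midnight):

```python
from datetime import time

def refresh_allowed(now: time,
                    blackout_start: time = time(7, 15),
                    blackout_end: time = time(8, 0)) -> bool:
    """Return False during the blackout window so a scheduler can skip the run."""
    return not (blackout_start <= now < blackout_end)

print(refresh_allowed(time(6, 45)))  # True: the 06:45 refresh proceeds
print(refresh_allowed(time(7, 30)))  # False: inside the 07:15-08:00 blackout
```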
2) Operations orders (intraday)
Need: Floor managers rebalance every 30 minutes; near real-time helpful.
- Source: CDC stream with 2–5 minute lag; occasional spikes to 10 minutes.
- Cadence: Every 15 minutes at :00, :15, :30, :45.
- Latency: New orders visible within 10 minutes under normal load.
- SLO: “Quarter-hourly refresh; 95% of new orders appear within 10 minutes; alert if latency > 15 minutes for two consecutive runs.”
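The "two consecutive runs" alert condition avoids paging on a single spiky run. A minimal sketch of that logic (function name and thresholds are illustrative, taken from the SLO above):

```python
def should_alert(latencies_minutes, threshold=15, consecutive=2):
    """Alert only when latency exceeds the threshold for N consecutive runs."""
    streak = 0
    for latency in latencies_minutes:
        streak = streak + 1 if latency > threshold else 0
        if streak >= consecutive:
            return True
    return False

print(should_alert([8, 17, 9, 16]))  # False: breaches are isolated
print(should_alert([8, 17, 18, 9]))  # True: two consecutive breaches
```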
3) Paid media spend (vendor API)
Need: Marketers adjust bids twice daily; vendor restates data up to 72 hours later.
- Source: Ads APIs with 3–12 hour delays; restatements common.
- Cadence: 4x per day (06:00, 12:00, 18:00, 00:00).
- Latency: Accept up to 12 hours for fresh spend; nightly backfill handles restatements.
- SLO: “Spend updates every 6 hours; restatements backfilled nightly for up to 3 days; show a ‘preliminary’ badge on data < 24 hours old.”
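The "preliminary" badge rule is just an age check against the restatement window. A sketch, assuming a hypothetical helper and reference time:

```python
from datetime import datetime, timedelta

def spend_badge(event_ts: datetime, now: datetime,
                preliminary_window: timedelta = timedelta(hours=24)) -> str:
    """Label spend rows that are still inside the vendor's restatement window."""
    return "preliminary" if now - event_ts < preliminary_window else "final"

now = datetime(2024, 1, 15, 12, 0)  # hypothetical reference time
print(spend_badge(datetime(2024, 1, 15, 2, 0), now))   # preliminary (10h old)
print(spend_badge(datetime(2024, 1, 13, 12, 0), now))  # final (48h old)
```

In a BI tool the same rule usually becomes a calculated column feeding a conditional format rather than application code.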
How to measure and monitor
Compute freshness and latency using source timestamps and load times.
Simple SQL freshness check
```sql
-- Example fields: event_ts (UTC), loaded_at (UTC)
SELECT
  NOW() AS now_utc,
  MAX(event_ts) AS latest_event,
  EXTRACT(EPOCH FROM (NOW() - MAX(event_ts))) / 60 AS freshness_minutes
FROM analytics.fact_orders;
```
Latency distribution (recent 24h)
```sql
SELECT
  DATE_TRUNC('hour', event_ts) AS hour,
  PERCENTILE_CONT(0.95) WITHIN GROUP (
    ORDER BY EXTRACT(EPOCH FROM (loaded_at - event_ts))
  ) AS p95_seconds
FROM staging.orders
WHERE event_ts > NOW() - INTERVAL '24 hours'
GROUP BY 1
ORDER BY 1;
```
- Alert rule: If freshness_minutes exceeds the target by more than 20%, page the data on-call engineer or notify the team channel.
- Dashboard badge: “Updated at HH:MM; data complete through YYYY-MM-DD.”
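The alert rule above ("target plus 20%") reduces to one comparison; a minimal sketch with an illustrative function name:

```python
def freshness_breached(freshness_minutes: float, target_minutes: float,
                       tolerance: float = 0.20) -> bool:
    """Flag when freshness exceeds the target by more than the tolerance."""
    return freshness_minutes > target_minutes * (1 + tolerance)

print(freshness_breached(70, 60))  # False: 70 <= 72 (60 * 1.2)
print(freshness_breached(75, 60))  # True: 75 > 72
```

Feeding it the `freshness_minutes` value from the SQL check above closes the loop between measurement and alerting.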
Exercises
Exercise 1: Logistics morning ops
Task: Propose cadence, latency, and SLO for a warehouse operations dashboard.
- Decision moment: Shift briefing at 06:15 daily.
- Sources: WMS posts final prior-day counts by 02:30; inbound shipments stream with ~5–8 min lag.
- Tolerance: Prior-day must be final; intraday inbound can be near real-time.
- Constraints: ETL jobs take ~12 minutes; avoid refresh during 06:10–06:20.
Expected output: A short SLO, a schedule (cron-like description), and alert conditions.
Example solution
SLO: “By 06:05 daily, dashboard includes finalized prior-day counts. Inbound shipments appear within 10 minutes during operating hours. No refresh 06:10–06:20.”
Cadence: Daily job at 05:50 for prior-day; intraday micro-batch every 10 minutes 07:00–22:00.
Latency: Target <= 10 minutes for inbound; alert if freshness > 20 minutes for two consecutive cycles.
Exercise 2: Retail promo performance
Task: Define cadence, latency, and restatement policy for promo analytics.
- Decision moment: Marketers adjust budgets at 10:00 and 16:00.
- Sources: POS batches hourly at :20; Ads APIs delayed up to 6 hours; restatements up to 48 hours.
- Tolerance: Preliminary data is acceptable if labeled.
Expected output: A schedule covering both sources, a “preliminary” rule, and a backfill policy.
Example solution
Cadence: POS refresh hourly at :30; Ads refresh 4x (03:00, 09:00, 15:00, 21:00). Merge-model runs at 09:10 and 15:10 to support decision times.
SLO: “By 09:10/15:10, blended metrics updated; Ads data under 24h marked ‘preliminary’. Nightly backfill captures 48h restatements.”
Alerts: If blended freshness > 8 hours during 07:00–17:00, notify the team channel.
Checklist before you finalize
- Have you tied the schedule to a real decision time?
- Is the acceptable latency written as a number?
- Have you accounted for source delays and restatements?
- Do you specify freshness/latency alerts?
- Are blackout windows documented?
- Is the SLO one or two sentences others can verify?
Common mistakes and self-check
- Mistake: “Real-time” everywhere. Fix: Use decision moments to justify speed; prefer simple cadences when possible.
- Mistake: Ignoring vendor restatements. Fix: Add nightly backfills and “preliminary” labeling.
- Mistake: No alerts. Fix: Add freshness checks and clear thresholds.
- Mistake: Cadence not aligned with compute windows. Fix: Schedule after source availability; include retries.
- Mistake: Mid-meeting refresh surprises. Fix: Add blackout windows around key meetings.
Self-check prompt you can use
Read your SLO out loud. Can a non-technical stakeholder say: “I know when it updates, how fresh it is, and what happens if it’s late”?
Practical projects
- Build a freshness monitor that computes max age for 3 critical tables and renders a small status tile in your BI tool.
- Design and document SLOs for two dashboards: one daily executive, one intraday ops. Include alert rules and blackout windows.
- Implement a restatement backfill for a vendor API with a 72-hour correction window and add a “preliminary” badge in visuals.
Learning path
- Clarify user decisions and timing.
- Map source arrival patterns and constraints.
- Draft SLO with cadence, latency, coverage window, and alerts.
- Pilot the schedule; measure freshness/latency.
- Iterate and document in the dashboard description.
Who this is for
- BI Developers and Analysts defining dashboard update schedules.
- Analytics Engineers owning data models and SLAs/SLOs.
- Data PMs coordinating stakeholder expectations.
Prerequisites
- Basic SQL (aggregations, time functions).
- Familiarity with your orchestration tool’s scheduling (e.g., cron-style).
- Understanding of data source availability patterns.
Next steps
- Write SLOs for one of your existing dashboards and add freshness badges.
- Set up a simple alert for when freshness exceeds your threshold.
- Take the quick test below to validate your understanding.
Mini challenge
You support a customer support dashboard used at 09:00, 13:00, and 17:00. Tickets stream in with 2–4 minute lag; CRM user attributes sync hourly at :40. Propose a cadence, latency target, and a note on preliminary data. Keep it to 2 sentences.