luvv to helpDiscover the Best Free Online Tools
Topic 6 of 8

Automated Checks And Alerts Basics

Learn Automated Checks And Alerts Basics for free with explanations, exercises, and a quick test (for BI Analyst).

Published: December 22, 2025 | Updated: December 22, 2025

Why this matters

Automated checks and alerts keep dashboards trustworthy without manual babysitting. As a BI Analyst, you will:

  • Catch broken pipelines early (freshness/volume checks) before stakeholders notice.
  • Detect data drift (spikes in nulls or outliers) that can mislead reports.
  • Escalate issues with clear, actionable alerts and reduce alert fatigue.
  • Log results for audits and continuous improvement.

Concept explained simply

An automated check is a small test that runs on a schedule to verify data assumptions (like “no null emails” or “orders arrive hourly”). An alert is a message sent when a check breaks a threshold.

Mental model

Think of a building with smoke detectors:

  • Detector = Check logic (e.g., null rate > 1%).
  • Polling = Schedule (e.g., hourly).
  • Alarm = Alert sent to the right people with context.
  • Logbook = Monitoring table that stores every run.

Core building blocks

1) Check types you’ll use often
  • Freshness: last data timestamp within X time.
  • Volume: row count within expected range.
  • Null/duplicate rate: percent of bad records below threshold.
  • Range/constraint: values within valid bounds, referential integrity holds.
  • Drift: distribution change versus baseline (simple: mean/median shift).
2) Thresholds
  • Static: fixed limits (e.g., nulls < 1%).
  • Dynamic: based on history (moving average, MAD, seasonality).
3) Scheduling
  • Align with data arrival (e.g., 15 minutes after ETL).
  • Use maintenance windows to mute during known outages.
4) Alert routing
  • Severity levels (info, warning, critical).
  • Channels (email/ChatOps/on-call). Send only what matters to each audience.
5) Logging
  • Write every run to a monitoring table: check_name, status, metrics, runtime, error_message.
  • Track false positives to tune thresholds.

Worked examples

Example 1 — Freshness check (orders)

Goal: Confirm the latest ingested record is not older than 2 hours.

-- Postgres-style SQL
SELECT
  MAX(ingested_at) AS last_ingest,
  NOW() AS checked_at,
  EXTRACT(EPOCH FROM NOW() - MAX(ingested_at))/3600.0 AS hours_delayed,
  CASE WHEN MAX(ingested_at) < NOW() - INTERVAL '2 hours' THEN TRUE ELSE FALSE END AS is_stale
FROM analytics.orders;

Alert if is_stale = TRUE. Include hours_delayed and affected dashboards in the alert message.

Example 2 — Null rate check (customer_email)

Goal: Keep null emails below 1% for yesterday’s data.

WITH stats AS (
  SELECT
    COUNT(*) AS total_rows,
    SUM(CASE WHEN customer_email IS NULL OR customer_email = '' THEN 1 ELSE 0 END) AS nulls
  FROM analytics.customers
  WHERE created_at >= CURRENT_DATE - INTERVAL '1 day'
    AND created_at < CURRENT_DATE
)
SELECT
  total_rows,
  nulls,
  ROUND(100.0 * nulls / NULLIF(total_rows, 0), 2) AS null_rate_pct,
  CASE WHEN 100.0 * nulls / NULLIF(total_rows, 0) > 1.0 THEN TRUE ELSE FALSE END AS breach
FROM stats;

Alert if breach = TRUE. Guard against divide-by-zero with NULLIF.

Example 3 — Volume anomaly (moving baseline)

Goal: Flag today’s order count if it deviates from the last 7 days by more than 3 standard deviations.

WITH daily AS (
  SELECT
    DATE(created_at) AS d,
    COUNT(*) AS cnt
  FROM analytics.orders
  WHERE created_at >= CURRENT_DATE - INTERVAL '8 days'
  GROUP BY 1
),
baseline AS (
  SELECT AVG(cnt) AS mean_cnt, STDDEV_SAMP(cnt) AS sd
  FROM daily
  WHERE d < CURRENT_DATE
),
today AS (
  SELECT cnt FROM daily WHERE d = CURRENT_DATE
)
SELECT
  t.cnt AS today_cnt,
  b.mean_cnt,
  b.sd,
  CASE WHEN b.sd IS NULL THEN FALSE
       WHEN t.cnt < b.mean_cnt - 3*b.sd OR t.cnt > b.mean_cnt + 3*b.sd THEN TRUE
       ELSE FALSE END AS is_anomaly;

If your data is sparse, use median/MAD or a ±20% band instead of standard deviation.

Design alerts people read

  • Keep messages short and actionable.
  • Include: what broke, severity, metric value vs. threshold, scope (table/partition), suspected impact, and owner.
  • Link to your monitoring dashboard or runbook name (mention, don’t include URLs here).
Alert template
[CRITICAL] Freshness breach — analytics.orders
Observed: hours_delayed=3.4 (threshold <= 2)
Scope: table=analytics.orders, partition=today
Impact: Order dashboards may be stale
Action: Run job 'orders_ingest' then re-run check 'orders_freshness'
Owner: Data Platform On-call

Setup steps (quick start)

  1. Pick 3 checks: freshness (core tables), volume (daily), null rate (key columns).
  2. Create a monitoring table with columns: check_name, run_at, status, metric, threshold, details, error_message.
  3. Write SQL for each check; insert a row into the monitoring table with the result and status.
  4. Schedule checks right after ETL completes; add maintenance windows if needed.
  5. Route alerts by severity: warnings to channel, critical to on-call.
  6. Dry-run for a week; adjust thresholds to cut false positives.

Exercises

These mirror the tasks below in the Exercises panel. Do them in SQL or your BI warehouse.

Exercise 1 — Duplicate rate (orders)

Write a query to compute duplicates for order_id in the last 24 hours. Return:

  • duplicate_count
  • duplicate_rate_pct (0–100)

Alert if duplicate_rate_pct > 0.10 (0.10%).

  • Checklist:
    • Uses a 24-hour time filter.
    • Handles empty sets safely.
    • One-row numeric output.
Exercise 2 — Dynamic volume threshold

Create a query that flags today as anomaly if today’s row count deviates more than 20% from the average of the previous 7 days.

  • Checklist:
    • Excludes today from baseline.
    • Computes band = mean ± 20%.
    • Outputs is_anomaly boolean.

Common mistakes and self-check

  • Missing time windows: Checks run before data lands. Self-check: Is schedule offset from ETL finish?
  • Hard thresholds everywhere: Leads to alert fatigue. Self-check: Do at least volume checks use a baseline?
  • No logs: Only notifications, no history. Self-check: Can you show a 30-day trend per check?
  • Unhandled errors: Query failure means no alert. Self-check: Do you alert on check errors as critical?
  • Vague alerts: No metric values or actions. Self-check: Does your message include metric vs threshold and next steps?

Who this is for and prerequisites

  • Who: BI Analysts, Analytics Engineers, Data-savvy PMs who own dashboards.
  • Prerequisites: Basic SQL (GROUP BY, CASE), understanding of your ETL schedule, access to the warehouse.

Practical projects

  • Project 1: Monitoring table + 3 core checks (freshness, volume, null rate) for your top two tables.
  • Project 2: Weekly alert digest summarizing all breaches, top noisy checks, and tuning suggestions.
  • Project 3: Severity-based routing with maintenance windows, plus a simple runbook per check.

Learning path

  • Start: Static thresholds on key checks.
  • Next: Dynamic baselines (moving average, median/MAD).
  • Then: Partition-level checks (per region/day) and referential integrity checks.
  • Advanced: Seasonality-aware baselines and incident postmortems.

Mini challenge

Pick one dashboard your stakeholders use daily. Implement two checks that protect it (freshness and volume), add a clear alert message, and run them for 7 days. Tune thresholds to reduce false positives by at least 50%.

Next steps

  • Do the exercises below, then take the Quick Test at the end.
  • Note: The test is available to everyone; only logged-in users will see saved progress.

Practice Exercises

2 exercises to complete

Instructions

Write a SQL query to compute duplicates for order_id in the last 24 hours. Output a single row with:

  • duplicate_count
  • duplicate_rate_pct (0–100, two decimals)

Alert if duplicate_rate_pct > 0.10 (0.10%). Handle empty result sets safely.

Expected Output
One row: duplicate_count >= 0; duplicate_rate_pct between 0 and 100 (e.g., duplicate_count=3, duplicate_rate_pct=0.12).

Automated Checks And Alerts Basics — Quick Test

Test your knowledge with 8 questions. Pass with 70% or higher.

8 questions70% to pass

Have questions about Automated Checks And Alerts Basics?

AI Assistant

Ask questions about this tool