
Anomaly Detection Basics

Learn Anomaly Detection Basics for free with explanations, exercises, and a quick test (for Analytics Engineers).

Published: December 23, 2025 | Updated: December 23, 2025

Who this is for

You build or maintain analytics pipelines and dashboards, and you want practical ways to catch spikes, drops, delays, and data drift before stakeholders do. Suitable for analytics engineers, BI developers, and data-savvy analysts.

Prerequisites

  • Basic SQL (GROUP BY, window functions)
  • Understanding of KPIs/metrics (counts, sums, rates)
  • Familiarity with time series concepts (daily/weekly granularity)

Why this matters

In real analytics engineering work, you will:

  • Detect broken pipelines when daily row counts drop to near zero.
  • Catch revenue spikes caused by duplicate loads before they land in executive reports.
  • Spot freshness issues when tables stop updating after a failed job.
  • Identify schema or distribution drift (e.g., suddenly 40% null emails) to prevent silent data quality degradation.

Anomaly detection is a safety net that triggers timely, actionable alerts with minimal noise.

Concept explained simply

An anomaly is a data point or pattern that departs from what’s normal for your metric. You define “normal” by a baseline (rules, statistics, or history). If a new value falls outside a tolerance band, you flag it.

Mental model

Think of each metric as a heartbeat. Healthy heartbeats vary but follow a rhythm (seasonality). You learn the rhythm from recent history and raise a flag when beats are too fast, too slow, or missing.

Core methods you can use today

1) Simple thresholds (fastest)

Define fixed min/max bounds from domain knowledge.

  • Pros: Easy, interpretable, no history needed.
  • Cons: Not adaptive to growth or seasonality.
  • Use when: You know safe ranges (e.g., email open_rate between 0 and 0.8).

-- Absolute threshold example
SELECT date, order_count,
       CASE WHEN order_count < 100 THEN 'ANOMALY' END AS flag
FROM daily_orders;

2) Z-score / standard deviation bands

Compare value to rolling mean and standard deviation. Flag when |z| > k (commonly 3).

  • Pros: Adapts to level changes, simple math.
  • Cons: Sensitive to outliers; assumes roughly normal noise.
  • Use when: Metric is fairly symmetric and stable in variance.

-- 28-day rolling baseline with z-score
WITH b AS (
  SELECT
    date,
    metric,
    AVG(metric) OVER (ORDER BY date ROWS BETWEEN 28 PRECEDING AND 1 PRECEDING) AS mu,
    STDDEV_POP(metric) OVER (ORDER BY date ROWS BETWEEN 28 PRECEDING AND 1 PRECEDING) AS sigma
  FROM daily_metric
)
SELECT *,
  CASE WHEN sigma > 0 AND ABS((metric - mu)/sigma) >= 3 THEN 'ANOMALY' END AS flag
FROM b;

3) IQR / robust bounds (for outliers)

Use median and interquartile range (IQR). Flag if value < Q1 - 1.5*IQR or > Q3 + 1.5*IQR.

  • Pros: Robust to extreme values.
  • Cons: Needs a history window; less sensitive to subtle shifts.
  • Use when: Data has outliers or is non-normal.
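The IQR rule can be sketched in Python using only the standard library (note that quartile conventions differ slightly across implementations, so warehouse PERCENTILE_CONT results may disagree at the margins):

```python
import statistics

def iqr_bounds(history, k=1.5):
    """Return (lower, upper) fences from Q1/Q3 of a history window."""
    # statistics.quantiles with n=4 yields the three quartile cut points
    q1, _, q3 = statistics.quantiles(history, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def is_iqr_anomaly(value, history, k=1.5):
    """Flag a value outside [Q1 - k*IQR, Q3 + k*IQR]."""
    lower, upper = iqr_bounds(history, k)
    return value < lower or value > upper
```

In a pipeline you would feed `history` from the trailing 30-day window (excluding today) and test today's value against the fences.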

4) Rolling seasonal baselines

Compare today to the same weekday or to a rolling 7/14/28-day window to respect weekly patterns.

  • Pros: Handles seasonality; reduces false positives.
  • Cons: Needs enough history; window choices matter.
  • Use when: You have day-of-week effects (traffic lower on weekends).
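As an illustrative sketch (Python, with the series held as a date-keyed dict; that structure is a stand-in for your metric table, not a standard API), a same-weekday baseline over the last 8 weeks might look like:

```python
from datetime import date, timedelta
from statistics import mean

def weekday_baseline(series, day, weeks=8):
    """Average of the same weekday over the previous `weeks` weeks.

    series: dict mapping date -> metric value (daily grain assumed).
    Returns None when no reference days are available.
    """
    refs = [series[day - timedelta(weeks=w)]
            for w in range(1, weeks + 1)
            if (day - timedelta(weeks=w)) in series]
    return mean(refs) if refs else None
```

Comparing today only to prior Saturdays (rather than to yesterday, a Friday) is what suppresses the weekend false positives described above.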

5) Percent change monitors

Flag if value changes by more than X% vs baseline (e.g., previous day or rolling mean).

  • Pros: Unit-free, business-friendly.
  • Cons: Can over-trigger on low volumes.
  • Use when: Stakeholders care about relative moves.
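A percent-change monitor with a volume floor can be sketched as follows (Python; the `min_baseline` guard against low-volume over-triggering is an illustrative choice, tune it to your metric):

```python
def pct_change_flag(value, baseline, threshold=0.35, min_baseline=50):
    """Flag relative moves larger than `threshold` vs a baseline.

    Returns 'ANOMALY', or None when within tolerance or when the
    baseline is too small to judge reliably.
    """
    if baseline is None or baseline < min_baseline:
        return None  # too little volume: relative moves are noise
    diff = (value - baseline) / baseline
    return 'ANOMALY' if abs(diff) >= threshold else None
```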

6) Freshness and volume checks

Monitor time since last update and daily row counts.

  • Pros: Essential for pipelines; very interpretable.
  • Cons: Doesn’t capture subtle data drift.
  • Use when: Ensuring SLAs and job success.
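In code terms the freshness check reduces to comparing elapsed time against the SLA; a minimal Python sketch (the 120-minute default is illustrative):

```python
from datetime import datetime, timedelta, timezone

def freshness_flag(last_updated_at, sla_minutes=120, now=None):
    """Return 'STALE' when the table has not updated within the SLA."""
    now = now or datetime.now(timezone.utc)
    age = now - last_updated_at
    return 'STALE' if age > timedelta(minutes=sla_minutes) else None
```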

7) Distribution drift (PSI/K-S) — simple view

Compare current distribution to historical. Flag if drift score exceeds threshold.

  • Pros: Catches silent schema/data changes.
  • Cons: More setup; choose bins and reference period carefully.
  • Use when: Monitoring critical categorical/continuous fields.
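A minimal PSI computation for categorical proportions (Python sketch; the epsilon for empty buckets and the bin choice are judgment calls, and the common 0.1/0.25 thresholds are rules of thumb rather than hard limits):

```python
import math

def psi(expected, actual):
    """Population Stability Index between two category distributions.

    expected/actual: dicts mapping category -> proportion (each sums to ~1).
    A small epsilon avoids log(0) for categories missing on one side.
    """
    eps = 1e-6
    total = 0.0
    for cat in set(expected) | set(actual):
        e = max(expected.get(cat, 0.0), eps)
        a = max(actual.get(cat, 0.0), eps)
        total += (a - e) * math.log(a / e)
    return total
```

Identical distributions score 0; a PSI at or above roughly 0.25 is the conventional "significant drift" signal used in the drift-watch project below.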

Worked examples

Example 1 — Daily orders z-score monitor

Goal: Flag extreme spikes/drops in daily order_count.

WITH hist AS (
  SELECT
    date, order_count,
    AVG(order_count) OVER (ORDER BY date ROWS BETWEEN 28 PRECEDING AND 1 PRECEDING) AS mu,
    STDDEV_POP(order_count) OVER (ORDER BY date ROWS BETWEEN 28 PRECEDING AND 1 PRECEDING) AS sigma
  FROM daily_orders
)
SELECT date, order_count,
       ROUND((order_count - mu)/NULLIF(sigma,0), 2) AS z,
       CASE WHEN sigma > 0 AND ABS((order_count - mu)/sigma) >= 3 THEN 'ANOMALY' END AS flag
FROM hist
ORDER BY date DESC
LIMIT 14;

Interpretation: If z ≤ -3, investigate missing loads or upstream filters. If z ≥ 3, check for duplicate loads or currency/unit conversion errors.

Example 2 — Freshness SLA

Goal: Ensure table updated within 2 hours of schedule.

-- Note: date-part syntax varies by dialect (e.g. DATEDIFF('minute', ...) in Snowflake,
-- TIMESTAMPDIFF(MINUTE, ...) in MySQL); adjust for your warehouse.
SELECT table_name,
       TIMESTAMPDIFF('minute', last_updated_at, CURRENT_TIMESTAMP) AS minutes_since_update,
       CASE WHEN TIMESTAMPDIFF('minute', last_updated_at, CURRENT_TIMESTAMP) > 120 THEN 'STALE' END AS flag
FROM table_freshness_status;

Interpretation: A STALE flag triggers a pipeline check and a rollback/reload if needed.

Example 3 — Revenue percent-change with weekday baseline

Goal: Alert if today’s revenue differs by more than 35% vs the average of the same weekday over the last 8 weeks.

WITH w AS (
  SELECT *, EXTRACT(DOW FROM date) AS dow FROM daily_revenue
), ref AS (
  SELECT w1.date, w1.revenue,
         (SELECT AVG(w2.revenue) FROM w w2
           WHERE w2.dow = w1.dow AND w2.date BETWEEN w1.date - INTERVAL '56 day' AND w1.date - INTERVAL '7 day') AS dow_avg
  FROM w w1
)
SELECT date, revenue,
       ROUND(100.0 * (revenue - dow_avg)/NULLIF(dow_avg,0), 1) AS pct_diff,
       CASE WHEN ABS((revenue - dow_avg)/NULLIF(dow_avg,0)) >= 0.35 THEN 'ANOMALY' END AS flag
FROM ref
ORDER BY date DESC
LIMIT 8;

Interpretation: Reduces false alarms by respecting weekly seasonality.

How to implement safely (step-by-step)

  1. Choose metrics: counts, sums, rates, null_rate, distinct_rate, freshness minutes.
  2. Pick baselines: thresholds for SLAs; rolling mean/std or IQR for values; weekday averages for seasonal metrics.
  3. Build a monitoring table: one row per metric-date with baseline, bounds, and flag.
  4. Add context: include pipeline/job ids, source table, and last code deploy hash if available.
  5. Alert wisely: group anomalies, add severity (high when zero rows or stale), and include a "how to triage" runbook snippet.
  6. Review and tune: track precision/recall of alerts; adjust windows and thresholds to reduce noise.
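The monitoring table in step 3 materializes as one row per metric-date; a possible shape (field names are illustrative, not a standard schema):

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class MetricCheck:
    """One row per metric per day in the monitoring table."""
    metric_name: str          # e.g. 'daily_orders.order_count'
    check_date: date
    value: float
    baseline: float           # rolling mean, weekday average, or fixed threshold
    lower_bound: float
    upper_bound: float
    flag: Optional[str]       # None, 'ANOMALY', or 'STALE'
    severity: Optional[str]   # e.g. 'high' for zero rows or stale tables
    job_id: Optional[str] = None  # triage context from step 4
```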

Exercises

Try these and compare your results with the expected outputs.

Exercise 1 — Rolling z-score alert

Create SQL to compute a 21-day rolling mean and std for daily_sessions and flag anomalies when |z| ≥ 3. Assume a table daily_sessions(date, sessions). Output: date, daily_sessions, mu, sigma, z, flag.

Hints
  • Use a window frame: ROWS BETWEEN 21 PRECEDING AND 1 PRECEDING.
  • Guard division by zero with NULLIF(sigma, 0).

Expected output

Recent dates with a column flag = 'ANOMALY' for extreme spikes/drops.

Exercise 2 — IQR bounds for null_rate

For a daily null_rate_email metric (0..1), compute Q1, Q3, and IQR over the last 30 historical days (excluding today). Flag today as anomalous if it falls outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR].

Hints
  • Use APPROX_PERCENTILE or PERCENTILE_CONT if your warehouse supports it.
  • Exclude the current date from the reference window.

Expected output

A single row for today with columns: q1, q3, iqr, lower_bound, upper_bound, flag.

Exercise checklist

  • Window functions used correctly
  • Reference window excludes the current row
  • Division-by-zero guarded
  • Bounds and flags readable by non-technical teammates

Common mistakes and self-check

  • Too little history: Using 3–5 days leads to unstable baselines. Self-check: Do alerts swing wildly? Use ≥ 21–28 days where possible.
  • Ignoring seasonality: Comparing weekends to weekdays. Self-check: Plot by weekday averages and compare.
  • Alerting on noise: Using |z| ≥ 2 on highly volatile metrics. Self-check: Track false positives and tighten to 3 or use IQR.
  • No freshness guardrails: Value checks won’t help if the table didn’t load. Self-check: Always include a freshness SLA.
  • Single-point spikes only: Missing level shifts. Self-check: Require 2+ consecutive anomalies or use a rolling mean comparison.

Practical projects (portfolio-ready)

  1. Metric monitor mart: Build a table that tracks 5 metrics (volume, revenue, null_rate, dedupe_rate, freshness_min). Add rolling baselines and flags.
  2. Seasonal alerting: Implement weekday-aware revenue alerts and compare false positive counts vs non-seasonal alerts over one month.
  3. Drift watch: For a signup funnel, monitor country distribution weekly. Compute a simple PSI and alert when PSI ≥ 0.25.

Mini challenge

You have a daily active_users metric with strong weekend dips and occasional marketing spikes. Design a detection approach that balances sensitivity and noise. Specify:

  • Baseline choice (e.g., weekday-average last 8 weeks)
  • Alert rule (e.g., ±30% vs baseline OR |z| ≥ 3)
  • Noise control (e.g., require 2 consecutive days or severity tiers)

Suggested approach

Use weekday-average over 8–12 weeks; alert when |pct_diff| ≥ 30% or |z| ≥ 3; require 2 consecutive anomalies for low-severity, immediate alert for high-severity (drop > 60% or zero).
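The suggested approach can be sketched as a small decision rule (Python; the tier names and the 'watch' state for a first anomalous day are illustrative choices):

```python
def classify(today_pct_diff, yesterday_flagged):
    """Severity tiers for the active_users monitor.

    today_pct_diff: relative difference vs the weekday baseline (e.g. -0.7).
    yesterday_flagged: whether yesterday was already anomalous.
    """
    if today_pct_diff <= -0.60:
        return 'high'       # severe drop (or zero): alert immediately
    anomalous = abs(today_pct_diff) >= 0.30
    if anomalous and yesterday_flagged:
        return 'low'        # second consecutive anomaly: low-severity alert
    return 'watch' if anomalous else None
```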

Learning path

  • Start: Freshness and volume monitors for each core table.
  • Next: Rolling baselines (z-score/IQR) for key business metrics.
  • Then: Seasonality-aware comparisons (weekday or 7/28-day).
  • Advanced: Distribution drift for critical columns; control charts for processes.

Next steps

  • Implement 1 freshness and 2 value monitors today.
  • Set alert severities and add runbook steps to each alert.
  • Review alert precision weekly and tune thresholds/windows.


Anomaly Detection Basics — Quick Test

Test your knowledge with 8 questions. Pass with 70% or higher.

