Why this matters
Anomaly spotting is a core skill in exploratory analysis. You will use it to:
- Monitor product and business metrics (e.g., sudden drop in conversions).
- Catch data quality issues early (e.g., missing batches, duplicated events).
- Detect fraud or abuse (e.g., unusual spike in refund rate).
- Find operational problems (e.g., API latency surge after a deploy).
Done well, it saves time, prevents bad decisions, and guides targeted investigation.
Concept explained simply
An anomaly is a data point or pattern that deviates from what you would reasonably expect given the usual behavior and context.
Think of your data as a daily commute. Some days are a bit faster or slower (normal variation). A closed bridge is a clear anomaly. Context matters: a rainy Monday may be slower than a sunny Sunday.
Mental model
- Baseline: What is normal? Estimate it with statistics (median/IQR, mean/std) and context (season, weekday, campaign).
- Surprise score: How far is the new point from the baseline? Use Z-score, robust Z (median/MAD), or IQR fences.
- Context gates: Compare apples to apples (e.g., Mondays vs Mondays, region A vs region A).
- Confirm and explain: Visualize, segment, and rule out data issues before declaring a true anomaly.
Robust vs classical measures
Classical measures (mean and standard deviation) are sensitive to outliers. Robust measures (median and MAD, the Median Absolute Deviation, or the IQR, Interquartile Range) resist them. Prefer robust methods when you expect outliers or skewed data.
Common thresholds
- Z-score: |z| > 3 (classical), or |z_robust| > 3.5 (robust, median/MAD-based).
- IQR rule: below Q1 − 1.5×IQR or above Q3 + 1.5×IQR.
Thresholds are heuristics. Tune them to balance misses vs false alarms.
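Both rules are a few lines of code. Here is a minimal sketch in Python (the function names are my own, not a standard API) that computes robust Z-scores and IQR fences with NumPy:

```python
import numpy as np

def robust_z(x):
    """Robust Z-scores: 0.6745 scales MAD to match the std under normality."""
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med))  # caution: undefined if MAD == 0
    return 0.6745 * (x - med) / mad

def iqr_fences(x, k=1.5):
    """Tukey fences: points outside [Q1 - k*IQR, Q3 + k*IQR] are flagged."""
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr
```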
Types of anomalies
- Point anomaly: a single data point is extreme.
- Contextual anomaly: unusual relative to context (e.g., low sales for Black Friday).
- Collective anomaly: a sequence deviates (e.g., gradual drift or plateau).
- Change-point: the underlying level/variance changes (e.g., after a release).
- Data-quality anomaly: missing/duplicated data, schema change, delayed loads.
How to spot anomalies step-by-step
- Define the metric and grain: what are you measuring, at what frequency (hourly/daily), and for which segment (country, device)?
- Visualize: line plot or histogram. Look for sudden spikes/drops, flatlines, or variance changes.
- Pick a baseline: use recent history, same weekday, or the same season. For skewed data, start with median/IQR or MAD.
- Compute a surprise score: Z-score, robust Z, or IQR fences. For time series with seasonality, decompose or compare within the same context (e.g., Mondays); see the sketch after this list.
- Flag candidates: apply a threshold (e.g., |z| > 3, or outside IQR fences).
- Segment to confirm: split by channel, region, device. True anomalies often appear in some segments but not all.
- Rule out data issues: check missing data, duplicates, delayed ingestion, tracking changes, recent ETL changes.
- Document and act: write what happened, suspected cause, and follow-up checks. Share plots.
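To make the steps concrete, here is a hedged end-to-end sketch on synthetic data (the series, column names, and the injected spike are all hypothetical). It builds a per-weekday robust baseline so Mondays are compared with Mondays, then flags candidates:

```python
import numpy as np
import pandas as pd

# Synthetic daily metric with a weekday effect; replace with your own series.
rng = np.random.default_rng(0)
idx = pd.date_range("2024-01-01", periods=120, freq="D")
values = 100 + 10 * (idx.dayofweek < 5) + rng.normal(0, 3, len(idx))
s = pd.Series(values, index=idx)
s.iloc[-1] += 40  # injected spike for illustration

df = s.to_frame("value")
df["weekday"] = df.index.dayofweek

# Robust per-weekday baseline: compare like with like (context gate).
med = df.groupby("weekday")["value"].transform("median")
mad = df.groupby("weekday")["value"].transform(lambda v: (v - v.median()).abs().median())
df["robust_z"] = 0.6745 * (df["value"] - med) / mad

print(df[df["robust_z"].abs() > 3.5])  # candidates to confirm and segment
```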
Time series tip
Remove seasonality (e.g., 7-day rolling median or STL decomposition). Then apply anomaly rules to residuals.
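A minimal sketch of that tip, assuming a daily pandas Series s like the one in the previous snippet: subtract a centered 7-day rolling median, then score the residuals with the robust Z rule.

```python
# Remove weekly seasonality, then apply the anomaly rule to residuals.
baseline = s.rolling(7, center=True, min_periods=4).median()
resid = s - baseline
mad = (resid - resid.median()).abs().median()
rz = 0.6745 * (resid - resid.median()) / mad
print(rz[rz.abs() > 3.5])
```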
Worked examples
Example 1: Daily orders spike (robust Z with MAD)
Data (orders over 14 days): 98, 102, 101, 99, 100, 97, 250, 103, 96, 102, 99, 101, 98, 20
- Median = 99.5
- MAD = 2.0 (median of absolute deviations from the median)
- Robust Z = 0.6745 × (x − 99.5) / 2.0
- Flags: Day 7 (250, robust Z ≈ +50.8) and Day 14 (20, robust Z ≈ −26.8) have |robust Z| ≫ 3.5 → anomalies.
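You can check these numbers with a few lines of NumPy (a verification sketch, not part of the worked solution itself):

```python
import numpy as np

orders = np.array([98, 102, 101, 99, 100, 97, 250, 103, 96, 102, 99, 101, 98, 20])
med = np.median(orders)                      # 99.5
mad = np.median(np.abs(orders - med))        # 2.0
rz = 0.6745 * (orders - med) / mad
print(np.flatnonzero(np.abs(rz) > 3.5) + 1)  # [ 7 14] -> days 7 and 14
```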
Example 2: Conversion rate drop (contextual)
Weekday conversion rate baseline (past 8 Mondays): mean 3.2%, std 0.2%. Today Monday is 2.5%.
- Z = (2.5 − 3.2) / 0.2 = −3.5 → candidate anomaly.
- Segment by traffic source: drop concentrated in Paid Search. Check ad changes and landing page.
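The same arithmetic in code (values taken from the example above):

```python
baseline_mean, baseline_std = 3.2, 0.2   # past 8 Mondays, in %
today = 2.5
z = (today - baseline_mean) / baseline_std
print(z)  # -3.5 -> candidate anomaly; next, segment by traffic source
```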
Example 3: IQR for session duration
Sample durations (min): 1.2, 1.3, 1.4, 1.5, 1.6, 8.0
- Median-of-halves quartiles: Q1 = 1.3 (median of 1.2, 1.3, 1.4), Q3 = 1.6 (median of 1.5, 1.6, 8.0) → IQR = 0.3.
- Upper fence = 1.6 + 1.5×0.3 = 2.05 → 8.0 is an outlier. Exact quartiles vary by method, but the conclusion holds either way.
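A quick sketch of the same computation (median-of-halves is one of several quartile conventions; NumPy's default interpolation gives slightly different fences):

```python
import numpy as np

durations = [1.2, 1.3, 1.4, 1.5, 1.6, 8.0]
q1 = np.median(durations[:3])        # 1.3  (lower half)
q3 = np.median(durations[3:])        # 1.6  (upper half)
iqr = q3 - q1                        # 0.3
fence_hi = q3 + 1.5 * iqr            # 2.05
print([d for d in durations if d > fence_hi])  # [8.0]
```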
Why robust methods shine
One extreme point can inflate mean and std, hiding true anomalies. Median/MAD and IQR resist this influence.
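A small demonstration of that masking effect (illustrative numbers): the spike inflates the std so much that its own classical Z-score stays below 3, while the robust rule flags it immediately.

```python
import numpy as np

x = np.array([10, 11, 9, 10, 12, 10, 11, 200])  # one extreme point
z = (x - x.mean()) / x.std()
print(round(z[-1], 2))   # ~2.65 -> the spike slips under |z| > 3

med = np.median(x)
mad = np.median(np.abs(x - med))                # 0.5, unaffected by the spike
rz = 0.6745 * (x - med) / mad
print(round(rz[-1], 1))  # ~255.6 -> clearly flagged by |rz| > 3.5
```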
Practical checklist
- I plotted the series and noted context (weekday, season, release, campaign).
- I chose a baseline that matches context (same weekday/segment).
- I used robust stats (median/MAD or IQR) when data looked skewed.
- I set a clear threshold and kept it consistent for the analysis.
- I segmented to validate the anomaly and reduce false positives.
- I checked for data-quality issues before root cause analysis.
- I documented findings and next steps with plots.
Exercises
Exercise 1 — Flag anomalies with median and MAD
Dataset (daily orders over 14 days):
98, 102, 101, 99, 100, 97, 250, 103, 96, 102, 99, 101, 98, 20
- Compute the median and MAD (Median Absolute Deviation) for the full series.
- Compute robust Z for each point: rz = 0.6745 × (x − median) / MAD.
- Flag anomalies where |rz| > 3.5.
Need a nudge?
- Median is the middle of the sorted list; with even N, average the two middle values.
- MAD is the median of the absolute deviations from the median.
Common mistakes and self-check
- Using global mean/std on seasonal data: leads to false positives. Self-check: compare within same weekday/season.
- Declaring anomalies without segmentation: you may miss the real source. Self-check: slice by channel/region/device.
- Ignoring data issues: a late ETL can look like a drop. Self-check: verify counts, nulls, pipeline logs, and recent schema changes.
- Threshold hopping: changing thresholds until you get the answer you want. Self-check: predefine and justify thresholds.
- Overfitting the window: using too small a baseline window. Self-check: test stability across adjacent windows.
Self-audit mini list
- Did I choose the right context for baseline?
- Did I visualize raw and residual (de-seasonalized) series?
- Did I verify data completeness and timeliness?
Mini challenge
You monitor three metrics: daily signups, activation rate, and support tickets. Signups are normal, activation rate drops by 20% on mobile iOS only, and support tickets spike for “payment fail”. Outline a 5-step plan to confirm the anomalies and find the cause using segmentation and robust baselines. Write down your steps and checks.
Who this is for
- Data analysts who explore and monitor metrics.
- Anyone owning dashboards, alerts, or experiment monitoring.
Prerequisites
- Comfort with basic statistics (mean, median, variance, percentiles).
- Ability to plot time series and distributions.
- Basic spreadsheet or Python/R skills for simple calculations.
Learning path
- Descriptive stats refresh (center, spread, percentiles).
- Visual EDA for time series and distributions.
- Robust anomaly rules (IQR, MAD-based Z).
- Contextual analysis (seasonality, segmentation).
- Root cause routines and documentation.
Practical projects
- Build a weekly anomaly review: pick 3 KPIs, define baselines, apply robust detection, summarize findings.
- Create a one-pager playbook: checklist, thresholds, and data quality checks your team can reuse.
- Simulate anomalies by injecting spikes/drops into sample data and verify your method catches them (a starter sketch follows).
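For the last project, a hypothetical starter sketch: inject a known spike into synthetic data and assert that the robust rule recovers it.

```python
import numpy as np

rng = np.random.default_rng(42)
series = rng.normal(100, 5, 90)      # synthetic daily metric
series[60] += 50                     # injected anomaly at day 61

med = np.median(series)
mad = np.median(np.abs(series - med))
rz = 0.6745 * (series - med) / mad
assert 60 in np.flatnonzero(np.abs(rz) > 3.5), "detector missed the injected spike"
print("injected spike detected")
```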
Next steps
- Apply robust baselines to your top KPI and set a consistent threshold.
- Introduce one segmentation cut to validate anomalies (e.g., device type).
- Design a simple anomalies log with date, metric, method, threshold, segments, and outcome.