Why this matters
Anomaly spotting is a core skill in exploratory analysis. You will use it to:
- Monitor product and business metrics (e.g., sudden drop in conversions).
- Catch data quality issues early (e.g., missing batches, duplicated events).
- Detect fraud or abuse (e.g., unusual spike in refund rate).
- Find operational problems (e.g., API latency surge after a deploy).
Done well, it saves time, prevents bad decisions, and guides targeted investigation.
Concept explained simply
An anomaly is a data point or pattern that deviates from what you would reasonably expect given the usual behavior and context.
Think of your data as a daily commute. Some days are a bit faster or slower (normal variation). A closed bridge is a clear anomaly. Context matters: a rainy Monday may be slower than a sunny Sunday.
Mental model
- Baseline: What is normal? Estimate it with statistics (median/IQR, mean/std) and context (season, weekday, campaign).
- Surprise score: How far is the new point from the baseline? Use Z-score, robust Z (median/MAD), or IQR fences.
- Context gates: Compare apples to apples (e.g., Mondays vs Mondays, region A vs region A).
- Confirm and explain: Visualize, segment, and rule out data issues before declaring a true anomaly.
Robust vs classical measures
Classical measures (mean and standard deviation) are sensitive to outliers. Robust measures (median and MAD, the Median Absolute Deviation, or the IQR, Interquartile Range) resist them. Prefer robust methods when you expect outliers or skewed data.
Common thresholds
- Z-score: |z| > 3 (classical), or |z_robust| > 3.5 (robust, median/MAD-based).
- IQR rule: below Q1 − 1.5×IQR or above Q3 + 1.5×IQR.
Thresholds are heuristics. Tune them to balance misses vs false alarms.
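Both rules are a few lines of code. Here is a minimal sketch in Python (the function names are my own, not a standard API) that computes robust Z-scores and IQR fences with NumPy:

```python
import numpy as np

def robust_z(x):
    """Robust Z-scores: 0.6745 scales MAD to match the std under normality."""
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med))  # caution: undefined if MAD == 0
    return 0.6745 * (x - med) / mad

def iqr_fences(x, k=1.5):
    """Tukey fences: points outside [Q1 - k*IQR, Q3 + k*IQR] are flagged."""
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr
```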
Types of anomalies
- Point anomaly: a single data point is extreme.
- Contextual anomaly: unusual relative to context (e.g., low sales for Black Friday).
- Collective anomaly: a sequence deviates (e.g., gradual drift or plateau).
- Change-point: the underlying level/variance changes (e.g., after a release).
- Data-quality anomaly: missing/duplicated data, schema change, delayed loads.
How to spot anomalies step-by-step
- Define the metric and grain: what are you measuring, at what frequency (hourly/daily), and for which segment (country, device)?
- Visualize: line plot or histogram. Look for sudden spikes/drops, flatlines, or variance changes.
- Pick a baseline: use recent history, same weekday, or the same season. For skewed data, start with median/IQR or MAD.
- Compute a surprise score: Z-score, robust Z, or IQR fences. For time series with seasonality, decompose or compare within the same context (e.g., Mondays); see the sketch after this list.
- Flag candidates: apply a threshold (e.g., |z| > 3, or outside IQR fences).
- Segment to confirm: split by channel, region, device. True anomalies often appear in some segments but not all.
- Rule out data issues: check missing data, duplicates, delayed ingestion, tracking changes, recent ETL changes.
- Document and act: write what happened, suspected cause, and follow-up checks. Share plots.
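To make the steps concrete, here is a hedged end-to-end sketch on synthetic data (the series, column names, and the injected spike are all hypothetical). It builds a per-weekday robust baseline so Mondays are compared with Mondays, then flags candidates:

```python
import numpy as np
import pandas as pd

# Synthetic daily metric with a weekday effect; replace with your own series.
rng = np.random.default_rng(0)
idx = pd.date_range("2024-01-01", periods=120, freq="D")
values = 100 + 10 * (idx.dayofweek < 5) + rng.normal(0, 3, len(idx))
s = pd.Series(values, index=idx)
s.iloc[-1] += 40  # injected spike for illustration

df = s.to_frame("value")
df["weekday"] = df.index.dayofweek

# Robust per-weekday baseline: compare like with like (context gate).
med = df.groupby("weekday")["value"].transform("median")
mad = df.groupby("weekday")["value"].transform(lambda v: (v - v.median()).abs().median())
df["robust_z"] = 0.6745 * (df["value"] - med) / mad

print(df[df["robust_z"].abs() > 3.5])  # candidates to confirm and segment
```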
Time series tip
Remove seasonality (e.g., 7-day rolling median or STL decomposition). Then apply anomaly rules to residuals.
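A minimal sketch of that tip, assuming a daily pandas Series s like the one in the previous snippet: subtract a centered 7-day rolling median, then score the residuals with the robust Z rule.

```python
# Remove weekly seasonality, then apply the anomaly rule to residuals.
baseline = s.rolling(7, center=True, min_periods=4).median()
resid = s - baseline
mad = (resid - resid.median()).abs().median()
rz = 0.6745 * (resid - resid.median()) / mad
print(rz[rz.abs() > 3.5])
```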
Worked examples
Example 1: Daily orders spike (robust Z with MAD)
Data (orders over 14 days): 98, 102, 101, 99, 100, 97, 250, 103, 96, 102, 99, 101, 98, 20
- Median = 99.5
- MAD = 2.0 (median of absolute deviations from the median)
- Robust Z = 0.6745 × (x − 99.5) / 2.0
- Flags: Day 7 (250, robust Z ≈ +50.8) and Day 14 (20, robust Z ≈ −26.8) have |robust Z| ≫ 3.5 → anomalies.
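You can check these numbers with a few lines of NumPy (a verification sketch, not part of the worked solution itself):

```python
import numpy as np

orders = np.array([98, 102, 101, 99, 100, 97, 250, 103, 96, 102, 99, 101, 98, 20])
med = np.median(orders)                      # 99.5
mad = np.median(np.abs(orders - med))        # 2.0
rz = 0.6745 * (orders - med) / mad
print(np.flatnonzero(np.abs(rz) > 3.5) + 1)  # [ 7 14] -> days 7 and 14
```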
Example 2: Conversion rate drop (contextual)
Weekday conversion rate baseline (past 8 Mondays): mean 3.2%, std 0.2%. Today Monday is 2.5%.
- Z = (2.5 − 3.2) / 0.2 = −3.5 → candidate anomaly.
- Segment by traffic source: drop concentrated in Paid Search. Check ad changes and landing page.
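The same arithmetic in code (values taken from the example above):

```python
baseline_mean, baseline_std = 3.2, 0.2   # past 8 Mondays, in %
today = 2.5
z = (today - baseline_mean) / baseline_std
print(z)  # -3.5 -> candidate anomaly; next, segment by traffic source
```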
Example 3: IQR for session duration
Sample durations (min): 1.2, 1.3, 1.4, 1.5, 1.6, 8.0
- Median-of-halves quartiles: Q1 = 1.3 (median of 1.2, 1.3, 1.4), Q3 = 1.6 (median of 1.5, 1.6, 8.0) → IQR = 0.3.
- Upper fence = 1.6 + 1.5×0.3 = 2.05 → 8.0 is an outlier. Exact quartiles vary by method, but the conclusion holds either way.
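A quick sketch of the same computation (median-of-halves is one of several quartile conventions; NumPy's default interpolation gives slightly different fences):

```python
import numpy as np

durations = [1.2, 1.3, 1.4, 1.5, 1.6, 8.0]
q1 = np.median(durations[:3])        # 1.3  (lower half)
q3 = np.median(durations[3:])        # 1.6  (upper half)
iqr = q3 - q1                        # 0.3
fence_hi = q3 + 1.5 * iqr            # 2.05
print([d for d in durations if d > fence_hi])  # [8.0]
```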
Why robust methods shine
One extreme point can inflate mean and std, hiding true anomalies. Median/MAD and IQR resist this influence.
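A small demonstration of that masking effect (illustrative numbers): the spike inflates the std so much that its own classical Z-score stays below 3, while the robust rule flags it immediately.

```python
import numpy as np

x = np.array([10, 11, 9, 10, 12, 10, 11, 200])  # one extreme point
z = (x - x.mean()) / x.std()
print(round(z[-1], 2))   # ~2.65 -> the spike slips under |z| > 3

med = np.median(x)
mad = np.median(np.abs(x - med))                # 0.5, unaffected by the spike
rz = 0.6745 * (x - med) / mad
print(round(rz[-1], 1))  # ~255.6 -> clearly flagged by |rz| > 3.5
```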
Practical checklist
- I plotted the series and noted context (weekday, season, release, campaign).
- I chose a baseline that matches context (same weekday/segment).
- I used robust stats (median/MAD or IQR) when data looked skewed.
- I set a clear threshold and kept it consistent for the analysis.
- I segmented to validate the anomaly and reduce false positives.
- I checked for data-quality issues before root cause analysis.
- I documented findings and next steps with plots.
Exercises
Exercise 1 — Flag anomalies with median and MAD
Dataset (daily orders over 14 days):
98, 102, 101, 99, 100, 97, 250, 103, 96, 102, 99, 101, 98, 20
- Compute the median and MAD (Median Absolute Deviation) for the full series.
- Compute robust Z for each point: rz = 0.6745 × (x − median) / MAD.
- Flag anomalies where |rz| > 3.5.
Need a nudge?
- Median is the middle of the sorted list; with even N, average the two middle values.
- MAD is the median of the absolute deviations from the median.
Common mistakes and self-check
- Using global mean/std on seasonal data: leads to false positives. Self-check: compare within same weekday/season.
- Declaring anomalies without segmentation: you may miss the real source. Self-check: slice by channel/region/device.
- Ignoring data issues: a late ETL can look like a drop. Self-check: verify counts, nulls, pipeline logs, and recent schema changes.
- Threshold hopping: changing thresholds until you get the answer you want. Self-check: predefine and justify thresholds.
- Overfitting the window: using too small a baseline window. Self-check: test stability across adjacent windows.
Self-audit mini list
- Did I choose the right context for baseline?
- Did I visualize raw and residual (de-seasonalized) series?
- Did I verify data completeness and timeliness?
Mini challenge
You monitor three metrics: daily signups, activation rate, and support tickets. Signups are normal, activation rate drops by 20% on mobile iOS only, and support tickets spike for “payment fail”. Outline a 5-step plan to confirm the anomalies and find the cause using segmentation and robust baselines. Write down your steps and checks.
Who this is for
- Data analysts who explore and monitor metrics.
- Anyone owning dashboards, alerts, or experiment monitoring.
Prerequisites
- Comfort with basic statistics (mean, median, variance, percentiles).
- Ability to plot time series and distributions.
- Basic spreadsheet or Python/R skills for simple calculations.
Learning path
- Descriptive stats refresh (center, spread, percentiles).
- Visual EDA for time series and distributions.
- Robust anomaly rules (IQR, MAD-based Z).
- Contextual analysis (seasonality, segmentation).
- Root cause routines and documentation.
Practical projects
- Build a weekly anomaly review: pick 3 KPIs, define baselines, apply robust detection, summarize findings.
- Create a one-pager playbook: checklist, thresholds, and data quality checks your team can reuse.
- Simulate anomalies by injecting spikes/drops into sample data and verify your method catches them (a starter sketch follows).
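For the last project, a hypothetical starter sketch: inject a known spike into synthetic data and assert that the robust rule recovers it.

```python
import numpy as np

rng = np.random.default_rng(42)
series = rng.normal(100, 5, 90)      # synthetic daily metric
series[60] += 50                     # injected anomaly at day 61

med = np.median(series)
mad = np.median(np.abs(series - med))
rz = 0.6745 * (series - med) / mad
assert 60 in np.flatnonzero(np.abs(rz) > 3.5), "detector missed the injected spike"
print("injected spike detected")
```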
Next steps
- Apply robust baselines to your top KPI and set a consistent threshold.
- Introduce one segmentation cut to validate anomalies (e.g., device type).
- Design a simple anomalies log with date, metric, method, threshold, segments, and outcome.