
Data Validation During Test

Learn Data Validation During Test for free with explanations, exercises, and a quick test (for Product Analysts).

Published: December 22, 2025 | Updated: December 22, 2025

Who this is for

Product Analysts and experiment owners who run A/B tests and need to ensure the data they see during a running test is trustworthy before making decisions.

Prerequisites

  • Basic A/B testing concepts (variants, randomization, metrics)
  • Familiarity with your product’s key events (e.g., page_view, add_to_cart, purchase)
  • Ability to read dashboards or write simple queries

Why this matters

During a live test, data can drift or break: traffic may split incorrectly, events may double-fire or stop firing, bots may inflate metrics, or guardrails (like error rate) may spike. Catching issues early prevents wasted time and wrong decisions.

Typical on-the-job tasks:

  • Daily SRM (Sample Ratio Mismatch) checks on exposure counts
  • Monitoring guardrail metrics (latency, errors, uninstalls, refund rate)
  • Comparing client vs server events to detect loss or duplication
  • Ensuring assignment is sticky (users don’t hop between variants)
  • Pausing or continuing tests based on validation outcomes

Concept explained simply

Data validation during test is a set of quick, repeatable checks that confirm the experiment is running as designed and the data is reliable. Think of it as a preflight checklist you repeat daily until landing.

Mental model

Use the 4C mental model:

  • Count: Are exposure counts and key events in expected ranges? (SRM check)
  • Consistency: Are assignment and metrics consistent across platforms, segments, and days?
  • Continuity: Are tracking schemas unchanged mid-test? Any releases affecting events?
  • Control: Are guardrails under control (no harm to stability or user experience)?

What to validate during a running test

  • Randomization health: Run SRM checks on exposure counts (e.g., 50/50 split). Use a chi-square test to detect mismatch.
  • Assignment stickiness: Users should remain in their assigned variant (check by user_id across sessions/devices); a minimal check is sketched after this list.
  • Event integrity: Watch for sudden drops/spikes in key events, duplicated events, or schema changes.
  • Metric sanity: Compare variant baselines to recent history; big early swings often signal tracking issues.
  • Guardrails: Error rate, latency, crash rate, unsubscribe/refund rate should not exceed safe thresholds.
  • Traffic quality: Filter bots and internal traffic; review sudden changes in geography, device, or referrer mix.
  • Data freshness and late events: Confirm update cadence and whether late-arriving data is backfilled consistently.
  • Release coordination: Note any app/web releases during the test that may affect logging.
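
One way to spot-check stickiness is to count how many distinct variants each user has been exposed to. A minimal sketch in Python, assuming an exposure log with user_id and variant columns (the column names and sample rows are illustrative, not your actual schema):

```python
import pandas as pd

# Illustrative exposure log: one row per exposure event.
# Column names (user_id, variant) are assumptions about your logging schema.
exposures = pd.DataFrame({
    "user_id": ["u1", "u1", "u2", "u3", "u3"],
    "variant": ["control", "control", "treatment", "control", "treatment"],
})

# Sticky assignment means each user sees exactly one variant.
variants_per_user = exposures.groupby("user_id")["variant"].nunique()
leaky_users = variants_per_user[variants_per_user > 1]

share_leaky = len(leaky_users) / exposures["user_id"].nunique()
print(f"Users exposed to more than one variant: {len(leaky_users)} ({share_leaky:.1%})")
```

A noticeable share of users appearing in more than one variant usually points to session- or device-level bucketing rather than user-level assignment.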

How to run a fast SRM check (chi-square)

  1. Find expected counts per variant (e.g., 50/50 of total exposures).
  2. Use chi-square: sum((observed - expected)^2 / expected) across variants.
  3. With 1 degree of freedom (two variants), a statistic above ~3.84 indicates p < 0.05, i.e., SRM.

If SRM is detected, pause interpretation and investigate randomization, allocation, and filtering.
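
A minimal sketch of this check in Python, using scipy's chi-square goodness-of-fit test (the helper name srm_check and the counts below are illustrative, not part of any standard tooling):

```python
from scipy.stats import chisquare

def srm_check(observed_counts, planned_split, alpha=0.05):
    """Return (chi2, p_value, srm_flag) for an exposure-count SRM check."""
    total = sum(observed_counts)
    expected = [total * share for share in planned_split]
    chi2, p_value = chisquare(f_obs=observed_counts, f_exp=expected)
    return chi2, p_value, p_value < alpha

# Example: planned 50/50 split with slightly uneven observed exposures.
chi2, p, srm = srm_check([100_500, 99_500], [0.5, 0.5])
print(f"chi2={chi2:.2f}, p={p:.4f}, SRM={'yes' if srm else 'no'}")
```

Because scipy reports the p-value directly, you do not need to compare against the 3.84 critical value by hand.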

Worked examples

Example 1: SRM reveals a traffic split issue

Planned split 50/50. Observed after a day: Control 98,400, Variant 101,600 (total 200,000).

  1. Expected: 100,000 each.
  2. Chi-square: ((98,400-100,000)^2/100,000) + ((101,600-100,000)^2/100,000) = 51.2.
  3. 51.2 ≫ 3.84 → SRM detected.

Action: Investigate allocation rules (e.g., geo filter applied to only one variant), assignment ID source, bot filtering asymmetry, or a rollout flag overriding traffic routing. Pause decision-making until fixed.
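
As a quick arithmetic check, the same statistic can be computed in a few lines (scipy assumed available):

```python
from scipy.stats import chisquare

observed = [98_400, 101_600]
expected = [100_000, 100_000]  # 50/50 of 200,000 total exposures
chi2, p_value = chisquare(f_obs=observed, f_exp=expected)
print(f"chi2={chi2:.1f}, p={p_value:.1e}")  # chi2 = 51.2; p is far below 0.05, so SRM is flagged
```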

Example 2: Event loss isolated to one variant

Symptoms: Page views steady across variants, but add_to_cart down 40% in Variant B only. Purchases (server-side) are stable.

Interpretation: Likely client-side event loss in Variant B (front-end change affecting logging).

Action: Compare client vs server ratios, replay a QA session forced into Variant B, check release notes for the variant’s UI changes. If confirmed, hotfix or pause test; backfill if possible.
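
A minimal sketch of the client-vs-server comparison, assuming hypothetical daily counts per variant (the table below stands in for whatever your warehouse query returns):

```python
import pandas as pd

# Illustrative daily event counts; in practice these come from your event pipeline.
counts = pd.DataFrame({
    "variant": ["A", "B"],
    "client_add_to_cart": [12_400, 7_800],
    "server_checkout_started": [6_100, 6_000],
})

# A sharp divergence in the ratio for one variant only suggests client-side
# event loss rather than a real behavior change.
counts["client_server_ratio"] = (
    counts["client_add_to_cart"] / counts["server_checkout_started"]
)
print(counts[["variant", "client_server_ratio"]])
```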

Example 3: Guardrail breach despite positive conversion

Variant shows +2% conversion but API error rate doubled.

Action: Guardrails protect the system and users. Pause or roll back, then analyze error spikes by endpoint and time. A winning conversion with harmful side effects is not a win.
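
If you want to make the pause/continue call systematic, a tiny guardrail check can compare observed values against agreed thresholds (all numbers below are purely illustrative):

```python
# Hypothetical guardrail readings for the variant: (observed, maximum allowed).
guardrails = {
    "api_error_rate": (0.024, 0.015),
    "p95_latency_ms": (480, 600),
    "crash_rate": (0.002, 0.003),
}

breaches = {name: obs for name, (obs, limit) in guardrails.items() if obs > limit}
if breaches:
    print(f"Guardrail breach, consider pausing or rolling back: {breaches}")
else:
    print("All guardrails within thresholds.")
```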

Hands-on exercises

Do these now. They mirror the graded Quick Test.

Exercise 1 (ex1): SRM check

You planned 50/50. After 24h: Control 120,900; Variant 129,100. Decide if SRM is present and if you should pause interpretation.

Tip: Compute expected counts and use chi-square with 1 degree of freedom.

Exercise 2 (ex2): Diagnose event anomalies

Variant A vs B: page_view similar; product_view similar; add_to_cart −35% only in B; checkout_started similar; purchases (server-side) similar; average latency unchanged; no release after test start. What is the most likely cause and your next step?

Validation checklist during test

  • Exposure SRM check passes (p ≥ 0.05)
  • Assignment is sticky across sessions/devices
  • Client vs server event ratios stable
  • No unexplained spikes/drops in key events
  • Guardrails within thresholds (errors, latency, crashes)
  • Traffic composition stable (geo, device, referrer)
  • No mid-test tracking schema changes
  • Data freshness matches expectations; late events backfilled

How to self-check your validation workflow

  • Write a short daily note: “SRM: pass/fail; Guardrails: pass/fail; Notes: …”
  • If a check fails twice consecutively, escalate and consider pausing the test.
  • Label dashboards with the exact metric definitions to avoid confusion.

Common mistakes and how to self-check

  • Ignoring early SRM because “it will normalize” → If chi-square flags SRM at meaningful sample sizes, investigate now.
  • Mixing user- and session-level assignment → Users switch variants across devices; enforce user-level bucketing when possible.
  • Trusting one data source → Cross-validate client vs server or multiple pipelines.
  • Letting releases alter logging mid-test → Freeze event schemas or document changes and adjust analysis windows.
  • Overreacting to day 1 noise → Small samples fluctuate. Validate with SRM and integrity checks before acting on performance.

Practical projects

  • Build an “Experiment Health” dashboard: SRM tile, assignment stickiness tile, guardrail trends, client vs server ratios.
  • Create an SRM calculator sheet: inputs (counts, expected split) → result (chi-square, p-value, decision).
  • Write a one-page Experiment Validation SOP used daily during any test.
  • Design a synthetic event test: fire controlled events pre-production to detect loss/duplication.

Learning path

  1. Instrument events correctly (naming, properties, IDs)
  2. Experiment design (randomization, units of analysis, guardrails)
  3. Data validation during test (this lesson)
  4. Post-test analysis (significance, lift, heterogeneity)
  5. Experiment reporting and decisioning

Next steps

  • Automate your SRM and guardrail checks to run daily
  • Document thresholds for pause/continue decisions
  • Practice with historical experiments to sharpen your pattern recognition

Mini challenge

You see no SRM, but client-side add_to_cart is down 25% and server-side purchases are flat. In two sentences, state your hypothesis and the one validation step you’ll do today.


Practice Exercises

2 exercises to complete

Instructions

Planned split: 50/50. After 24 hours: Control = 120,900 exposures; Variant = 129,100 exposures.

  1. Compute expected counts per variant.
  2. Calculate chi-square with 1 degree of freedom.
  3. Decide whether SRM is present (use 0.05 as the threshold) and whether to pause interpretation.

Expected Output

SRM detected; pause interpretation and investigate allocation/assignment.
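
To verify the expected output, the same chi-square arithmetic can be run in a few lines (scipy assumed available):

```python
from scipy.stats import chisquare

observed = [120_900, 129_100]
expected = [125_000, 125_000]  # 50/50 of 250,000 total exposures
chi2, p_value = chisquare(f_obs=observed, f_exp=expected)
print(f"chi2={chi2:.1f}, p={p_value:.1e}")  # chi2 ≈ 269, far above 3.84 -> SRM detected
```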

Data Validation During Test — Quick Test

Test your knowledge with 7 questions. Pass with 70% or higher.

