Why Experiment Design matters for Product Analysts
Experiment design turns product ideas into measurable learning. As a Product Analyst, you help teams decide what to test, how to measure it, and when to trust the results. Good design avoids false wins, reduces risk, and accelerates product decisions.
- Translate business problems into testable hypotheses.
- Choose primary and guardrail metrics that reflect real value.
- Define the right population and randomization unit to avoid bias.
- Estimate sample size, MDE, and duration to plan timelines.
- Write a pre-experiment analysis plan to prevent p-hacking and scope creep.
Typical analyst responsibilities unlocked by this skill
- Partner with PMs to frame experiments aligned to business goals.
- Design trustworthy A/B tests and holdouts.
- Set success criteria, guardrails, and stop/go rules before launch.
- Run data quality checks (SRM, randomization checks) and interpret results.
Who this is for
- Product Analysts and Data Analysts moving into experimentation.
- PMs and Growth practitioners who want to design reliable A/B tests.
Prerequisites
- Comfort with basic statistics (proportions, means, confidence intervals).
- Basic SQL to define populations and compute metrics.
- Familiarity with your product's key metrics and data sources.
Practical roadmap
1) Frame the business problem
Clarify the objective, constraints, and risks. Translate into a measurable question.
- Outcome: a one-sentence problem statement and decision you will inform.
2) Define hypothesis and success criteria
Write a directional hypothesis and choose primary/guardrail metrics with target direction.
- Outcome: H1/H0, success threshold, and guardrails defined.
3) Choose population and randomization unit
Specify who is eligible and how you will randomize (user, session, account, geo).
- Outcome: eligibility query and assignment plan that avoids spillovers.
4) Plan sample size, MDE, and duration
Estimate the minimum detectable effect and duration using baseline rates and traffic.
- Outcome: required sample per variant and projected end date.
5) Write the pre-experiment analysis plan
Lock down all choices: metrics, windows, segments, checks, stop rules, and how you will handle missing data.
- Outcome: shared plan approved by stakeholders before launch.
6) Launch checks and monitoring
Verify sample ratio, randomization, and metric logging early. Monitor guardrails.
- Outcome: early detection of SRM, logging gaps, or feature leakage.
Milestone checkpoints
- You can state a testable hypothesis in one sentence.
- You can compute sample size from baseline and MDE.
- You can explain why your randomization unit is safe from contamination.
- You can show a draft analysis plan that a PM can sign off.
Worked examples
1) From business question to testable hypothesis
Scenario: The team wants to add a one-click reorder button on the order history page to increase repeat purchases this month.
- Business goal: Raise 30-day repeat purchase rate.
- Primary metric: 30-day repeat purchase rate per eligible user.
- Guardrails: Support ticket rate, checkout error rate, average delivery time.
- Hypothesis: Users with the reorder button will have a higher 30-day repeat purchase rate than control.
- Success: a statistically significant relative lift of +5% or more, with no guardrail degradation.
2) Choosing the randomization unit
The feature is user-facing and persists across sessions, so the unit of analysis is the user. Splitting by session risks contamination because a returning user could see both variants.
- Randomize at the user level with sticky assignment.
- If multiple users share an account and influence each other, randomize at the account level (or higher) so interacting users land in the same variant.
3) Sample size and MDE (binary outcome)
Baseline repeat rate p1 = 0.20. MDE = +10% relative (to p2 = 0.22). Two-sided test, alpha = 0.05, power = 0.80.
Quick approximation
Using the standard two-proportion formula, n per variant ≈ (z_alpha/2 + z_power)^2 * (p1(1-p1) + p2(1-p2)) / (p2 - p1)^2, with z_alpha/2 = 1.96 and z_power = 0.84 here.
Input: p1 = 0.20, p2 = 0.22, alpha = 0.05, power = 0.80
Output: ~6,500 users per variant (approximate; the exact figure depends on the method).
Note: this powers the test for a +10% relative lift; reliably detecting the +5% success threshold from example 1 would roughly quadruple the required sample.
Rule of thumb: smaller MDE -> larger sample; lower baseline -> larger sample.
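A minimal sketch of this calculation in Python (scipy assumed available; the helper name is ours):
from math import ceil
from scipy.stats import norm

def n_per_variant(p1, p2, alpha=0.05, power=0.80):
    # Normal-approximation sample size per variant for a two-sided
    # two-proportion z-test with unpooled variance.
    z_alpha = norm.ppf(1 - alpha / 2)  # 1.96 for alpha = 0.05
    z_power = norm.ppf(power)          # 0.84 for power = 0.80
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_power) ** 2 * variance / (p2 - p1) ** 2)

print(n_per_variant(0.20, 0.22))  # -> 6507, i.e. ~6,500 per variant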
4) Duration planning
Daily eligible traffic: ~10,000 users with a 50/50 split. Required per variant: ~6,500, so ~13,000 in total.
- Days to enroll ≈ 13,000 / 10,000 ≈ 1.3 → still run at least one full week to cover day-of-week effects, and remember the 30-day outcome window pushes the final readout 30 days past the last enrollment.
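The same arithmetic as a small sketch (numbers carried over from example 3):
from math import ceil

per_variant = 6507         # from the power calculation in example 3
daily_eligible = 10_000    # eligible users per day, split 50/50
days_to_enroll = ceil(2 * per_variant / daily_eligible)  # -> 2
run_days = max(7, 7 * ceil(days_to_enroll / 7))          # round up to full weeks
readout_day = run_days + 30  # 30-day outcome window after the last enrollment
print(days_to_enroll, run_days, readout_day)             # 2 7 37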
5) Guardrail validation plan
Define thresholds and monitoring for each guardrail.
- Support ticket rate: must not increase by more than 2% relative.
- Checkout error rate: must not increase; alert if a one-sided test for harm gives p < 0.10 (see the sketch after this list).
- Delivery time: no statistically significant increase in the mean.
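As one illustration of the checkout-error guardrail, a one-sided two-proportion test for harm (statsmodels assumed available; the counts are invented):
from statsmodels.stats.proportion import proportions_ztest

errors = [420, 505]           # hypothetical checkout errors: [control, treatment]
checkouts = [50_000, 50_000]  # checkouts per variant
# alternative='smaller' tests H1: control error rate < treatment error rate,
# i.e. the treatment increased errors (harm).
stat, p_value = proportions_ztest(errors, checkouts, alternative='smaller')
if p_value < 0.10:
    print(f"Guardrail alert: possible harm (p = {p_value:.3f})")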
6) Simple SQL patterns
Stable random assignment
-- Example: 50/50 user-level assignment using a hash bucket.
-- FARM_FINGERPRINT is BigQuery-specific; substitute your warehouse's hash function.
-- Hashing user_id together with an experiment-specific salt keeps assignment
-- sticky within this test and independent across tests.
WITH eligible AS (
  SELECT DISTINCT user_id
  FROM orders
  WHERE created_at >= DATE '2025-01-01'
)
SELECT
  user_id,
  CASE WHEN MOD(ABS(FARM_FINGERPRINT(CONCAT(CAST(user_id AS STRING), ':reorder_button_v1'))), 100) < 50
       THEN 'control' ELSE 'treatment' END AS variant
FROM eligible;
Compute a primary metric
-- 30-day repeat purchase rate: share of assigned users whose first
-- order is followed by at least one more order within 30 days
WITH base AS (
  SELECT user_id, MIN(order_date) AS first_order
  FROM orders
  GROUP BY user_id
),
follow AS (
  -- an order strictly after the first order counts as a repeat; compare
  -- timestamps or order_id instead if same-day repeats should count
  SELECT DISTINCT o.user_id
  FROM orders o
  JOIN base b USING (user_id)
  WHERE o.order_date > b.first_order
    AND o.order_date <= b.first_order + INTERVAL 30 DAY
)
SELECT a.variant,
       AVG(CASE WHEN f.user_id IS NOT NULL THEN 1 ELSE 0 END) AS repeat_rate_30d
FROM assignment a
LEFT JOIN follow f USING (user_id)
GROUP BY a.variant;
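To turn the query output into a readout, a hand-rolled 95% confidence interval for the absolute lift (the counts below are invented):
from math import sqrt
from scipy.stats import norm

x_c, n_c = 2_100, 10_500  # control: repeaters, assigned users (hypothetical)
x_t, n_t = 2_310, 10_500  # treatment
p_c, p_t = x_c / n_c, x_t / n_t
diff = p_t - p_c
se = sqrt(p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c)  # Wald standard error
z = norm.ppf(0.975)  # two-sided 95%
print(f"absolute lift: {diff:+.4f}, 95% CI [{diff - z * se:+.4f}, {diff + z * se:+.4f}]")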
Drills and exercises
- Write a one-sentence hypothesis for a new onboarding tooltip. Specify primary and guardrail metrics.
- Given p=0.05 baseline conversion and MDE +0.5 pp, decide if the test is feasible in 2 weeks with 50k daily eligible sessions.
- Identify the safest randomization unit for a price display experiment and explain why.
- Draft a pre-experiment analysis plan with a stop/go rule and data quality checks.
- List three risks of contamination and one mitigation for each.
Common mistakes and debugging tips
- Peeking at results early: inflates false positives. Tip: predefine the analysis date, or use sequential testing methods that correct for repeated looks.
- Wrong randomization unit: leads to spillovers. Tip: match unit to exposure and outcome; use sticky assignment.
- Underpowered tests: inconclusive results. Tip: increase the MDE, traffic, or duration, or apply variance-reduction methods (e.g., CUPED) where available.
- Metric misalignment: optimizing clicks, not revenue. Tip: choose a primary metric tied to the decision.
- Ignoring SRM (Sample Ratio Mismatch): sign of broken assignment/logging. Tip: chi-square check early; pause if SRM detected.
- Multiple comparisons without control: too many segment tests inflate false positives. Tip: pre-specify segments and adjust alpha (see the sketch after this list).
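One common adjustment, assuming statsmodels (the segment p-values are invented):
from statsmodels.stats.multitest import multipletests

p_values = [0.012, 0.034, 0.200, 0.047, 0.001]  # hypothetical segment tests
reject, p_adjusted, _, _ = multipletests(p_values, alpha=0.05, method='bonferroni')
for p, r in zip(p_adjusted, reject):
    print(f"adjusted p = {p:.3f}, reject H0: {r}")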
Quick SRM check
Expected 50/50 split; observed 52/48 with large N. Run a chi-square goodness-of-fit test. If p-value < 0.01, investigate assignment and eligibility filters.
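A minimal version of this check, assuming scipy (the counts are illustrative):
from scipy.stats import chisquare

observed = [52_000, 48_000]         # users observed per variant
expected = [sum(observed) / 2] * 2  # planned 50/50 split
stat, p_value = chisquare(observed, f_exp=expected)
if p_value < 0.01:
    print(f"Possible SRM (p = {p_value:.2e}); pause and investigate")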
Mini project: Design a retention experiment
Goal: Improve 14-day retention by adding a weekly tips email for new users.
- Frame the problem and write H1/H0.
- Pick primary metric (e.g., 14-day active rate) and guardrails (unsubscribe, complaint rate, support tickets).
- Define population: new users created in the test window; exclude users who opted out of emails.
- Randomize at user level; plan sticky assignment and logging checks.
- Estimate MDE and sample size; propose duration based on signups/day.
- Write a pre-experiment plan including SRM monitoring and a success decision rule.
Deliverables checklist
- One-page brief with hypothesis, metrics, and success rules.
- Eligibility SQL sketch and assignment approach.
- Sample size and duration math with assumptions.
- Pre-experiment analysis plan.
Practical projects
- Run a button color test on a staging environment; practice SRM checks and metric logging validation.
- Design an upsell banner experiment with a revenue guardrail; present a go/no-go decision memo.
- Plan a geo holdout to measure notification impact on DAU while avoiding cross-user contamination.
Learning path
- Start with Business Problem Framing and Hypothesis Definition.
- Move to Primary and Guardrail Metrics, then Population and Randomization Unit.
- Practice Sample Size, MDE, and Duration planning.
- Finalize with a solid Pre-Experiment Analysis Plan and launch checks.
Next steps
- Complete the subskills below to build solid foundations.
- Take the skill exam to check your readiness. Anyone can take it for free; logged-in users get saved progress.
- Apply the mini project at work or with sample datasets to build a portfolio artifact.