Why this matters
Choosing the right experimental unit and randomization scheme is the foundation of trustworthy A/B tests. As a Data Scientist, you will be asked to:
- Decide whether to randomize by user, session, device, household, store, city, or time.
- Prevent spillovers (interference) and contamination between groups.
- Balance covariates, detect assignment issues, and ensure results are analyzable.
- Communicate trade-offs between precision, power, and practicality.
Concept explained simply
Two questions define solid experiments:
- Unit selection: Who or what receives the treatment? (e.g., a user, a store, a city)
- Randomization: How do we assign units to groups in a fair, reproducible way?
Pick the smallest unit that:
- Experiences the treatment consistently (exposure is well-defined).
- Does not affect other units' outcomes (no or acceptable interference).
- Matches how outcomes are measured (analysis unit aligns with assignment or is properly modeled).
Mental model
- Who is treated? The entity that actually experiences the variant.
- Where can interference flow? Within device, across devices for a user, across people in a household, across stores in a region, across friends in a network.
- What do we measure? Choose metrics at or above the assignment level to avoid bias or use models that account for clustering.
- How does randomness happen? Deterministic hashing or random draws that assign each unit to treatment or control in a reproducible, sticky way.
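A minimal sketch of sticky, hash-based assignment in Python (the experiment salt and bucket names are illustrative, not from the source):

    import hashlib

    def assign_variant(unit_id: str, salt: str = "exp_layout_v1", treat_share: float = 0.5) -> str:
        # Hash the salted unit id, map it to [0, 1], and compare against the split.
        digest = hashlib.sha256(f"{salt}:{unit_id}".encode()).hexdigest()
        bucket = int(digest[:8], 16) / 0xFFFFFFFF
        return "treatment" if bucket < treat_share else "control"

    # Sticky and reproducible: the same unit always lands in the same variant.
    assert assign_variant("user_123") == assign_variant("user_123")

Using a different salt per experiment keeps a unit from inheriting the same bucket across unrelated tests.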
Core techniques
- User-level randomization: Best for logged-in, cross-device experiences.
- Session/device/cookie-level randomization: Use only if exposure is session-bound and cross-session contamination is unlikely.
- Cluster randomization (e.g., by household, store, city): Use when within-cluster spillovers are strong.
- Blocked/stratified randomization: Balance important covariates (e.g., platform, country, pre-period activity) before assignment.
- Sticky assignment: A unit always gets the same variant across the test window.
- SRM checks: Sample Ratio Mismatch indicates randomization or tracking problems.
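As a concrete illustration of the SRM check, a minimal sketch assuming a planned 50/50 split (the counts below are made up):

    from scipy.stats import chisquare

    observed = [101_203, 99_512]            # units actually logged in control / treatment
    total = sum(observed)
    expected = [total * 0.5, total * 0.5]   # planned allocation shares

    stat, p_value = chisquare(f_obs=observed, f_exp=expected)
    if p_value < 0.001:                     # a common, conservative SRM alarm threshold
        print(f"Possible SRM (p = {p_value:.2g}): check assignment and logging before trusting results.")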
Worked examples
Example 1: Free shipping banner on an e-commerce site
Situation: Banner appears for logged-in users across web and app.
- Unit: User account (to stay consistent across devices).
- Randomization: Deterministic hash of user_id to buckets, 50/50 split.
- Risks: Logged-out traffic. Mitigation: For anonymous users, either exclude them from the test or randomize by cookie with a short exposure window and analyze that cohort separately.
- Outcome: Per-user conversion rate and revenue per user over test window.
Example 2: Notification send-time experiment
Situation: Compare two send schedules for push notifications.
- Unit: User (schedules influence multiple days; per-session assignment would cross-contaminate).
- Randomization: User-level hashing to Schedule A or B, sticky for entire test.
- Outcome: Per-user opens, conversions, and opt-out rate during the test.
- Note: Repeated measures per user are aggregated at user-level for analysis.
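A minimal sketch of that aggregation step, assuming event-level rows with user_id, variant, and opened columns (names are illustrative):

    import pandas as pd

    events = pd.DataFrame({
        "user_id": [1, 1, 2, 2, 3],
        "variant": ["A", "A", "B", "B", "A"],
        "opened":  [1, 0, 1, 1, 0],
    })

    # Collapse repeated measures to one row per user so the analysis unit matches the assignment unit.
    per_user = (events
                .groupby(["user_id", "variant"], as_index=False)
                .agg(opens=("opened", "sum"), events=("opened", "size")))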
Example 3: Price change with potential arbitrage
Situation: Price differences may leak across users in the same location.
- Unit: City (cluster randomization).
- Randomization: Randomly assign cities to treatment/control with stratification by pre-period revenue and region.
- Outcome: City-level revenue and units sold.
- Trade-off: Fewer clusters reduce power; account for design effect and use pre-period covariates to improve precision.
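A minimal sketch of the design-effect adjustment (the individual-level sample size below is a hypothetical placeholder):

    def design_effect(m: float, rho: float) -> float:
        # m = average units per cluster, rho = intra-cluster correlation
        return 1 + (m - 1) * rho

    n_individual = 20_000                    # from an individual-level power calculation (hypothetical)
    deff = design_effect(m=500, rho=0.02)    # 10.98
    n_cluster = n_individual * deff          # roughly the total units the cluster design needs
    print(round(deff, 2), round(n_cluster))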
How to decide your unit and randomization
- Map exposure: Where does the treatment actually touch the user/system?
- List interference paths: Same user across devices? Users influencing each other? Shared caches?
- Choose the smallest safe unit: Avoid interference and ensure consistent exposure.
- Make assignment sticky: Deterministic and reproducible through the whole test.
- Balance covariates: Block/stratify on key variables (platform, country, activity); a minimal assignment sketch follows this list.
- Plan analysis: Aggregate to the assignment level or use appropriate clustered models.
- Add guardrails: Monitor SRM and key health metrics.
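A minimal sketch of blocked/stratified assignment, randomizing within each stratum so groups stay balanced on the blocking variable (the platform column and seed are illustrative assumptions):

    import numpy as np
    import pandas as pd

    users = pd.DataFrame({
        "user_id": range(8),
        "platform": ["ios", "ios", "android", "android", "web", "web", "ios", "web"],
    })

    rng = np.random.default_rng(seed=42)     # fixed seed keeps the draw reproducible
    parts = []
    for platform, group in users.groupby("platform"):
        # Shuffle within each stratum, then split it as evenly as possible between variants.
        shuffled = group.sample(frac=1, random_state=int(rng.integers(1_000_000))).copy()
        half = len(shuffled) // 2
        shuffled["variant"] = ["treatment"] * half + ["control"] * (len(shuffled) - half)
        parts.append(shuffled)

    assignments = pd.concat(parts).sort_values("user_id")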
Common mistakes and self-check
- Mixing assignment and analysis units: Analyzing per-session when randomizing by user inflates Type I error. Self-check: Aggregate metrics at user-level if assignment is by user.
- Non-sticky assignment: Users switch variants across sessions. Self-check: Verify a unit's variant is constant over time.
- Ignoring spillovers: Friends, households, or stores influence each other. Self-check: Sketch likely interference paths; consider cluster randomization.
- No stratification: Imbalance on platform or country increases variance. Self-check: Compare pre-period covariates across groups before launch.
- Undefined exposure window: Partial exposure leads to diluted effects. Self-check: Define inclusion criteria (e.g., active users during test period).
- Sample Ratio Mismatch (SRM) not monitored: Assignment or tracking bugs go unnoticed. Self-check: Run chi-square test for expected allocation shares.
- Underestimating cluster variance: Using individual-level formulas for cluster designs. Self-check: Apply design effect = 1 + (m - 1)ρ, where m is the average cluster size and ρ is the intra-cluster correlation.
- Forgetting repeated measures correlation: Per-event analysis pretends observations are independent. Self-check: Aggregate per unit or use cluster-robust methods.
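A minimal sketch of the cluster-robust option, using simulated event-level data and statsmodels (variable names and effect size are illustrative):

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(0)
    n_users, events_per_user = 200, 5
    df = pd.DataFrame({
        "user_id": np.repeat(np.arange(n_users), events_per_user),
        "treated": np.repeat(rng.integers(0, 2, n_users), events_per_user),
    })
    df["outcome"] = 0.1 * df["treated"] + rng.normal(size=len(df))

    # Clustering standard errors by the assignment unit accounts for repeated measures per user.
    model = smf.ols("outcome ~ treated", data=df).fit(
        cov_type="cluster", cov_kwds={"groups": df["user_id"]})
    print(model.summary().tables[1])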
Exercises
Complete these in order. Then check your answers below the tasks.
Exercise 1: Choose the unit and randomization
A music app tests a new playlist layout that persists across app and web. Some users are logged-in; some are anonymous. Define:
- Your assignment unit(s) and handling for anonymous visitors.
- How you ensure sticky assignment.
- Primary analysis unit and key guardrails.
Hints
- Think about cross-device consistency.
- Decide whether to exclude or separately handle anonymous traffic.
- Guardrails often include SRM and platform-level balance.
Suggested answer:
- Unit: Logged-in users randomized by user_id. Anonymous traffic either excluded or randomized by cookie with separate analysis.
- Sticky assignment: Deterministic hash of user_id (or cookie_id) into buckets; same value every visit.
- Analysis: Per-user outcomes (e.g., weekly listening minutes) for logged-in cohort; anonymous cohort analyzed separately or excluded.
- Guardrails: SRM check, platform mix balance (iOS/Android/Web), app crash rate, opt-out/uninstall rate.
Exercise 2: Cluster design effect
You randomize by city. Average users per city m = 500. Intra-cluster correlation ρ = 0.02. Compute the design effect and describe how it impacts required sample size.
Hints
- Use design effect = 1 + (m - 1)ρ.
- Design effect inflates variance; required N scales roughly by this factor.
Compute: 1 + (500 - 1) × 0.02 = 1 + 499 × 0.02 = 1 + 9.98 = 10.98.
Impact: You need about 10.98× the sample (or time) compared to an individual-level design for the same detectable effect size and power.
Exercise checklist
- I stated a clear unit aligned with exposure.
- I ensured sticky assignment and reproducibility.
- I identified interference and justified cluster vs. individual randomization.
- I aligned analysis with assignment or planned clustered methods.
- I considered stratification and SRM monitoring.
Practical projects
- Project 1: Write a one-page randomization plan for three scenarios: UI layout change (user-level), store signage change (store-level), regional pricing (city-level). Include unit, randomization, stratification, analysis unit, guardrails.
- Project 2: Build a mock assignment table (in a spreadsheet) using a hash-like deterministic rule for 10,000 synthetic users; verify stickiness and 50/50 balance by platform.
- Project 3: For a cluster test with ρ values {0.005, 0.02, 0.05} and m = 300, compute design effects and rewrite your power assumptions accordingly.
Learning path
- Hypotheses and outcome metrics.
- Randomization and unit selection (this page).
- Blocking/stratification and guardrail metrics.
- Power, MDE, and sample size with cluster adjustments when needed.
- Execution playbook: instrumentation, SRM monitoring, and data QA.
- Analysis: aggregation, variance estimation, and cluster-robust methods.
- Sequential testing and test governance.
- Advanced: network experiments and interference-aware designs.
Mini tasks
- List two potential interference paths in your current product and how you would block them.
- Draft a one-sentence exposure definition for your next experiment.
- Pick one covariate to stratify on and explain why it matters for variance.
Next steps
- Apply these principles to your next planned experiment and document unit, randomization, and analysis alignment.
- Prepare a short checklist your team can reuse before launching any test.
- Move on to blocking/stratification and power analysis to tighten precision.
Mini challenge
Your marketplace launches a new "bundle discount" that can be seen by buyers and sellers. Buyers and sellers often interact repeatedly within a city. Propose:
- The experimental unit (and why).
- Your randomization approach (include stratification if any).
- Primary analysis unit/metrics and how you will handle interference.
- Guardrails and SRM plan.
Considerations
- Cross-role spillovers (buyer-seller) and geographic clustering.
- Design effect and number of clusters.
- Pre-period covariate balance to improve precision.