Who this is for
- Data Scientists who plan A/B tests, A/A tests, or model rollouts.
- Analysts and Product Managers converting ideas into measurable experiments.
- Engineers who need crisp, testable specs for changes.
Prerequisites
- Basic statistics: proportions/means, p-values, confidence intervals.
- Familiarity with key metrics (conversion rate, CTR, retention).
- Fundamentals of A/B testing (exposure, randomization, sample size).
Why this matters
In real data science work you will:
- Turn vague requests like “boost engagement” into testable statements.
- Choose a primary metric and guardrails before any code changes.
- Prevent p-hacking by defining direction, audience, and thresholds upfront.
- Communicate clearly with stakeholders and speed up approvals.
- Reduce wasted experiments and make your results decision-ready.
Concept explained simply
A hypothesis is a clear, testable statement about how a change will affect a metric for a specific audience within a timeframe, and why.
Use this format:
Because [mechanism], changing [X] for [audience] will [increase/decrease/no change] [primary metric] by [Δ or at least Δ] within [timeframe], without worsening [guardrail metrics] beyond [limits].
You’ll state both:
- H0 (Null): No effect (or effect < threshold).
- H1 (Alternative): Effect in the stated direction (or ≥ threshold).
Mental model: SMART-HG
- Subject & Scope: who/where the change applies.
- Mechanism: why the change should work (the causal story).
- Action: what exactly is changing (the treatment).
- Result metric: one primary metric tied to the goal.
- Threshold & Time: minimum detectable improvement and evaluation window.
- H0/H1: explicit null and alternative hypotheses.
- Guardrails: must-not-worsen metrics with limits.
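To make this concrete, here is a minimal sketch that captures each SMART-HG field in one object before any experiment code exists (Python; the class and field names are illustrative, not a standard schema):

```python
# Minimal sketch: record every SMART-HG field up front.
# Class and field names are illustrative, not a standard schema.
from dataclasses import dataclass, field

@dataclass
class HypothesisSpec:
    subject_scope: str    # who/where the change applies
    mechanism: str        # the causal story for why it should work
    action: str           # the treatment: what exactly changes
    primary_metric: str   # one metric tied to the decision
    mde: str              # minimum effect worth shipping, with units
    timeframe: str        # evaluation window
    h0: str               # explicit null hypothesis
    h1: str               # explicit alternative hypothesis
    guardrails: dict = field(default_factory=dict)  # metric -> limit

spec = HypothesisSpec(
    subject_scope="Desktop web, all geographies",
    mechanism="Higher contrast improves visual salience",
    action="Change checkout button from gray to green",
    primary_metric="Purchase conversion (session -> order)",
    mde="+1 pp absolute",
    timeframe="2 weeks",
    h0="Conversion increase < 1 pp",
    h1="Conversion increase >= 1 pp",
    guardrails={"refund rate": "<= 0.5%", "AOV drop": "<= 1%"},
)
```

If a field is hard to fill in, the hypothesis is not ready to test.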
Tip: Directional vs. two-sided
Use a one-sided hypothesis when your decision rule is asymmetric (e.g., you’ll only ship if it increases conversion). Use two-sided when any change (up or down) matters (e.g., latency must be stable).
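A minimal sketch of the difference in practice, using statsmodels' two-proportion z-test; the counts below are invented for illustration:

```python
# Same invented data, tested one-sided vs. two-sided (requires statsmodels).
from statsmodels.stats.proportion import proportions_ztest

conversions = [530, 500]     # treatment, control
sessions = [10_000, 10_000]

# One-sided: asymmetric decision rule (ship only if treatment is higher).
_, p_one = proportions_ztest(conversions, sessions, alternative="larger")
# Two-sided: any movement, up or down, matters.
_, p_two = proportions_ztest(conversions, sessions, alternative="two-sided")

print(f"one-sided p = {p_one:.3f}, two-sided p = {p_two:.3f}")
```

The one-sided p-value is half the two-sided one here, which is exactly why the choice must be pre-committed rather than made after seeing the data.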
Worked examples
Example 1 — E-commerce checkout button color
Hypothesis (H1): Because higher contrast improves visual salience, changing the checkout button from gray to green for all desktop users will increase purchase conversion by at least 1 percentage point (absolute) over 2 weeks, without increasing refund rate above 0.5% or decreasing average order value by more than 1%.
H0: Conversion increase < 1 pp, or guardrails violated.
- Primary metric: Purchase conversion (session → order).
- Audience: Desktop web, all geographies.
- Timeframe: 2 weeks.
- MDE: 1 pp absolute.
- Guardrails: Refund rate, AOV.
Why this is framed well
- Includes mechanism (salience).
- Sets direction and minimal threshold (decision rule).
- Defines scope (desktop) and guardrails.
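Before launch you would also size this test against the 1 pp MDE. A rough sketch, assuming a 5% baseline conversion rate (the baseline is an assumption; substitute your own) and statsmodels:

```python
# Rough per-arm sample size for Example 1. The 5% baseline is assumed.
from statsmodels.stats.proportion import proportion_effectsize
from statsmodels.stats.power import NormalIndPower

baseline = 0.05   # assumed current conversion rate
mde_abs = 0.01    # +1 pp absolute, from the hypothesis

effect = proportion_effectsize(baseline + mde_abs, baseline)  # Cohen's h
n_per_arm = NormalIndPower().solve_power(
    effect_size=effect, alpha=0.05, power=0.8, alternative="larger"
)
print(f"~{n_per_arm:,.0f} sessions per arm")  # roughly 6,400 with these inputs
```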
Example 2 — Recommendation ranking model
Hypothesis (H1): Because the new ranking model increases relevance via better user-item embeddings, replacing the current ranker for logged-in mobile users will increase homepage CTR by at least 3% relative (3–5% expected) within 3 weeks, without increasing bounce rate by more than 0.5 pp or increasing average latency by more than 50 ms.
H0: CTR lift < 3% or guardrails violated.
- Primary metric: Homepage CTR.
- Audience: Logged-in mobile users only.
- Timeframe: 3 weeks.
- MDE: 3% relative.
- Guardrails: Bounce rate, latency.
Why this is framed well
- Aligns to product goal (engagement) with relevance mechanism.
- Sets explicit latency guardrail for user experience.
- Segmented audience reduces noise and focuses impact.
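One subtlety here: a relative MDE has to be converted to an absolute delta before any power math. A rough sketch, assuming a 4% baseline homepage CTR (an assumption for illustration) and statsmodels:

```python
# Convert the relative MDE to absolute, then size the test (statsmodels).
from statsmodels.stats.proportion import proportion_effectsize
from statsmodels.stats.power import NormalIndPower

baseline_ctr = 0.04               # assumed homepage CTR
target_ctr = baseline_ctr * 1.03  # +3% relative lift

effect = proportion_effectsize(target_ctr, baseline_ctr)
n_per_arm = NormalIndPower().solve_power(
    effect_size=effect, alpha=0.05, power=0.8, alternative="larger"
)
# Small relative lifts on low base rates need very large samples
# (on the order of hundreds of thousands of users per arm here).
print(f"+3% relative = +{(target_ctr - baseline_ctr) * 100:.2f} pp absolute; "
      f"~{n_per_arm:,.0f} users per arm")
```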
Example 3 — Pricing page copy
Hypothesis (H1): Because clarifying benefits reduces confusion, replacing jargon with plain-language bullets on the pricing page for new visitors will decrease exit rate by at least 5% relative over 1 week, without reducing free-trial starts by more than 1%.
H0: Exit rate reduction < 5% or free-trial starts drop > 1%.
- Primary metric: Pricing page exit rate.
- Audience: New visitors.
- Timeframe: 1 week.
- MDE: 5% relative.
- Guardrail: Free-trial start rate.
- Assumption: Traffic sources remain stable.
Why this is framed well
- States a plausible mechanism (clarity).
- Protects downstream conversion via guardrail.
- Short evaluation window suited for high-traffic page.
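Guardrails like this one are really non-inferiority questions: is the drop within the allowed margin? A minimal sketch with statsmodels, using invented counts and an absolute margin that approximates the 1% relative limit:

```python
# Non-inferiority check on the free-trial guardrail (invented counts).
from statsmodels.stats.proportion import proportions_ztest

starts = [985, 1000]        # treatment, control free-trial starts
visitors = [50_000, 50_000]

control_rate = starts[1] / visitors[1]   # ~2.0%
margin = 0.01 * control_rate             # 1% relative drop, as absolute

# H0: treatment rate - control rate <= -margin (guardrail violated)
# H1: treatment rate - control rate >  -margin (drop within the margin)
_, p = proportions_ztest(starts, visitors, value=-margin, alternative="larger")
print(f"non-inferiority p = {p:.3f} (small p = evidence the drop is within margin)")
```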
Step-by-step: framing a strong hypothesis
- Define outcome — Pick one primary metric tightly tied to the decision.
- Specify audience — Segment by platform, geography, user state, or funnel stage.
- Articulate mechanism — The causal reason this change should move the metric.
- Set threshold — Minimum effect size worth shipping (MDE) and direction.
- Add guardrails — Metrics that must not degrade beyond set limits.
- Choose timeframe — Enough time to capture behavior and stabilize variance.
- Write H0/H1 — Make them falsifiable and tied to the decision rule.
- Pre-commit — Record the hypothesis before running the experiment (a sketch follows below).
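For the pre-commit step, even a plain string template forces every field to exist before launch. A minimal sketch (field names mirror SMART-HG and are illustrative):

```python
# Render the pre-committed hypothesis as a doc snippet before launch.
from datetime import date

PREREG_TEMPLATE = """\
Hypothesis (pre-registered {date})
H1: {h1}
H0: {h0}
Primary metric: {metric} | Audience: {audience}
MDE: {mde} | Timeframe: {timeframe}
Guardrails: {guardrails}
"""

print(PREREG_TEMPLATE.format(
    date=date.today().isoformat(),
    h1="Green checkout button raises conversion by >= 1 pp",
    h0="Conversion increase < 1 pp",
    metric="Purchase conversion (session -> order)",
    audience="Desktop web, all geographies",
    mde="+1 pp absolute",
    timeframe="2 weeks",
    guardrails="refund rate <= 0.5%; AOV drop <= 1%",
))
```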
Checklist — does your hypothesis pass?
- Primary metric is single, decision-aligned, and measurable.
- Audience/segment is explicit.
- Direction and threshold are stated (or two-sided if required).
- Mechanism is plausible and specific.
- Guardrails and limits are defined.
- Timeframe is realistic for traffic and behavior cycles.
- H0/H1 are explicit and testable.
Exercises
Practice here, then compare with solutions. Everyone can do the exercises; only logged-in users will have their progress saved.
Exercise 1 — Rewrite vague goals as testable hypotheses
Take each vague statement and rewrite it using the template and SMART-HG. Then state H0 and H1.
- “Improve onboarding.”
- “Reduce churn.”
- “Make search better.”
Hints
- Pick one primary metric per statement.
- Specify the audience and timeframe.
- Add a minimum detectable effect and guardrails.
Exercise 2 — Define metrics, guardrails, and thresholds
Scenario: You will add personalized subject lines to marketing emails for active subscribers.
- Choose a primary metric and 1–2 guardrails.
- State a plausible mechanism.
- Write H0/H1 with direction and threshold.
- Pick an evaluation window.
Hints
- Email experiments typically optimize for open rate or downstream conversion.
- Guardrails might include unsubscribe rate or spam complaints.
- Short windows can work if you send at high volume each week.
Self-check after exercises
- Can someone reading your hypothesis run the test without extra clarifications?
- Is the decision rule obvious from the threshold and guardrails?
- Would you accept “no ship” if results don’t meet H1? If not, refine.
Common mistakes and self-check
- Too many primary metrics: Pick one; others are guardrails or secondary.
- Vague audience: Name the platform, user type, geography, or funnel stage.
- No mechanism: Add the causal story; it guides diagnostics.
- No threshold: Without MDE you can’t decide to ship or stop.
- Missing guardrails: Prevent harmful trade-offs (e.g., CTR vs. latency).
- Open-ended timeframe: Predefine the window to avoid peeking bias.
Quick self-audit
- Can H0/H1 be falsified with planned data?
- Is the metric stable enough in the chosen window?
- Are seasonal or campaign confounders controlled?
Practical projects
- Audit 5 past experiments: rewrite each hypothesis with SMART-HG and note what changed in decisions.
- Create a hypothesis library: 10 ideas mapped to metrics, thresholds, and guardrails for your product area.
- Run a simulated A/A test plan with a fully pre-registered hypothesis and a “no effect” decision rule (a simulation sketch follows below).
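For the A/A project, a minimal simulation sketch: draw many A/A splits under a true null and confirm the false-positive rate is close to alpha (numpy and scipy assumed; all parameters are illustrative):

```python
# Simulate A/A tests under a true null; expect ~alpha false positives.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
alpha, n_sims, n_per_arm, p_true = 0.05, 2_000, 10_000, 0.05

false_positives = 0
for _ in range(n_sims):
    a = rng.binomial(n_per_arm, p_true)   # "conversions" in arm A
    b = rng.binomial(n_per_arm, p_true)   # "conversions" in arm B
    p_pool = (a + b) / (2 * n_per_arm)
    se = np.sqrt(2 * p_pool * (1 - p_pool) / n_per_arm)
    z = ((a - b) / n_per_arm) / se
    if 2 * stats.norm.sf(abs(z)) < alpha:  # two-sided p-value
        false_positives += 1

print(f"false-positive rate: {false_positives / n_sims:.3f} (expect ~{alpha})")
```

If the observed rate is far from alpha, fix the measurement pipeline before trusting any A/B result.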
Learning path
- Before: Metrics design, randomization basics, power/MDE intuition.
- Now: Hypothesis framing (this lesson) — tie ideas to measurable outcomes and guardrails.
- Next: Sample size and power, experiment execution, effect interpretation, and iteration planning.
Next steps
- Convert one live idea into a SMART-HG hypothesis and review with your team.
- Pre-register the hypothesis in your experiment doc before implementation.
- Take the quick test below to check retention.
Mini challenge
Write a one-sentence hypothesis for a navigation redesign on mobile that targets task completion. Include audience, mechanism, metric, threshold, timeframe, and one guardrail.
Quick Test
Everyone can take the test. Only logged-in users will have their progress saved.