
Exploratory Hypothesis Generation

Learn Exploratory Hypothesis Generation for free with explanations, exercises, and a quick test (for Data Analysts).

Published: December 19, 2025 | Updated: December 19, 2025

Who this is for

Data Analysts who want to turn messy observations into clear, testable ideas that guide analysis, experiments, and stakeholder decisions.

Prerequisites

  • Basic descriptive stats (mean, median, rate, proportion).
  • Comfort exploring data in SQL, spreadsheets, or Python/R.
  • Familiarity with key business metrics (conversion, retention, revenue).

Why this matters

In real projects, you rarely get a clear question. You spot a pattern, need to explain it, and decide what to try next. Hypothesis generation bridges observation and action. You will use it to:

  • Pinpoint why a metric moved (e.g., conversion dip after a release).
  • Design experiments with clear expected outcomes.
  • Prioritize work when there are many plausible causes.
  • Communicate assumptions and decision logic to non-technical teammates.

Concept explained simply

A hypothesis is a concise, testable guess that links a cause to an expected effect on a specific metric and segment.

Use this template:

If we change/do/observe [cause/lever] for [segment], then [metric] will [direction/amount] because [mechanism], measured by [method] over [time window].

Mental model

  • Observe: Notice a pattern, anomaly, or trend.
  • Explain: Propose a mechanism (why it could happen).
  • Specify: Turn it into a testable statement with metric, segment, and time.
  • Assumptions: List what must be true for it to hold.
  • Probe: Run a quick check (slice, plot, backtest, or experiment).

Good vs. weak hypothesis (quick comparison)

  • Weak: "Mobile looks worse than desktop." (observation, not testable)
  • Good: "If we fix checkout address auto-fill on mobile for US visitors, purchase conversion will increase by 3–5% because form friction is highest on mobile, measured by session conversion within 14 days."

Quick workflow

  1. Frame the metric and context: What metric, what time span, which segment?
  2. List observations: 3–5 facts from the data (no explanations yet).
  3. Generate 5–10 hypotheses using the If–Then–Because template.
  4. Check assumptions with a fast slice or plot; drop ideas that fail basic checks.
  5. Prioritize with ICE (Impact, Confidence, Effort) or PIE (Potential, Importance, Ease) scores.
  6. Plan tests: Define measurement, success criteria, and time window.

Worked examples

Example 1: Signup conversion dropped 4 percentage points week-over-week

Observations

  • Mobile signup conversion fell from 52% to 48%; desktop stable.
  • Traffic mix unchanged; page load time +300ms on mobile.
  • Release notes show a new email verification step added.

Hypotheses

  • H1: If we make email verification optional on mobile for first login, signup conversion will recover by 3–4% because extra steps increase abandonment, measured as signup completion rate over 14 days.
  • H2: If we optimize mobile page assets to reduce load time by 300ms, conversion will increase by 1–2% because latency impacts form completion.
  • H3: If we move email verification after profile setup, conversion will increase by 2–3% because users perceive immediate value before friction.

Quick checks

  • Slice by device and step: abandonments spike at verification step on mobile.
  • Latency vs. conversion scatter: modest negative correlation (r ≈ -0.2).
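
Both checks reduce to simple group-by slices. Below is a minimal pandas sketch of the same idea; the toy event table and its column names (device, step, completed, load_ms) are assumptions, not a real schema.

```python
import pandas as pd

# Toy signup-funnel events; in practice this comes from your warehouse.
# The columns (session_id, device, step, completed) are assumptions.
events = pd.DataFrame({
    "session_id": [1, 1, 2, 2, 3, 3, 4, 4],
    "device":  ["mobile", "mobile", "mobile", "mobile",
                "desktop", "desktop", "mobile", "mobile"],
    "step":    ["form", "verify", "form", "verify",
                "form", "verify", "form", "verify"],
    "completed": [1, 0, 1, 1, 1, 1, 1, 0],
})

# Completion rate by device and funnel step: a drop concentrated at the
# "verify" step on mobile is evidence for H1.
print(events.groupby(["device", "step"])["completed"].mean().unstack("step"))

# Session-level latency vs. conversion (toy numbers): a negative Pearson r
# is consistent with H2 but is not proof of causation.
sessions = pd.DataFrame({
    "load_ms":   [900, 1200, 1500, 800, 2000, 1100],
    "converted": [1, 1, 0, 1, 0, 1],
})
print(sessions["load_ms"].corr(sessions["converted"]))
```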

Prioritization (ICE = Impact × Confidence / Effort): H1 (9,7,6) = 10.5; H2 (5,6,5) = 6.0; H3 (7,6,6) = 7.0 → Try H1 first.
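
The ranking above can be reproduced in a few lines of Python. This sketch uses the ICE = (Impact × Confidence) / Effort form defined in Exercise 2 below; the hypothesis labels are shorthand for H1–H3 above.

```python
# Minimal ICE prioritization: ICE = (Impact * Confidence) / Effort,
# each scored 1-10, with higher Effort meaning harder to build.
hypotheses = {
    "H1: make email verification optional": (9, 7, 6),
    "H2: cut mobile load time by 300ms":    (5, 6, 5),
    "H3: move verification after setup":    (7, 6, 6),
}

ranked = sorted(
    ((name, impact * confidence / effort)
     for name, (impact, confidence, effort) in hypotheses.items()),
    key=lambda item: item[1],
    reverse=True,
)

for name, score in ranked:
    print(f"{name}: ICE = {score:.1f}")
# Prints H1 first (10.5), then H3 (7.0), then H2 (6.0).
```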

Example 2: Churn increased for monthly subscribers

Observations

  • Monthly churn up from 4.0% to 5.5% in Q2; annual stable.
  • Support tickets about billing date conflicts increased.
  • Usage for new monthly users dips in week 2.

Hypotheses

  • H1: If we shift billing retries from weekends to weekdays, churn will drop by 0.5–1.0 pp because banks decline more weekend charges, measured by successful recoveries over 30 days.
  • H2: If we add a week-2 usage nudge, churn will fall by 0.3–0.6 pp because low early engagement predicts cancellation.

Quick checks

  • Weekend retry success 9% lower than weekdays.
  • Users with 0 sessions in week 2 churn 3x more than active users.
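
Both checks are again group-by comparisons. The sketch below assumes toy billing-retry and user tables; the column names are illustrative, not a real schema.

```python
import pandas as pd

# Toy billing-retry log; retry_date and succeeded are assumed columns.
retries = pd.DataFrame({
    "retry_date": pd.to_datetime(
        ["2024-04-05", "2024-04-06", "2024-04-07", "2024-04-08",
         "2024-04-13", "2024-04-15"]),
    "succeeded": [1, 0, 1, 1, 0, 1],
})

# H1 check: success rate for weekend vs. weekday retries.
retries["is_weekend"] = retries["retry_date"].dt.dayofweek >= 5  # Sat=5, Sun=6
print(retries.groupby("is_weekend")["succeeded"].mean())

# H2 check: churn rate for users with zero week-2 sessions vs. active users.
users = pd.DataFrame({
    "week2_sessions": [0, 3, 0, 5, 1, 0],
    "churned":        [1, 0, 1, 0, 0, 0],
})
print(users.groupby(users["week2_sessions"] == 0)["churned"].mean())
```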

Example 3: A/B test shows +8% CTR but no conversion lift

Observations

  • Variant drives more clicks; downstream checkout conversion unchanged.
  • Time on landing lower for variant clickers.

Hypotheses

  • H1: If the new CTA implies a discount that the landing page does not deliver, clicks rise but intent stays low; aligning the landing-page copy with the CTA will raise add-to-cart by 2% because visitor expectations are met.
  • H2: If we qualify clicks with clearer expectations in the CTA, CTR will drop slightly but net purchases will increase by 1–2%.

Measurement: Add-to-cart rate for variant clickers within session, 2-week window.
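
A rough sketch of that measurement, assuming a per-session table with variant, CTA-click, and add-to-cart flags (all column names are assumptions):

```python
import pandas as pd

# Toy per-session experiment data; column names are assumptions.
sessions = pd.DataFrame({
    "variant":       ["control", "variant", "variant",
                      "control", "variant", "control"],
    "clicked_cta":   [1, 1, 1, 0, 1, 1],
    "added_to_cart": [1, 0, 1, 0, 0, 1],
})

# Add-to-cart rate among sessions that clicked the CTA, split by arm.
# In practice, restrict to the 2-week experiment window first.
clickers = sessions[sessions["clicked_cta"] == 1]
print(clickers.groupby("variant")["added_to_cart"].mean())
```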

Practical projects

  • Project 1: Funnel leak finder
    • Pick a 4-step funnel (e.g., visit → product view → add-to-cart → purchase).
    • Write 5 hypotheses explaining the biggest drop-off (a starter sketch for locating it follows this list).
    • Run at least 2 quick checks (segment slice, time-of-day, device).
    • Prioritize top 2 and outline a test plan.
  • Project 2: Retention day-7 deep dive
    • Compute D7 retention by cohort and source.
    • Generate 5 hypotheses on why a source underperforms.
    • Backtest one hypothesis with historical cohorts.
  • Project 3: Performance vs. UX tradeoff
    • Plot page load p95 vs. conversion by device.
    • Formulate hypotheses for improving both metrics.
    • Draft a low-effort change (e.g., image compression) and expected impact.
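
For Project 1, a minimal starter for locating the biggest drop-off might look like the sketch below; the funnel counts are placeholders to replace with your own query results.

```python
# Project 1 starter: find the largest step-to-step drop in a 4-step funnel.
# Counts are placeholders, not real data.
funnel = [
    ("visit", 100_000),
    ("product_view", 42_000),
    ("add_to_cart", 9_000),
    ("purchase", 3_600),
]

for (prev_step, prev_n), (step, n) in zip(funnel, funnel[1:]):
    rate = n / prev_n
    print(f"{prev_step} -> {step}: {rate:.1%} conversion "
          f"({1 - rate:.1%} drop-off)")
# The step with the lowest conversion rate is where hypotheses pay off most.
```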

Exercises

These mirror the graded exercises below. Write your answers, then compare with the sample solutions.

Exercise 1: Craft testable hypotheses from observations

Given

  • Mobile checkout conversion = 38% (previously 45%); desktop = 52% (unchanged).
  • Average mobile load time increased by 400ms last week.
  • New address validation step added to mobile checkout.
  • Most of the decline comes from US visitors; international traffic is stable.

Task: Write 3 testable hypotheses using: If [cause] for [segment], then [metric] will [direction/amount] because [mechanism], measured by [method] over [time window].

Sample solution

  • If we bypass address validation for returning US mobile users, purchase conversion will increase by 4–6% because repeat buyers already have valid addresses, measured by session purchase rate over 14 days.
  • If we reduce mobile checkout load time by 300ms, conversion will increase by 2–3% because latency raises abandonment, measured via variant vs. control conversion in an A/B test for 2 weeks.
  • If we move address validation post-purchase (fraud check before fulfillment), conversion will increase by 3–4% because friction occurs before commitment, measured by checkout completion rate over 14 days.

Exercise 2: Prioritize with ICE scoring

Context: Limited engineering capacity allows only one change this sprint.

Hypotheses

  • H1: Defer email verification until first login.
  • H2: Compress hero images to cut mobile p95 load by 500ms.
  • H3: Add tooltip help on credit card CVC field.
  • H4: Remove optional marketing consent from the main flow.
  • H5: Add Apple Pay for iOS users.

Task: Assign Impact, Confidence, Effort (1–10; higher effort = harder). Compute ICE = (Impact × Confidence) / Effort and rank.

Sample solution

Sample scoring (yours may vary, but rationale matters):

  • H1: I=8, C=7, E=6 → ICE=9.3
  • H2: I=6, C=7, E=4 → ICE=10.5
  • H3: I=3, C=6, E=2 → ICE=9.0
  • H4: I=4, C=8, E=2 → ICE=16.0
  • H5: I=7, C=5, E=8 → ICE=4.4

Rank: H4 > H2 > H1 ≈ H3 > H5. Choose H4 first due to high ROI and low effort.

Checklist before finalizing a hypothesis

  • States a clear cause/lever and mechanism (Because...).
  • Names a metric, segment, and time window.
  • Specifies expected direction or magnitude.
  • Test method is feasible (slice, backtest, or experiment).
  • Assumptions are listed and quickly checkable.

Common mistakes and how to self-check

  • Confusing observation with hypothesis: If it lacks a cause or test, it is not a hypothesis. Add "because" and a measurable outcome.
  • Too broad: Narrow the segment/time. Ask: who, where, when?
  • No mechanism: Without a why, you cannot learn if it fails. Add the mechanism and list assumptions.
  • Unmeasurable outcome: Replace vague "improve" with a specific metric and direction.
  • Ignoring feasibility: If effort is huge, consider a proxy test or backtest first.

Mini challenge

You see a 10% increase in add-to-cart rate on weekends for mobile traffic, but purchase rate is flat. Write two alternative hypotheses that could both be true yet suggest different next steps. Include measurement and time windows.

Learning path

  • Before: Data cleaning, descriptive stats, exploratory plots/slicing.
  • Now: Hypothesis generation and prioritization (this subskill).
  • Next: Experiment design and causal inference basics; metric design and guardrails; communicating insights and recommendations.

Next steps

  • Pick one live metric you monitor. Generate 5 hypotheses, run 2 fast checks, and prioritize 1 to action.
  • Draft a simple test plan with success criteria and timeline.
  • Take the quick test below to validate understanding. Note: Everyone can take it for free; only logged-in users will have progress saved.

Practice Exercises

2 exercises to complete

Instructions

Given

  • Mobile checkout conversion = 38% (previously 45%); desktop = 52% (unchanged).
  • Average mobile load time increased by 400ms last week.
  • New address validation step added to mobile checkout.
  • Most of the decline comes from US visitors; international traffic is stable.

Task: Write 3 testable hypotheses using: If [cause] for [segment], then [metric] will [direction/amount] because [mechanism], measured by [method] over [time window].

Expected Output
Three specific, testable hypotheses including cause, segment, metric, expected direction/amount, mechanism, method, and time window.

Exploratory Hypothesis Generation — Quick Test

Test your knowledge with 8 questions. Pass with 70% or higher.
