Why this matters
Business Analysts often translate ideas into measurable tests. A clear experiment brief aligns product, design, engineering, data, and leadership on the exact question being asked, how success will be measured, the key risks, and what decision will be made with the results. It saves time, reduces rework, and makes results trustworthy.
- Real tasks: size impact of UX changes, evaluate price/plan experiments, test onboarding flows, and prioritize next iterations based on evidence.
- Without a brief: unclear metrics, moving goalposts, invalid results, and stakeholder misalignment.
Concept explained simply
An experiment brief is a short document that answers: What will we change? For whom? What outcome will move? How will we decide? Under what constraints?
Mental model: Contract + Map.
- Contract: a falsifiable claim and decision rule everyone agrees on before launch.
- Map: the scope, risks, and steps so execution and analysis are smooth.
Copy-ready experiment brief template
Title: [Short, specific]
Owner: [Name]
Stakeholders: [Teams/people]
Date: [YYYY-MM-DD]

1) Hypothesis (falsifiable)
We believe that [change] for [audience] will [increase/decrease] [primary behavior metric] by about [magnitude or MDE] within [time window] because [why].

2) Variants and Assignment
Control: [current]
Treatment(s): [describe]
Unit of randomization: [user/account/session]
Eligibility rules: [who is in/out]

3) Metrics
Primary: [single north-star metric]
Secondary (diagnostic): [supporting]
Guardrails (must not worsen): [e.g., error rate, churn, complaints]

4) Decision Rule (pre-commit)
If the treatment changes the primary metric by at least [MDE] in the desired direction with [alpha=0.05, power≈80%] and guardrails do not degrade by more than [threshold], then [ship/rollout plan]. Otherwise [do not ship/iterate].

5) Sample Size & Duration (estimate)
Baseline: [value]
MDE: [value]
Estimated sample/variant: [rough calc]
Expected runtime: [days/weeks], accounting for seasonality/traffic.

6) Analysis Plan
Method: [frequentist/Bayesian], test type [t-test/prop test], segments [if any], handling of outliers/missing data, intent-to-treat vs. per-protocol.

7) Scope & Execution
Surfaces affected: [pages, app areas]
Instrumentation: [events, properties]
Data sources: [warehouse tables, logs]
QA checklist: [key checks]

8) Risks & Ethics
Risks: [user confusion, performance]
Mitigations: [kill switch, phased rollout]
Privacy & consent: [notes]

9) Dependencies & Timeline
Dependencies: [design/dev/ops]
Milestones: [ready, launch, analyze, decision]

10) Expected Impact & Next Steps
If positive: [rollout path]
If null/negative: [follow-up tests or stop]
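Teams that track briefs alongside their code sometimes encode the template as a structured record so a launch review can check completeness mechanically. This is a hypothetical sketch, not a standard; the class and field names are assumptions chosen to mirror the template sections above.

```python
from dataclasses import dataclass, field

@dataclass
class ExperimentBrief:
    """Hypothetical structured form of the brief template above."""
    title: str
    owner: str
    hypothesis: str          # falsifiable: change + audience + direction + metric + magnitude + time
    primary_metric: str
    guardrails: list = field(default_factory=list)  # metrics that must not worsen
    mde: float = 0.0         # minimum detectable effect (absolute, e.g. 0.03 for 3pp)
    alpha: float = 0.05
    power: float = 0.80
    decision_rule: str = ""

    def is_ready(self) -> bool:
        # Minimal pre-launch completeness check: every pre-commit field is filled in.
        return all([self.title, self.owner, self.hypothesis, self.primary_metric,
                    self.guardrails, self.mde > 0, self.decision_rule])
```

A brief missing its decision rule or guardrails fails `is_ready()`, which is exactly the kind of gap the checklist later in this lesson is meant to catch before launch.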
Step-by-step: write your brief
- Define the behavior: Choose one primary metric that reflects the target behavior (e.g., completed sign-ups).
- State the hypothesis: Change + Audience + Direction + Metric + Magnitude + Time.
- Pick guardrails: What must not get worse (e.g., error rate, refund rate)?
- Decide the decision rule: Thresholds and what you will do given each outcome.
- Scope and assignment: Who is eligible and how you randomize.
- Plan analysis: Methods, segments, and how to handle data quality.
- Estimate duration: Sanity-check that traffic supports your MDE.
- Address risks: Kill switch, phased rollout, ethics.
Worked examples (3)
Example 1: Address autocomplete in checkout
Hypothesis: Adding address autocomplete for logged-in users will increase the checkout completion rate by ~3 percentage points (pp) within 2 weeks because it reduces friction and typos.
Primary metric: Checkout completion rate per eligible session.
Guardrails: Payment error rate, support tickets mentioning address issues.
Decision rule: Ship if the completion rate increases by ≥3pp (p &lt; 0.05) and no guardrail worsens by >0.2pp.
Example 2: Shorter onboarding (5 steps → 3 steps)
Hypothesis: Reducing onboarding steps for new SMB accounts will increase D7 activation by 5% (relative) because fewer steps lower drop-off.
Primary metric: D7 activation rate (performed core action at least once by day 7).
Guardrails: Support contacts per new account, refund rate in first 30 days.
Decision rule: Roll out if D7 activation increases by ≥5% (relative) with no guardrail degrading by more than 10% (relative).
Example 3: New pricing page layout
Hypothesis: Reordering plan cards to highlight the recommended plan for self-serve visitors will increase paid conversion by 1pp within 3 weeks due to clearer value signals.
Primary metric: Paid conversion rate from pricing page sessions.
Guardrails: Average revenue per paying user (ARPPU), refund requests, page load time.
Decision rule: Ship if +1pp or more with stable ARPPU (±2%) and no significant performance regressions.
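Decision rules like the ones above are usually checked with a standard two-proportion z-test. Here is a stdlib-only sketch; the counts plugged in for Example 1 (20% control vs. 23% treatment, 3,000 sessions each) are made-up illustration, not real data.

```python
import math

def two_prop_ztest(conv_a, n_a, conv_b, n_b):
    """Pooled two-proportion z-test.
    Returns (absolute lift, z statistic, two-sided p-value)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF (via erf)
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return p_b - p_a, z, p_value

# Illustrative numbers for Example 1: 600/3000 control vs 690/3000 treatment
lift, z, p = two_prop_ztest(600, 3000, 690, 3000)
# lift ≈ 0.03 (3pp) and p < 0.05, so a rule "ship if ≥3pp and p < 0.05" would pass
```

In practice most teams use their experimentation platform or a stats library for this; the point of the sketch is that the decision rule maps directly onto the test's output.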
Metrics and decision rules
- Primary metric: One metric to rule the decision. Keeps focus.
- Secondary: Explain why the primary moved (diagnostics).
- Guardrails: Protect user experience and revenue; predefine thresholds.
- Decision rule: If effect ≥ MDE in desired direction and guardrails okay → ship; else don’t ship/iterate.
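Writing the decision rule down as a tiny function makes the ship/no-ship call mechanical rather than negotiated after the results arrive. A minimal sketch, with placeholder thresholds:

```python
def decide(effect, mde, p_value, guardrails, alpha=0.05):
    """Pre-committed decision rule.
    guardrails is a list of (observed_degradation, allowed_limit) pairs."""
    primary_ok = effect >= mde and p_value < alpha
    guardrails_ok = all(observed <= limit for observed, limit in guardrails)
    return "ship" if primary_ok and guardrails_ok else "do not ship / iterate"

# e.g. +3.5pp lift, p = 0.004, payment error rate up 0.1pp against a 0.2pp limit
decide(0.035, 0.03, 0.004, [(0.001, 0.002)])   # -> "ship"
```

The same inputs with a breached guardrail, e.g. an error rate up 0.3pp against the 0.2pp limit, return "do not ship / iterate" regardless of how strong the primary effect is.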
Sample size & duration (quick rule-of-thumb)
For rate metrics, a rough per-variant sample size is ≈ 16 × p × (1 − p) / MDE², where p is the baseline rate and MDE is the absolute lift (e.g., 0.03 for 3pp). This is a form of Lehr's rule of thumb, assuming ~80% power and 5% alpha. Treat it as approximate.
- Example: p = 0.20, MDE = 0.03 → 16 × 0.2 × 0.8 / 0.0009 ≈ 2844 observations per variant.
- If traffic is low, increase runtime, aggregate surfaces, or target larger MDE.
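The rule of thumb above is easy to wrap in a helper; this sketch rounds up to whole observations, so the section's example (p = 0.20, MDE = 0.03) comes out as 2,845 rather than 2,844.

```python
import math

def sample_size_per_variant(baseline_rate, mde_abs):
    """Rule-of-thumb per-variant sample size for a rate metric (~80% power,
    5% alpha). mde_abs is the absolute lift, e.g. 0.03 for 3 percentage points."""
    return math.ceil(16 * baseline_rate * (1 - baseline_rate) / mde_abs ** 2)

sample_size_per_variant(0.20, 0.03)   # -> 2845 (the example above, rounded up)
```

Dividing the total sample (all variants combined) by expected daily eligible traffic gives a quick runtime estimate to sanity-check against the brief's time window.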
Risks, ethics, and rollout
- Have a kill switch and monitoring for anomalies.
- Avoid misleading users; respect privacy and consent norms.
- Start with a small ramp (e.g., 10% → 50% → 100%) if risk is higher.
Exercises (practice inside your brief)
Complete these mini tasks directly in your notes using the template above.
Exercise 1 — Draft the core hypothesis and metrics
Scenario: We plan to add address autocomplete in checkout for logged-in users.
- Write a one-sentence falsifiable hypothesis.
- Select one primary metric and two guardrails.
- Pre-commit a decision rule.
Expected output: hypothesis, primary metric, guardrails, decision rule.
Exercise 2 — Choose guardrails and a decision rule
Scenario: Reduce onboarding steps from 5 to 3 for SMB accounts.
- Pick relevant guardrails.
- Write a decision rule that references an MDE.
Expected output: list of guardrails + decision rule sentence.
Checklist before you run
- ☐ Hypothesis is falsifiable and specific (change + audience + metric + direction + magnitude + time).
- ☐ One primary metric; guardrails with thresholds pre-set.
- ☐ Eligibility and randomization unit are clear.
- ☐ Analysis plan and decision rule are pre-committed.
- ☐ Estimated duration is feasible with available traffic.
- ☐ Instrumentation and QA steps are listed.
- ☐ Risks, ethics, and kill switch are addressed.
Common mistakes and self-check
- Vague goals: Fix by writing a single-sentence hypothesis with numbers.
- Metric shopping: Fix by pre-committing primary and guardrails.
- Underpowered tests: Fix by increasing MDE or runtime.
- Scope creep: Fix by locking surfaces and eligibility in the brief.
- Ignoring risks: Fix by adding guardrails and a rollback plan.
Self-check: Could someone else run this test correctly using only your brief? If not, clarify.
Who this is for
- Business Analysts and product team members who run or support experiments.
- Anyone moving from ad-hoc tests to repeatable, trustworthy experimentation.
Prerequisites
- Basic understanding of A/B testing concepts (control vs. treatment, significance).
- Comfort with product metrics (conversion, activation, retention).
Learning path
- Write a one-sentence hypothesis (this lesson).
- Select primary and guardrail metrics with thresholds.
- Define assignment, eligibility, and scope.
- Pre-commit analysis plan and decision rule.
- Run QA and document risks.
- Launch, analyze, decide, and document learnings.
Practical projects
- Create a brief for a small UX change (e.g., button copy) and a bigger flow change (e.g., onboarding).
- Run a retrospective: take a past test and rewrite the brief to fix gaps. Note how the decision might change.
- Build a guardrail catalog for your product (candidate metrics with typical thresholds).
Next steps
- Pick one live idea and convert it into a brief using the template.
- Share with stakeholders to pressure-test the decision rule before any build work begins.
- Schedule a short pre-launch review to lock the brief.
Mini challenge
Scenario: Your mobile app adds a prominent "Try Premium" banner on the home screen for free users. Draft 3 parts of the brief: the hypothesis sentence, one primary metric, and two guardrails. Keep it crisp and measurable.