Who this is for
You’re a Product Analyst, or aspiring to become one, who needs to measure how well users come back, compare cohorts over time, and spot retention issues fast.
Prerequisites
- Basic SQL or spreadsheet skills (filters, COUNT, pivot tables).
- Understanding of active user definitions (DAU/WAU/MAU).
- Comfort with time windows (D1/D7/W4, etc.).
Why this matters
In real product work you will:
- Build weekly acquisition cohorts and track week-over-week retention.
- Compare retention by channel, campaign, device, or user segment.
- Evaluate a new feature’s impact on D7/D30 retention.
- Diagnose churn spikes after releases and propose fixes.
Concept explained simply
Retention answers: After people start using our product, how many come back later? Cohorts group users by a shared starting event (e.g., signup week) so you can compare apples to apples across time.
Mental model
- Think of a leaky bucket: acquisition fills it; retention shows how much water stays. Fix holes before pouring more water.
- Retention curve: starts high then drops and flattens at a habit level. Your goal is a higher and earlier flattening.
- One number is not enough. Always segment (by channel, country, device, feature adoption).
Key formulas (quick reference)
Classic N-day retention (a.k.a. exact day):
Retention_N = (Users from cohort active on exact day N) / (Users in cohort at start)
Rolling N-day retention (active at least once within the N-day window):
RollingRetention_N = (Users from cohort active on any of days 1 through N) / (Cohort size)
Churn:
Churn_N = 1 - Retention_N
Week-based retention (exact week):
Retention_WeekK = (Users active in week K after start) / (Cohort size)
New vs reactivated users in denominator:
Use only users who entered the cohort (e.g., first signup week). Reactivations later do not change cohort size.
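These formulas can be sketched in Python. This is an illustrative toy, not production code: the per-user sets of active day offsets are made up, and `classic_retention`, `rolling_retention`, and `churn` are hypothetical helper names.

```python
def classic_retention(cohort, n):
    """Share of cohort active on exactly day n.
    cohort: dict of user_id -> set of day offsets (0 = start day) when active."""
    if not cohort:
        return 0.0
    returned = sum(1 for days in cohort.values() if n in days)
    return returned / len(cohort)

def rolling_retention(cohort, n):
    """Share of cohort active at least once within the n-day window (days 1..n)."""
    if not cohort:
        return 0.0
    returned = sum(1 for days in cohort.values() if any(1 <= d <= n for d in days))
    return returned / len(cohort)

def churn(cohort, n):
    """Churn_N = 1 - Retention_N (classic)."""
    return 1 - classic_retention(cohort, n)

# Tiny made-up cohort: day offsets on which each user was active
cohort = {
    "u1": {1, 7},   # came back on day 1 and day 7
    "u2": {3},      # came back only on day 3
    "u3": {7},      # came back only on day 7
    "u4": set(),    # never came back
}
print(classic_retention(cohort, 7))  # 0.5  (u1, u3)
print(rolling_retention(cohort, 7))  # 0.75 (u1, u2, u3)
print(churn(cohort, 7))              # 0.5
```

Note that the rolling figure can never be below the classic one, since every user active on exactly day N is also active within the window.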
Setting up a cohort analysis
- Pick a start event: most commonly first signup/first purchase/first app open.
- Choose a cohort period: weekly (most common) or monthly if volume is low.
- Define active: e.g., any session, key action, or purchase within the period.
- Choose retention type: classic (exact) or rolling; be consistent across reports.
- Build the cohort table: rows = cohorts, columns = periods after start (D1, D7, D30 or W1, W2...).
- Segment and compare: channel, country, device, feature adoption, plan.
Simple SQL skeleton (illustrative)
-- Assumptions: events(user_id, event_date, is_active_event), users(user_id, signup_date)
WITH cohorts AS (
SELECT user_id,
DATE_TRUNC('week', signup_date) AS cohort_week
FROM users
), activity AS (
SELECT e.user_id,
DATE_TRUNC('week', u.signup_date) AS cohort_week,
DATE_TRUNC('week', e.event_date) AS activity_week,
(DATE_TRUNC('week', e.event_date)::date - DATE_TRUNC('week', u.signup_date)::date) / 7 AS weeks_since -- date difference avoids week-number wrap-around at year boundaries
FROM events e
JOIN users u USING (user_id)
WHERE e.is_active_event = true
)
SELECT c.cohort_week,
COUNT(DISTINCT c.user_id) AS cohort_size,
COUNT(DISTINCT CASE WHEN weeks_since = 0 THEN a.user_id END) AS w0_active,
COUNT(DISTINCT CASE WHEN weeks_since = 1 THEN a.user_id END) AS w1_active,
COUNT(DISTINCT CASE WHEN weeks_since = 2 THEN a.user_id END) AS w2_active
FROM cohorts c
LEFT JOIN activity a ON c.user_id = a.user_id AND c.cohort_week = a.cohort_week
GROUP BY 1
ORDER BY 1;
Convert counts to percentages by dividing by cohort_size.
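That last step is a one-liner per row. A minimal Python sketch with made-up counts (`rows` and `to_percentages` are hypothetical names):

```python
# Hypothetical cohort matrix: cohort label -> (cohort_size, W0, W1, W2 active counts)
rows = {
    "2024-W10": (1000, 1000, 250, 180),
    "2024-W11": (800, 800, 224, 152),
}

def to_percentages(size, actives):
    """Divide each active-user count by the fixed cohort size, as a percentage."""
    return [round(100 * a / size, 1) for a in actives]

for cohort, (size, *actives) in rows.items():
    print(cohort, to_percentages(size, actives))
# 2024-W10 [100.0, 25.0, 18.0]
# 2024-W11 [100.0, 28.0, 19.0]
```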
Worked examples
Example 1: W1 (classic) retention
Cohort size = 1,000 users who signed up in Week 10. Users active in Week 11 (exactly one week after start) = 250.
- W1 retention = 250 / 1,000 = 25%.
- Churn by W1 = 1 - 0.25 = 0.75, i.e., 75%.
Example 2: Classic vs rolling
Same cohort of 1,000 users. Users active on day 7 exactly = 180. Users who were inactive on day 7 but active on days 1–7 at least once = additional 140.
- Classic D7 = 180 / 1,000 = 18%.
- Rolling D7 = (180 + 140) / 1,000 = 32%.
- Insight: rolling is always ≥ classic. Rolling suits products that can deliver value on any day within the window; classic suits products with a fixed usage cadence (e.g., a weekly ritual).
Example 3: Channel cohort comparison
Two weekly cohorts, same size (500 each):
- Channel A: W1=30%, W4=18%.
- Channel B: W1=24%, W4=19%.
Interpretation: A onboards better (higher early retention), while B sustains slightly better by W4. Consider strengthening A’s long-term habit loops or optimizing B’s onboarding.
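The same comparison in a small sketch, using the numbers from the example (the dict layout and variable names are made up):

```python
# Retention per channel from the example: W1 = early, W4 = later retention
channels = {
    "A": {"w1": 0.30, "w4": 0.18},
    "B": {"w1": 0.24, "w4": 0.19},
}

# Pick the winner on each horizon
best_early = max(channels, key=lambda ch: channels[ch]["w1"])
best_late = max(channels, key=lambda ch: channels[ch]["w4"])
print(best_early, best_late)  # A B
```

The same pattern scales to many channels, which is exactly the ranking task in the SQL project below.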
Interpreting retention curves
- Sharp early drop then flat: onboarding/activation issue; later experience is stable.
- Gradual continuous decline: weak habit formation; need recurring value/notifications.
- Seasonality wiggles: compare same weekday/week of year; use 4-week rolling averages.
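The 4-week rolling average mentioned in the last bullet can be sketched as follows (the retention series is made up; `trailing_avg` is a hypothetical helper):

```python
def trailing_avg(series, window=4):
    """Average each point with up to window-1 preceding points to smooth noise."""
    out = []
    for i in range(len(series)):
        chunk = series[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

# Weekly W1 retention with seasonal wiggles (illustrative numbers)
w1_by_week = [0.30, 0.28, 0.33, 0.27, 0.31, 0.25, 0.29]
print([round(x, 3) for x in trailing_avg(w1_by_week)])
```

Early points average over fewer weeks (a partial window), so treat the first `window - 1` values as warm-up.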
Self-check checklist
- I defined a clear cohort start event and period.
- I picked classic vs rolling and stated it on the chart.
- My denominator is fixed (cohort size) and not changing over time.
- I segmented by at least one dimension (channel/country/device/feature).
- I looked at both early and long-term retention (e.g., D1, D7, D30 or W1, W4, W8).
Common mistakes and how to avoid them
- Mixing classic and rolling on the same chart. Fix: label clearly and use one method.
- Changing denominators over time. Fix: cohort size is fixed; do not add reactivations to the denominator.
- Using DAU as “active” for a weekly cohort without week aggregation. Fix: align activity window to cohort window.
- Ignoring timezone and event lag. Fix: standardize timestamps and define day/week cutoffs.
- Averaging percentages across cohorts without weighting by cohort size. Fix: compute weighted averages.
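The weighting point is easy to see in a small sketch (cohort sizes and rates are made up; `unweighted_avg` and `weighted_avg` are hypothetical helpers):

```python
# Two made-up cohorts of very different sizes
cohorts = [
    {"size": 2000, "w1": 0.30},
    {"size": 200,  "w1": 0.10},
]

def unweighted_avg(cohorts):
    """Naive mean of per-cohort rates -- misleading when cohort sizes differ."""
    return sum(c["w1"] for c in cohorts) / len(cohorts)

def weighted_avg(cohorts):
    """Cohort-size-weighted mean: equivalent to pooling all users together."""
    total = sum(c["size"] for c in cohorts)
    return sum(c["size"] * c["w1"] for c in cohorts) / total

print(round(unweighted_avg(cohorts), 3))  # 0.2
print(round(weighted_avg(cohorts), 3))    # 0.282
```

The naive average drags the headline number toward the tiny cohort; the weighted average reflects what actually happened across all 2,200 users.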
Practical projects
- Spreadsheet: simulate 5 weekly cohorts of 1,000 users with declining activity probabilities; build a cohort matrix (W0–W8) and plot curves.
- SQL: create acquisition cohorts by channel; report W1 and W4 retention and rank channels by delta.
- Feature adoption cohort: cohort = week of first using Feature X; compare retention against non-adopters.
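The first project can be prototyped in a few lines of Python before moving to a spreadsheet (all probabilities and names here are made up for illustration):

```python
import random

def simulate_cohorts(n_cohorts=5, size=1000, p_return=None, seed=42):
    """Return a retention matrix (%) with one row per cohort, columns W0..Wk."""
    if p_return is None:
        # Hypothetical declining per-week return probabilities (W0 = 100% by definition)
        p_return = [1.00, 0.35, 0.28, 0.24, 0.21, 0.19, 0.17, 0.16, 0.15]
    rng = random.Random(seed)
    matrix = []
    for _ in range(n_cohorts):
        row = []
        for p in p_return:
            # Each user independently returns this week with probability p
            active = sum(1 for _ in range(size) if rng.random() < p)
            row.append(round(100 * active / size, 1))
        matrix.append(row)
    return matrix

for i, row in enumerate(simulate_cohorts(), start=1):
    print(f"Cohort {i}: {row}")
```

Plotting each row as a line gives the retention curves described above; tweak `p_return` to see a curve that flattens higher or earlier.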
Exercises
These mirror the tasks in the Exercises panel below. Do them here first, then record your answers in the exercises section if you want to track progress. Note: Everyone can access exercises and the quick test; only logged-in users have their progress saved.
Exercise 1: Compute D1, D7, D30 (classic)
Dataset (Week 12 cohort size = 2,000):
- Active on Day 1 exactly: 1,100 users
- Active on Day 7 exactly: 520 users
- Active on Day 30 exactly: 260 users
Task: Calculate D1, D7, D30 retention and churn values.
Hint
Retention_N = returning users on exact day N / 2,000. Churn_N = 1 - Retention_N.
Exercise 2: Build a weekly cohort table
You have three cohorts of equal size (1,000 each). Active users per week are:
- Cohort A (Week 20): W1=350, W2=260, W4=200
- Cohort B (Week 21): W1=300, W2=240, W4=210
- Cohort C (Week 22): W1=380, W2=270, W4=195
Task: Compute W1, W2, W4 retention (%) for each cohort and identify which has best early and best later retention.
Hint
Divide by cohort size; compare W1 for early, W4 for later retention.
Learning path
- Before: Activation metrics, DAU/WAU/MAU definitions.
- Now: Retention metrics and cohorts (this lesson).
- Next: Habit loops, engagement depth metrics, reactivation analysis, and lifecycle segmentation.
Next steps
- Automate a weekly cohort report and share a one-line insight each week.
- Create a retention dashboard with trend lines and segment filters.
- Run an experiment; measure impact on W1 and W4 retention.
Mini challenge
Your D1 is 55% but D7 is 15% and D30 is 8%. Name one onboarding change and one habit-forming change you’d test to lift D7 without hurting D1. Write a 3-bullet plan with success metrics (W1/W4 retention and a key action frequency).
Quick Test info
There is a short test for this subskill. Anyone can take it. Only logged-in users will have results saved to their learning progress.