luvv to helpDiscover the Best Free Online Tools
Topic 5 of 8

Retention And Cohort Queries

Learn Retention And Cohort Queries for free with explanations, exercises, and a quick test (for Product Analyst).

Published: December 22, 2025 | Updated: December 22, 2025

Why this matters

As a Product Analyst, you will routinely answer questions like: Do users who sign up in June come back a week later? How does week-4 retention look after a new onboarding flow? Which cohorts are churning fastest? Retention and cohort queries let you quantify customer stickiness, compare launches, and prioritize product bets.

  • Monitor Day-1 and Day-7 retention after a feature release.
  • Compare monthly cohorts to detect seasonality or channel quality.
  • Spot churn early and flag segments for lifecycle campaigns.

Concept explained simply

A cohort is a group of users who share a starting event and time bucket (for example, users who signed up in the same week). Retention measures how many of those users return (perform any qualifying activity) at a later time bucket.

Mental model

  • Pick the start: usually user signup (or first purchase).
  • Bucket into cohorts: by day/week/month (using DATE_TRUNC).
  • Define a return window: exact day (D7), week window (W4), or rolling window (any activity in days 1–30).
  • Count unique users who return in that window and divide by the cohort size.

Data model used in examples

We assume two tables (rename columns to your schema as needed):

users(user_id, signup_at TIMESTAMP)
events(user_id, event_at TIMESTAMP, event_name TEXT)

"Activity" usually means any meaningful product event (session, open_app, view_item, purchase). Adjust filters to match your product.

Worked examples

Example 1 — Monthly cohorts, Day-7 retention

Question: For each signup month, what percent of users return exactly 7 days after signup?

WITH base AS (
  SELECT 
    u.user_id,
    (DATE_TRUNC('month', u.signup_at))::date AS cohort_month,
    u.signup_at::date AS signup_date
  FROM users u
),
acts AS (
  SELECT e.user_id, e.event_at::date AS activity_date
  FROM events e
  -- optionally filter: WHERE e.event_name IN ('open_app','session','purchase')
)
SELECT 
  b.cohort_month,
  COUNT(DISTINCT b.user_id) AS cohort_size,
  COUNT(DISTINCT CASE WHEN (a.activity_date - b.signup_date) = 7 THEN b.user_id END) AS retained_d7,
  ROUND(100.0 * COUNT(DISTINCT CASE WHEN (a.activity_date - b.signup_date) = 7 THEN b.user_id END)
        / NULLIF(COUNT(DISTINCT b.user_id), 0), 1) AS d7_retention_pct
FROM base b
LEFT JOIN acts a ON a.user_id = b.user_id
GROUP BY 1
ORDER BY 1;

Notes: In some SQL dialects use DATEDIFF(day, b.signup_date, a.activity_date) = 7.

Example 2 — Weekly cohorts, Week-4 retention (days 21–27)

Question: What share of each weekly cohort is active in week 4 after signup (days 21–27 inclusive)?

WITH base AS (
  SELECT 
    u.user_id,
    DATE_TRUNC('week', u.signup_at)::date AS cohort_week,
    u.signup_at::date AS signup_date
  FROM users u
),
acts AS (
  SELECT e.user_id, e.event_at::date AS activity_date
  FROM events e
)
SELECT 
  b.cohort_week,
  COUNT(DISTINCT b.user_id) AS cohort_size,
  COUNT(DISTINCT CASE 
    WHEN (a.activity_date - b.signup_date) BETWEEN 21 AND 27 THEN b.user_id END
  ) AS retained_w4,
  ROUND(100.0 * COUNT(DISTINCT CASE WHEN (a.activity_date - b.signup_date) BETWEEN 21 AND 27 THEN b.user_id END)
        / NULLIF(COUNT(DISTINCT b.user_id), 0), 1) AS w4_retention_pct
FROM base b
LEFT JOIN acts a ON a.user_id = b.user_id
GROUP BY 1
ORDER BY 1;
Example 3 — Rolling 30-day retention (any activity days 1–30)

Question: For each signup month, what percent of users return at least once in the first 30 days?

WITH base AS (
  SELECT 
    u.user_id,
    DATE_TRUNC('month', u.signup_at)::date AS cohort_month,
    u.signup_at::date AS signup_date
  FROM users u
),
acts AS (
  SELECT e.user_id, e.event_at::date AS activity_date
  FROM events e
)
SELECT 
  b.cohort_month,
  COUNT(DISTINCT b.user_id) AS cohort_size,
  COUNT(DISTINCT CASE 
    WHEN (a.activity_date - b.signup_date) BETWEEN 1 AND 30 THEN b.user_id END
  ) AS retained_30d,
  ROUND(100.0 * COUNT(DISTINCT CASE WHEN (a.activity_date - b.signup_date) BETWEEN 1 AND 30 THEN b.user_id END)
        / NULLIF(COUNT(DISTINCT b.user_id), 0), 1) AS d30_rolling_retention_pct
FROM base b
LEFT JOIN acts a ON a.user_id = b.user_id
GROUP BY 1
ORDER BY 1;

Query patterns and templates

  • Cohort key: DATE_TRUNC('week'|'month', signup_at)
  • Day index: activity_date::date - signup_date::date (use DATEDIFF in some dialects)
  • Avoid double counting: COUNT(DISTINCT user_id) in both numerator and denominator
  • Filter meaningful events only to reduce noise
Template — daily grid (cohort x day)
WITH base AS (
  SELECT u.user_id,
         DATE_TRUNC('month', u.signup_at)::date AS cohort_month,
         u.signup_at::date AS signup_date
  FROM users u
), acts AS (
  SELECT e.user_id, e.event_at::date AS activity_date
  FROM events e
), joined AS (
  SELECT b.cohort_month, b.user_id,
         (a.activity_date - b.signup_date) AS day_n
  FROM base b
  JOIN acts a ON a.user_id = b.user_id
  WHERE a.activity_date >= b.signup_date
)
SELECT cohort_month,
       day_n,
       COUNT(DISTINCT user_id) AS active_users
FROM joined
WHERE day_n BETWEEN 0 AND 30
GROUP BY 1,2
ORDER BY 1,2;

Use this to build heatmaps (retention curves) in your BI tool.

Exercises

Try these in your environment. The quick test is at the end of the page. Test is available to everyone; only logged-in users get saved progress.

Exercise 1 — Day-7 retention by signup month

Compute D7 retention for each signup month using users and events. Treat any event as activity. Output: cohort_month, cohort_size, retained_d7, d7_retention_pct.

Show solution
WITH base AS (
  SELECT u.user_id,
         DATE_TRUNC('month', u.signup_at)::date AS cohort_month,
         u.signup_at::date AS signup_date
  FROM users u
), acts AS (
  SELECT e.user_id, e.event_at::date AS activity_date
  FROM events e
)
SELECT b.cohort_month,
       COUNT(DISTINCT b.user_id) AS cohort_size,
       COUNT(DISTINCT CASE WHEN (a.activity_date - b.signup_date) = 7 THEN b.user_id END) AS retained_d7,
       ROUND(100.0 * COUNT(DISTINCT CASE WHEN (a.activity_date - b.signup_date) = 7 THEN b.user_id END)
             / NULLIF(COUNT(DISTINCT b.user_id), 0), 1) AS d7_retention_pct
FROM base b
LEFT JOIN acts a ON a.user_id = b.user_id
GROUP BY 1
ORDER BY 1;

Expected shape (example):

cohort_month | cohort_size | retained_d7 | d7_retention_pct
2024-06-01   | 1200        | 420         | 35.0
2024-07-01   | 950         | 320         | 33.7
2024-08-01   | 1100        | 385         | 35.0
  • Checklist: used DATE_TRUNC for cohorts
  • Used COUNT(DISTINCT) for users
  • Compared day index to 7 exactly
  • Handled divide-by-zero with NULLIF

Exercise 2 — Rolling 14-day retention

For each signup week, compute the percent of users who return at least once in the first 14 days (days 1–14). Output: cohort_week, cohort_size, retained_14d, d14_rolling_retention_pct.

Show solution
WITH base AS (
  SELECT u.user_id,
         DATE_TRUNC('week', u.signup_at)::date AS cohort_week,
         u.signup_at::date AS signup_date
  FROM users u
), acts AS (
  SELECT e.user_id, e.event_at::date AS activity_date
  FROM events e
)
SELECT b.cohort_week,
       COUNT(DISTINCT b.user_id) AS cohort_size,
       COUNT(DISTINCT CASE WHEN (a.activity_date - b.signup_date) BETWEEN 1 AND 14 THEN b.user_id END) AS retained_14d,
       ROUND(100.0 * COUNT(DISTINCT CASE WHEN (a.activity_date - b.signup_date) BETWEEN 1 AND 14 THEN b.user_id END)
             / NULLIF(COUNT(DISTINCT b.user_id), 0), 1) AS d14_rolling_retention_pct
FROM base b
LEFT JOIN acts a ON a.user_id = b.user_id
GROUP BY 1
ORDER BY 1;
  • Checklist: cohort by week
  • Rolling window 1–14 days
  • Distinct users in numerator and denominator

Common mistakes

  • Counting events, not users. Fix: COUNT(DISTINCT user_id) for retention.
  • Using the wrong window. Be explicit: exact day (== 7) vs rolling window (BETWEEN 1 AND 7).
  • Timezone drift. Align timestamps to a single timezone and cast to ::date after.
  • Leaky cohorts. Ensure each user has one cohort (based on first signup). If multiple signups exist, use MIN(signup_at) per user.
  • Including pre-signup events. Filter activity_date >= signup_date.
Self-check tips
  • Spot-check a few users manually to verify day indices.
  • For fresh cohorts, retention on late days should be near zero (insufficient time). If not, windows are wrong.
  • Cohort size should equal DISTINCT users starting in that period, regardless of events.

Practical projects

  • Build a monthly cohort heatmap (0–60 days) and annotate feature launch dates.
  • Split D7 retention by acquisition channel (add channel to users) and compare.
  • Create a dashboard: D1, D7, D30 rolling retention for total and by platform (web/mobile).

Who this is for

  • Product Analysts who need to quantify user stickiness and compare cohorts.
  • Data/BI Analysts supporting growth, lifecycle, or product teams.

Prerequisites

  • Comfort with SQL SELECT, GROUP BY, JOIN.
  • Familiarity with date functions (DATE_TRUNC, casting to date, DATEDIFF equivalents).
  • Basic understanding of user-event schemas.

Learning path

  • Before this: Event schema basics, window and date functions.
  • This lesson: Cohorts, fixed-day and rolling retention.
  • Next: Segment-level retention (channel, geo), survival curves, and funnel-to-retention analyses.

Next steps

  • Parameterize your retention window (1, 7, 30) for easy comparisons.
  • Add segments (platform, plan, country) and watch how cohorts diverge.
  • Publish a weekly retention report with a short narrative of changes.

Mini challenge

Find the first cohort where D7 retention drops by more than 5 percentage points versus the previous cohort. List the cohort pair and the top 3 segments contributing to the drop.

Quick Test

Take the quick test below to check your understanding. Anyone can take it; only logged-in users will have progress saved.

Practice Exercises

2 exercises to complete

Instructions

Compute D7 retention for each signup month using users and events. Treat any event as activity. Output: cohort_month, cohort_size, retained_d7, d7_retention_pct.
Expected Output
cohort_month | cohort_size | retained_d7 | d7_retention_pct 2024-06-01 | 1200 | 420 | 35.0 2024-07-01 | 950 | 320 | 33.7 2024-08-01 | 1100 | 385 | 35.0

Retention And Cohort Queries — Quick Test

Test your knowledge with 8 questions. Pass with 70% or higher.

8 questions70% to pass

Have questions about Retention And Cohort Queries?

AI Assistant

Ask questions about this tool