How to learn Retention And Cohort Queries for SQL in Product Analyst for free

Why this matters

As a Product Analyst, you will routinely answer questions like: Do users who sign up in June come back a week later? How does week-4 retention look after a new onboarding flow? Which cohorts are churning fastest? Retention and cohort queries let you quantify customer stickiness, compare launches, and prioritize product bets.

Monitor Day-1 and Day-7 retention after a feature release.
Compare monthly cohorts to detect seasonality or channel quality.
Spot churn early and flag segments for lifecycle campaigns.

Concept explained simply

A cohort is a group of users who share a starting event and time bucket (for example, users who signed up in the same week). Retention measures how many of those users return (perform any qualifying activity) at a later time bucket.

Mental model

Pick the start: usually user signup (or first purchase).
Bucket into cohorts: by day/week/month (using DATE_TRUNC).
Define a return window: exact day (D7), week window (W4), or rolling window (any activity in days 1–30).
Count unique users who return in that window and divide by the cohort size.

Data model used in examples

We assume two tables (rename columns to your schema as needed):

users(user_id, signup_at TIMESTAMP)
events(user_id, event_at TIMESTAMP, event_name TEXT)

"Activity" usually means any meaningful product event (session, open_app, view_item, purchase). Adjust filters to match your product.

Worked examples

Example 1 — Monthly cohorts, Day-7 retention

Question: For each signup month, what percent of users return exactly 7 days after signup?

WITH base AS (
  SELECT 
    u.user_id,
    (DATE_TRUNC('month', u.signup_at))::date AS cohort_month,
    u.signup_at::date AS signup_date
  FROM users u
),
acts AS (
  SELECT e.user_id, e.event_at::date AS activity_date
  FROM events e
  -- optionally filter: WHERE e.event_name IN ('open_app','session','purchase')
)
SELECT 
  b.cohort_month,
  COUNT(DISTINCT b.user_id) AS cohort_size,
  COUNT(DISTINCT CASE WHEN (a.activity_date - b.signup_date) = 7 THEN b.user_id END) AS retained_d7,
  ROUND(100.0 * COUNT(DISTINCT CASE WHEN (a.activity_date - b.signup_date) = 7 THEN b.user_id END)
        / NULLIF(COUNT(DISTINCT b.user_id), 0), 1) AS d7_retention_pct
FROM base b
LEFT JOIN acts a ON a.user_id = b.user_id
GROUP BY 1
ORDER BY 1;

Notes: In some SQL dialects use DATEDIFF(day, b.signup_date, a.activity_date) = 7.

Example 2 — Weekly cohorts, Week-4 retention (days 21–27)

Question: What share of each weekly cohort is active in week 4 after signup (days 21–27 inclusive)?

WITH base AS (
  SELECT 
    u.user_id,
    DATE_TRUNC('week', u.signup_at)::date AS cohort_week,
    u.signup_at::date AS signup_date
  FROM users u
),
acts AS (
  SELECT e.user_id, e.event_at::date AS activity_date
  FROM events e
)
SELECT 
  b.cohort_week,
  COUNT(DISTINCT b.user_id) AS cohort_size,
  COUNT(DISTINCT CASE 
    WHEN (a.activity_date - b.signup_date) BETWEEN 21 AND 27 THEN b.user_id END
  ) AS retained_w4,
  ROUND(100.0 * COUNT(DISTINCT CASE WHEN (a.activity_date - b.signup_date) BETWEEN 21 AND 27 THEN b.user_id END)
        / NULLIF(COUNT(DISTINCT b.user_id), 0), 1) AS w4_retention_pct
FROM base b
LEFT JOIN acts a ON a.user_id = b.user_id
GROUP BY 1
ORDER BY 1;

Example 3 — Rolling 30-day retention (any activity days 1–30)

Question: For each signup month, what percent of users return at least once in the first 30 days?

WITH base AS (
  SELECT 
    u.user_id,
    DATE_TRUNC('month', u.signup_at)::date AS cohort_month,
    u.signup_at::date AS signup_date
  FROM users u
),
acts AS (
  SELECT e.user_id, e.event_at::date AS activity_date
  FROM events e
)
SELECT 
  b.cohort_month,
  COUNT(DISTINCT b.user_id) AS cohort_size,
  COUNT(DISTINCT CASE 
    WHEN (a.activity_date - b.signup_date) BETWEEN 1 AND 30 THEN b.user_id END
  ) AS retained_30d,
  ROUND(100.0 * COUNT(DISTINCT CASE WHEN (a.activity_date - b.signup_date) BETWEEN 1 AND 30 THEN b.user_id END)
        / NULLIF(COUNT(DISTINCT b.user_id), 0), 1) AS d30_rolling_retention_pct
FROM base b
LEFT JOIN acts a ON a.user_id = b.user_id
GROUP BY 1
ORDER BY 1;

Query patterns and templates

Cohort key: DATE_TRUNC('week'|'month', signup_at)
Day index: activity_date::date - signup_date::date (use DATEDIFF in some dialects)
Avoid double counting: COUNT(DISTINCT user_id) in both numerator and denominator
Filter meaningful events only to reduce noise

Template — daily grid (cohort x day)

WITH base AS (
  SELECT u.user_id,
         DATE_TRUNC('month', u.signup_at)::date AS cohort_month,
         u.signup_at::date AS signup_date
  FROM users u
), acts AS (
  SELECT e.user_id, e.event_at::date AS activity_date
  FROM events e
), joined AS (
  SELECT b.cohort_month, b.user_id,
         (a.activity_date - b.signup_date) AS day_n
  FROM base b
  JOIN acts a ON a.user_id = b.user_id
  WHERE a.activity_date >= b.signup_date
)
SELECT cohort_month,
       day_n,
       COUNT(DISTINCT user_id) AS active_users
FROM joined
WHERE day_n BETWEEN 0 AND 30
GROUP BY 1,2
ORDER BY 1,2;

Use this to build heatmaps (retention curves) in your BI tool.

Exercises

Try these in your environment. The quick test is at the end of the page. Test is available to everyone; only logged-in users get saved progress.

Compute D7 retention for each signup month using users and events. Treat any event as activity. Output: cohort_month, cohort_size, retained_d7, d7_retention_pct.

Show solution

WITH base AS (
  SELECT u.user_id,
         DATE_TRUNC('month', u.signup_at)::date AS cohort_month,
         u.signup_at::date AS signup_date
  FROM users u
), acts AS (
  SELECT e.user_id, e.event_at::date AS activity_date
  FROM events e
)
SELECT b.cohort_month,
       COUNT(DISTINCT b.user_id) AS cohort_size,
       COUNT(DISTINCT CASE WHEN (a.activity_date - b.signup_date) = 7 THEN b.user_id END) AS retained_d7,
       ROUND(100.0 * COUNT(DISTINCT CASE WHEN (a.activity_date - b.signup_date) = 7 THEN b.user_id END)
             / NULLIF(COUNT(DISTINCT b.user_id), 0), 1) AS d7_retention_pct
FROM base b
LEFT JOIN acts a ON a.user_id = b.user_id
GROUP BY 1
ORDER BY 1;

Expected shape (example):

cohort_month | cohort_size | retained_d7 | d7_retention_pct
2024-06-01   | 1200        | 420         | 35.0
2024-07-01   | 950         | 320         | 33.7
2024-08-01   | 1100        | 385         | 35.0

Checklist: used DATE_TRUNC for cohorts
Used COUNT(DISTINCT) for users
Compared day index to 7 exactly
Handled divide-by-zero with NULLIF

Exercise 2 — Rolling 14-day retention

For each signup week, compute the percent of users who return at least once in the first 14 days (days 1–14). Output: cohort_week, cohort_size, retained_14d, d14_rolling_retention_pct.

Show solution

WITH base AS (
  SELECT u.user_id,
         DATE_TRUNC('week', u.signup_at)::date AS cohort_week,
         u.signup_at::date AS signup_date
  FROM users u
), acts AS (
  SELECT e.user_id, e.event_at::date AS activity_date
  FROM events e
)
SELECT b.cohort_week,
       COUNT(DISTINCT b.user_id) AS cohort_size,
       COUNT(DISTINCT CASE WHEN (a.activity_date - b.signup_date) BETWEEN 1 AND 14 THEN b.user_id END) AS retained_14d,
       ROUND(100.0 * COUNT(DISTINCT CASE WHEN (a.activity_date - b.signup_date) BETWEEN 1 AND 14 THEN b.user_id END)
             / NULLIF(COUNT(DISTINCT b.user_id), 0), 1) AS d14_rolling_retention_pct
FROM base b
LEFT JOIN acts a ON a.user_id = b.user_id
GROUP BY 1
ORDER BY 1;

Checklist: cohort by week
Rolling window 1–14 days
Distinct users in numerator and denominator

Common mistakes

Counting events, not users. Fix: COUNT(DISTINCT user_id) for retention.
Using the wrong window. Be explicit: exact day (== 7) vs rolling window (BETWEEN 1 AND 7).
Timezone drift. Align timestamps to a single timezone and cast to ::date after.
Leaky cohorts. Ensure each user has one cohort (based on first signup). If multiple signups exist, use MIN(signup_at) per user.
Including pre-signup events. Filter activity_date >= signup_date.

Self-check tips

Spot-check a few users manually to verify day indices.
For fresh cohorts, retention on late days should be near zero (insufficient time). If not, windows are wrong.
Cohort size should equal DISTINCT users starting in that period, regardless of events.

Practical projects

Build a monthly cohort heatmap (0–60 days) and annotate feature launch dates.
Split D7 retention by acquisition channel (add channel to users) and compare.
Create a dashboard: D1, D7, D30 rolling retention for total and by platform (web/mobile).

Who this is for

Product Analysts who need to quantify user stickiness and compare cohorts.
Data/BI Analysts supporting growth, lifecycle, or product teams.

Prerequisites

Comfort with SQL SELECT, GROUP BY, JOIN.
Familiarity with date functions (DATE_TRUNC, casting to date, DATEDIFF equivalents).
Basic understanding of user-event schemas.

Learning path

Before this: Event schema basics, window and date functions.
This lesson: Cohorts, fixed-day and rolling retention.
Next: Segment-level retention (channel, geo), survival curves, and funnel-to-retention analyses.

Next steps

Parameterize your retention window (1, 7, 30) for easy comparisons.
Add segments (platform, plan, country) and watch how cohorts diverge.
Publish a weekly retention report with a short narrative of changes.

Mini challenge

Find the first cohort where D7 retention drops by more than 5 percentage points versus the previous cohort. List the cohort pair and the top 3 segments contributing to the drop.

Quick Test

Take the quick test below to check your understanding. Anyone can take it; only logged-in users will have progress saved.

Menu

Retention And Cohort Queries

Table of Contents

Why this matters

Concept explained simply

Mental model

Data model used in examples

Worked examples

Query patterns and templates

Exercises

Exercise 2 — Rolling 14-day retention

Common mistakes

Practical projects

Who this is for

Prerequisites

Learning path

Next steps

Mini challenge

Quick Test

Practice Exercises

Day-7 retention by signup month

Instructions

Expected Output

Rolling 14-day retention by signup week

Retention And Cohort Queries — Quick Test

Have questions about Retention And Cohort Queries?

AI Assistant

Menu

Retention And Cohort Queries

Table of Contents

Why this matters

Concept explained simply

Mental model

Data model used in examples

Worked examples

Query patterns and templates

Exercises

Exercise 1 — Day-7 retention by signup month

Exercise 2 — Rolling 14-day retention

Common mistakes

Practical projects

Who this is for

Prerequisites

Learning path

Next steps

Mini challenge

Quick Test

Practice Exercises

Day-7 retention by signup month

Instructions

Expected Output

Rolling 14-day retention by signup week

Retention And Cohort Queries — Quick Test

Have questions about Retention And Cohort Queries?

AI Assistant