Note: The quick test is available to everyone; only logged-in users will have their progress saved.
Why this matters
Segmentation by product dimensions helps you answer questions like:
- Which app version drives the most active users?
- Do signups differ by plan tier and device type?
- Which product categories generate the most revenue?
- Does adopting a key feature improve retention?
These are routine tasks for Product Analysts to inform feature prioritization, pricing, go-to-market, and quality improvements.
Concept explained simply
Segmentation means grouping metrics by product attributes (dimensions) such as platform, app_version, plan_tier, product_category, price_tier, or feature_adoption. In SQL, you typically:
- Select the metric (e.g., users, sessions, revenue).
- Choose dimensions to group by (e.g., platform, category).
- Filter to the correct time window and event types.
- Aggregate and format the result (counts, sums, rates).
Mental model
- Facts: events, orders, sessions (things that happen).
- Dimensions: attributes describing facts (app_version, category).
- Join facts to dimensions → group and aggregate → segment insights.
Assumed schema for examples (adjust to your warehouse)
- users(user_id, signup_at, device_type, signup_plan)
- events(user_id, event_name, occurred_at, platform, app_version, product_id, feature_key)
- products(product_id, category, brand, price_tier)
- orders(order_id, user_id, product_id, order_at, revenue)
Time columns are timestamps; revenue is numeric; platform might be 'iOS'/'Android'/'Web'.
Worked examples
1) DAU by platform and app_version (last 14 days)
-- Daily Active Users by platform and app_version
WITH base AS (
SELECT
date_trunc('day', occurred_at) AS d,
platform,
COALESCE(app_version, 'unknown') AS app_version,
user_id
FROM events
WHERE occurred_at >= current_date - interval '14 days'
AND event_name IN ('session_start','app_open')
)
SELECT
d,
platform,
app_version,
COUNT(DISTINCT user_id) AS dau
FROM base
GROUP BY 1,2,3
ORDER BY d, platform, app_version;
- Tip: COALESCE null app versions to "unknown" to avoid losing rows.
- Use DISTINCT users to avoid multiple events per user in a day.
2) Signups by plan tier and device (last 7 days)
-- Signups by signup_plan and device_type
SELECT
u.signup_plan,
u.device_type,
COUNT(*) AS signups_last_7d
FROM users u
WHERE u.signup_at >= current_date - interval '7 days'
GROUP BY 1,2
ORDER BY signups_last_7d DESC;
- Group by the attributes you want to compare (plan Ă— device).
- If you need zero-rows for missing combos, pre-build a dimension grid and LEFT JOIN.
3) Revenue by product_category and price_tier (last 30 days)
-- Gross revenue segmented by category and price_tier
SELECT
p.category,
p.price_tier,
SUM(o.revenue) AS revenue_30d
FROM orders o
JOIN products p ON p.product_id = o.product_id
WHERE o.order_at >= current_date - interval '30 days'
GROUP BY 1,2
ORDER BY revenue_30d DESC;
- Join orders (fact) to products (dimension) to expose attributes.
- Aggregate after the join to avoid duplicate rows.
4) D7 retention segmented by feature adoption
-- Users who adopted a feature within 3 days of signup vs their D7 return
WITH cohort AS (
SELECT u.user_id, date_trunc('day', u.signup_at) AS signup_day
FROM users u
WHERE u.signup_at >= current_date - interval '30 days'
),
first3d_feature AS (
SELECT DISTINCT e.user_id
FROM events e
JOIN cohort c ON c.user_id = e.user_id
WHERE e.event_name = 'feature_used'
AND e.feature_key = 'feature_x'
AND e.occurred_at < c.signup_day + interval '3 days'
),
return_d7 AS (
SELECT DISTINCT e.user_id
FROM events e
JOIN cohort c ON c.user_id = e.user_id
WHERE e.occurred_at >= c.signup_day + interval '7 days'
AND e.occurred_at < c.signup_day + interval '14 days'
)
SELECT
CASE WHEN f.user_id IS NOT NULL THEN 'adopted_feature_x_early' ELSE 'did_not_adopt_early' END AS segment,
COUNT(*) AS users,
COUNT(r.user_id) * 1.0 / COUNT(*) AS d7_retention_rate
FROM cohort c
LEFT JOIN first3d_feature f ON f.user_id = c.user_id
LEFT JOIN return_d7 r ON r.user_id = c.user_id
GROUP BY 1
ORDER BY 1;
- Build cohorts first, then mark segments, then compute the metric.
- Use LEFT JOIN to keep both segments even if returns are zero.
How to approach segmentation queries
- Define the business question and metric (count users, sum revenue, rate).
- Pick dimensions (platform, version, category, plan, feature adoption).
- Specify the time window precisely (rolling vs calendar).
- Build clean base sets (CTEs) before joining and aggregating.
- Choose joins carefully to include/exclude empty segments.
- Validate totals against a non-segmented run.
Exercises (practice)
These mirror the tasks below. Solve them in your SQL editor. Then compare with the suggested solutions.
Exercise 1 — Signups by plan and device (7 days)
Using users table, return signup_plan, device_type, and signups_last_7d for the last 7 days.
- Ensure nulls are handled sensibly.
- Sort by signups_last_7d descending.
Exercise 2 — Average revenue per purchasing user by category (60 days)
Using orders and products, compute for the last 60 days:
- category
- avg_revenue_per_purchasing_user = SUM(revenue) / COUNT(DISTINCT user_id) where users purchased in that category
Order by avg_revenue_per_purchasing_user descending.
Checklist before you run
- Time window filters applied correctly.
- Nulls coalesced for grouping keys where appropriate.
- Distinct users used where required.
- Aggregations occur after necessary joins.
- Totals reconcile with non-segmented metrics.
Common mistakes and self-check
- Double counting after joins: aggregate the fact first or deduplicate keys before joining.
- Dropping segments: INNER JOIN removes segments with no facts; use LEFT JOIN + COALESCE.
- Null dimension values: group them, do not silently exclude.
- Mismatched windows: filters on events vs users not aligned with the question.
- Distinct on wrong field: distinct events vs distinct users changes the story.
Self-check: remove GROUP BY to compare overall totals; if they differ unexpectedly, revisit joins and distincts.
Practical projects
- Build a weekly KPI report segmented by platform and app_version (DAU, signups, revenue).
- Create a price_tier Ă— category revenue matrix for the last quarter.
- Analyze D1/D7 retention across signup_plan and device_type; present as a small dashboard.
- Identify top features associated with higher conversion or retention using adoption segments.
Who this is for
- Product Analysts needing to answer product performance questions quickly.
- Data-savvy PMs validating hypotheses about features, pricing, and platforms.
- Analysts preparing for SQL/product analytics interviews.
Prerequisites
- Comfort with SELECT, WHERE, GROUP BY, HAVING, ORDER BY.
- Joins (INNER, LEFT), CTEs, and basic window functions.
- Understanding of product metrics (DAU, retention, conversion, revenue).
Learning path
- Count and sum metrics by one dimension (platform).
- Two-dimensional segmentation (platform Ă— app_version).
- Join facts to product dimensions (orders → products).
- Derived segments (feature adoption cohorts).
- Rates and ratios (conversion, retention) with careful denominators.
Next steps
- Parameterize your time window (e.g., last N days) to reuse queries.
- Create views for common base tables (clean events, orders).
- Set guardrails: null handling, de-duplication, and timezones.
Mini challenge
Write a query to compute 30-day repeat purchase rate by price_tier:
- For users with any purchase in the last 30 days, what fraction also made 2+ orders in the same period?
- Segment by products.price_tier.