luvv to helpDiscover the Best Free Online Tools
Topic 2 of 8

Marketing Data Schemas Understanding

Learn Marketing Data Schemas Understanding for free with explanations, exercises, and a quick test (for Marketing Analyst).

Published: December 22, 2025 | Updated: December 22, 2025

Why this matters

Marketing data rarely lives in one table. As a Marketing Analyst, you will join ad platform spend with web sessions and conversions, slice by campaign and creative, and explain performance by channel, device, or country. Understanding schemas ensures you pick the right grain, join on the right keys, and create trustworthy metrics like CTR, CPC, CPA, and ROAS.

  • Build a daily performance view across platforms with consistent campaign naming.
  • Explain which ads drive revenue using reliable joins and time windows.
  • Avoid double counting when combining events (impressions, clicks, sessions, orders).

Who this is for

Marketing Analysts and aspiring analysts who query ads, web analytics, and CRM data and need to combine them into clear, accurate insights.

Prerequisites

  • Basic SQL: SELECT, WHERE, GROUP BY, JOIN, CASE
  • Comfort with dates and simple aggregations
  • Familiarity with marketing concepts: campaign, ad, UTM, session, conversion

Concept explained simply

A marketing data schema is the map of how your tables relate. Facts hold events or numeric measurements (spend, clicks, orders). Dimensions describe entities (campaigns, ads, dates, customers). Most performance queries join one or more fact tables to several dimensions, then aggregate at a chosen grain (e.g., per day per campaign).

Mental model

  • Grain: the level of detail for each row (e.g., one row per ad per day).
  • Keys: IDs that connect tables (campaign_id, ad_id, date_key, user_id).
  • Time: when events happen and how you align them (same day, lookback windows).
  • Measures: what you sum or count (spend, impressions, clicks, revenue).
Core schema components (open for a quick reference)
  • Dimensions: dim_date(date_key,...), dim_campaign(campaign_id, platform, campaign_name,...), dim_ad(ad_id, campaign_id,...), dim_customer(user_id, country, device,...)
  • Facts: fact_impressions(date_key, ad_id, impressions), fact_clicks(date_key, ad_id, clicks), fact_spend(date_key, campaign_id, spend), fact_sessions(session_id, user_id, session_start_at, source, medium, campaign, ad_id), fact_orders(order_id, user_id, order_ts, revenue, campaign_id, ad_id)
  • Common keys: campaign_id links campaign-level facts; ad_id links ad-level facts; date_key links time; user_id links user behaviors.
  • Grain examples: fact_spend often at campaign-day; fact_impressions/clicks at ad-day; sessions at session-level; orders at order-level.

Worked examples

Example 1: Daily CTR and CPC by campaign (aligning grains)

Goal: Combine ad-level clicks with campaign-level spend. Step 1 aggregates clicks to campaign-day before joining to spend.

WITH clicks_by_campaign AS (
  SELECT c.date_key,
         a.campaign_id,
         SUM(c.clicks) AS clicks
  FROM fact_clicks c
  JOIN dim_ad a ON a.ad_id = c.ad_id
  GROUP BY 1,2
),
merged AS (
  SELECT s.date_key,
         s.campaign_id,
         s.spend,
         COALESCE(cb.clicks, 0) AS clicks
  FROM fact_spend s
  LEFT JOIN clicks_by_campaign cb
    ON cb.date_key = s.date_key
   AND cb.campaign_id = s.campaign_id
)
SELECT date_key,
       campaign_id,
       clicks,
       spend,
       CASE WHEN clicks > 0 THEN spend::decimal / clicks ELSE NULL END AS cpc
FROM merged
ORDER BY date_key, campaign_id;

Key ideas: define target grain (campaign-day), aggregate the more granular table (ad-day) up, then join.

Example 2: Daily ROAS by campaign (spend + orders)

Goal: Sum order revenue and join to spend at campaign-day.

WITH orders_by_campaign AS (
  SELECT CAST(order_ts AS DATE) AS order_date,
         campaign_id,
         SUM(revenue) AS revenue
  FROM fact_orders
  GROUP BY 1,2
)
SELECT s.date_key AS dt,
       s.campaign_id,
       s.spend,
       COALESCE(o.revenue, 0) AS revenue,
       CASE WHEN s.spend > 0 THEN COALESCE(o.revenue,0) / s.spend ELSE NULL END AS roas
FROM fact_spend s
LEFT JOIN orders_by_campaign o
  ON o.order_date = s.date_key
 AND o.campaign_id = s.campaign_id
ORDER BY dt, campaign_id;

Assumption: orders carry campaign_id. If not, build a mapping (e.g., last non-null UTM campaign) before aggregating.

Example 3: De-duplicating clicks (clean fact table)

Goal: one row per ad_id + user_id + minute. Use row_number to keep first click in a minute.

WITH clicks_ranked AS (
  SELECT *,
         DATE_TRUNC('minute', click_ts) AS click_minute,
         ROW_NUMBER() OVER (
           PARTITION BY ad_id, user_id, DATE_TRUNC('minute', click_ts)
           ORDER BY click_ts
         ) AS rn
  FROM fact_clicks_raw
)
SELECT ad_id, user_id, click_minute AS click_ts, other_cols
FROM clicks_ranked
WHERE rn = 1;

Cleaning at the correct grain avoids inflated CTR/CPA.

Common mistakes and how to self-check

  • Grain mismatch: You join ad-day clicks to campaign-day spend without aggregating. Self-check: Is the GROUP BY level on both sides identical?
  • Wrong keys: Joining on campaign_name instead of campaign_id. Self-check: Prefer stable IDs; compare row counts before/after join.
  • Double counting revenue: Joining orders to both clicks and sessions. Self-check: Track unique order_id counts before/after joins.
  • Time misalignment: Comparing spend on day D to orders on day D+1 unintentionally. Self-check: Confirm date fields and time zones.
  • Null explosion: INNER JOIN dropping rows with zero clicks or missing dimensions. Self-check: Use LEFT JOIN from the driving fact (e.g., spend).

Exercises

These mirror the tasks below. Do them in SQL using the example schemas. A quick checklist:

  • Did you define the target grain before writing joins?
  • Did you aggregate more granular tables up to the target grain?
  • Did you join on IDs, not names?
  • Did you validate row counts and totals pre/post join?

Exercise 1: Daily ROAS by campaign

Compute daily ROAS by campaign using fact_spend(date_key, campaign_id, spend) and fact_orders(order_id, user_id, order_ts, revenue, campaign_id). Output columns: date_key, campaign_id, spend, revenue, roas.

Need a nudge?
  • Aggregate orders by CAST(order_ts AS DATE), campaign_id.
  • LEFT JOIN orders to spend by date and campaign_id.
  • ROAS = revenue / spend (handle division by zero).

Exercise 2: Fix duplicates in clicks

Clean fact_clicks_raw(click_id, ad_id, user_id, click_ts, ...). Keep one row per ad_id + user_id + minute (first click wins). Output columns: ad_id, user_id, click_ts_minute.

Need a nudge?
  • Use DATE_TRUNC('minute', click_ts) and ROW_NUMBER over a partition.
  • Filter rn = 1.

Practical projects

  • Unified daily marketing mart: Build a table per campaign per day with spend, impressions, clicks, sessions, orders, revenue, and derived metrics.
  • Attribution-ready session mapping: Create a table that assigns each session and order a stable campaign_id using UTM parsing and last-touch within 7 days.
  • Creative performance report: Join ad-level facts to creative metadata to rank top 10 creatives by CTR and ROAS.

Learning path

  • Now: Understand schemas, grains, and keys for reliable joins.
  • Next: Window functions for deduplication and attribution.
  • Then: Incremental models and slowly changing dimensions basics.

Next steps

  • Turn the worked examples into parameterized queries for your team.
  • Add data quality checks: uniqueness of keys, null rates, metric reconciliation.
  • Document your canonical grains and join keys for each report.

Mini challenge

You must deliver a weekly performance summary: platform, campaign, week_start, spend, clicks, orders, revenue, roas. Your data: fact_spend (campaign-day), fact_clicks (ad-day), dim_ad (ad to campaign), fact_orders (order-level with campaign_id). What is your target grain, and which tables do you aggregate?

Show a possible plan
  1. Target grain: campaign-week.
  2. Aggregate clicks: ad-day to campaign-day via dim_ad, then to campaign-week.
  3. Aggregate spend: campaign-day to campaign-week.
  4. Aggregate orders: order_ts to week_start and campaign_id.
  5. Join three weekly aggregates on campaign_id and week_start; compute ROAS.

How the test works

The quick test is available to everyone. If you log in, your progress and results will be saved.

Practice Exercises

2 exercises to complete

Instructions

Using fact_spend(date_key, campaign_id, spend) and fact_orders(order_id, user_id, order_ts, revenue, campaign_id), produce daily ROAS by campaign. Output columns: date_key, campaign_id, spend, revenue, roas (revenue/spend). Handle days with no orders.

Expected Output
A result set with one row per date_key, campaign_id, including spend, revenue (0 if none), and roas (NULL if spend=0).

Marketing Data Schemas Understanding — Quick Test

Test your knowledge with 7 questions. Pass with 70% or higher.

7 questions70% to pass

Have questions about Marketing Data Schemas Understanding?

AI Assistant

Ask questions about this tool