Why this matters
Analytics Engineers ship trustworthy metrics. Many source systems send late or out-of-order events. If you group by when your pipeline saw data (processing time) instead of when it actually happened (event time), you can misstate revenue, DAU, or conversions. Knowing when to use each clock prevents bad dashboards, flaky tests, and painful backfills.
- Daily revenue rollups: late mobile purchases must count on the day of purchase (event time), not the day the data arrived.
- Streaming dashboards: near-real-time views may initially use processing time, then reconcile using event time as late data shows up.
- SLAs and monitoring: pipeline latency and alerts are measured by processing time.
- Backfills and reprocessing: consistently using event time makes re-runs deterministic.
Concept explained simply
- Event time: the timestamp of when the user action truly occurred (e.g., "user clicked at 14:03:02Z").
- Processing time: the timestamp of when your system ingested or processed the event (e.g., it arrived in your warehouse at 14:07:45Z).
Mental model
Imagine mailing a birthday card. The birthday is event time. The day the post office delivers it is processing time. For behavioral analytics and financials, celebrate on the birthday (event time). For postal efficiency metrics and alerts, care about delivery day (processing time).
Core concepts and terms
- Lateness: how long after the event occurred it arrives in your system (see the measurement sketch after this list).
- Watermark: an estimate of how far event time has likely advanced; events older than the watermark are considered late.
- Allowed lateness: how long you still accept updates to previously computed windows.
- Windows:
- Tumbling: fixed non-overlapping intervals (e.g., daily).
- Sliding: overlapping intervals (e.g., last 7 days rolling).
- Session: user activity-based intervals with a gap threshold.
- Deduplication: use stable IDs (event_id) or composite keys (user_id, event_time, type) to avoid double counting.
- Timezones: store event_time in UTC, keep original offsets as separate columns if needed for reporting.
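To make lateness and watermarks concrete, here is a minimal query sketch. It assumes an events table with event_id, event_time_utc, and ingested_at_utc (the same columns used in the worked examples below) and PostgreSQL-style interval arithmetic; the 10-minute cutoff is illustrative.
-- Sketch: per-event lateness, plus a flag for events behind a 10-minute watermark-style cutoff.
SELECT
  event_id,
  event_time_utc,
  ingested_at_utc,
  ingested_at_utc - event_time_utc AS lateness,
  event_time_utc < (MAX(event_time_utc) OVER () - INTERVAL '10 minutes') AS is_late
FROM events;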
Worked examples
Example 1 — Daily revenue with late arrivals
Goal: Sum revenue by the day the purchase happened (event time), not by arrival day.
-- Warehouse table: events(order_id, event_time_utc, ingested_at_utc, amount)
-- Correct (event time)
SELECT DATE_TRUNC('day', event_time_utc) AS d, SUM(amount) AS revenue
FROM events
WHERE event_time_utc >= '2024-06-01' AND event_time_utc < '2024-07-01'
GROUP BY 1;
-- Incorrect if used for business metrics (processing time)
SELECT DATE_TRUNC('day', ingested_at_utc) AS d, SUM(amount) AS revenue
FROM events
WHERE ingested_at_utc >= '2024-06-01' AND ingested_at_utc < '2024-07-01'
GROUP BY 1;
Impact: with the second query, late events are counted on their arrival day, inflating the wrong day's revenue.
Example 2 — Streaming dashboard with watermarks
Goal: a 1-minute order count with up to 10 minutes of allowed lateness.
- Initial results are emitted on processing-time ticks, then updated as late events arrive.
- The watermark lags event time by 10 minutes; events older than the watermark are flagged and either discarded or sent to a reconciliation job.
Pattern: emit updates as upserts keyed by (minute_bucket, dimension_keys) and run a periodic reconciliation job on event time, as sketched below.
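A minimal sketch of that upsert step, assuming a warehouse that supports MERGE, a published table orders_per_minute, and a staging table late_order_events holding newly arrived events; all names are illustrative, and dimension keys are omitted for brevity.
-- Sketch: fold newly arrived (possibly late) events into the per-minute aggregate.
MERGE INTO orders_per_minute t
USING (
  SELECT DATE_TRUNC('minute', event_time_utc) AS minute_bucket,
         COUNT(*) AS order_count
  FROM late_order_events
  GROUP BY 1
) s
ON t.minute_bucket = s.minute_bucket
WHEN MATCHED THEN UPDATE SET order_count = t.order_count + s.order_count
WHEN NOT MATCHED THEN INSERT (minute_bucket, order_count) VALUES (s.minute_bucket, s.order_count);
Incrementing counts like this assumes the staging table holds only events not yet folded in; rebuilding each affected window from raw events instead makes the job idempotent.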
Example 3 — Backfill after bug fix
Goal: Recompute weekly retention after fixing a parsing bug.
- Use event time to rebuild facts deterministically from raw logs.
- Keep both event_time_utc and ingested_at_utc so you can measure ingestion SLA separately.
- Idempotency: de-dupe using event_id or a unique business key.
-- Example de-dupe pattern
WITH ranked AS (
  SELECT *,
    ROW_NUMBER() OVER (
      PARTITION BY event_id ORDER BY ingested_at_utc DESC
    ) AS rn
  FROM raw_events
)
SELECT * FROM ranked WHERE rn = 1;
Implementation patterns
- Always store both timestamps: event_time_utc and ingested_at_utc.
- Partition facts by event date if your warehouse benefits from pruning, but keep ingestion partitions for operational tables.
- Choose window semantics explicitly:
- Business metrics: event time windows.
- Operational/latency metrics: processing time windows.
- Streaming: set a watermark and allowed lateness; decide update strategy:
- Append-only and later reconciliation (simpler).
- Upsert-on-late-arrival (more accurate real-time, more complex).
- Timezones: normalize to UTC on ingest; store original timezone if you must display local dates.
- Quality checks: reject or quarantine events with impossible event_time (e.g., older than X days or far in the future).
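A sketch of that quarantine check, assuming a quarantined_events table with the same columns as raw_events; the 30-day and 1-hour thresholds are illustrative.
-- Sketch: route events with implausible event_time to a quarantine table for review.
INSERT INTO quarantined_events
SELECT *
FROM raw_events
WHERE event_time_utc < NOW() - INTERVAL '30 days'
   OR event_time_utc > NOW() + INTERVAL '1 hour';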
How to choose event vs processing time
- If the question is “When did the user/business action happen?” — use event time.
- If the question is “When did the pipeline/system see or process it?” — use processing time.
- Need both views? Produce two models: business_facts_by_event_time and ops_latency_by_processing_time.
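As a sketch of the operational side of that split, a model keyed by processing time might start like this; it uses the events table from the worked examples, and the column mix is illustrative.
-- Sketch: hourly ingestion view by processing time (a starting point for ops_latency_by_processing_time).
SELECT
  DATE_TRUNC('hour', ingested_at_utc) AS processed_hour,
  COUNT(*) AS events_ingested,
  COUNT(DISTINCT DATE_TRUNC('day', event_time_utc)) AS distinct_event_days  -- more than one suggests late data from earlier days is arriving
FROM events
GROUP BY 1;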
Exercises
Try these before opening the solutions. Use the checklist below to self-verify.
Exercise 1 — Fix a misbucketed daily revenue
You have a table sales(order_id, event_time_utc, ingested_at_utc, amount). Yesterday, 5% of purchases arrived today due to a mobile outage. Write a query that returns yesterday’s revenue correctly using event time and explain one de-duplication safeguard.
Hint
- Group by DATE_TRUNC('day', event_time_utc).
- Use a unique key (order_id) or ROW_NUMBER over order_id.
Solution
SELECT DATE_TRUNC('day', event_time_utc) AS d, SUM(amount) AS revenue
FROM sales
WHERE event_time_utc >= DATE_TRUNC('day', NOW() - INTERVAL '1 day')
AND event_time_utc < DATE_TRUNC('day', NOW())
GROUP BY 1;
-- Safeguard: enforce uniqueness on order_id, or de-dupe using ROW_NUMBER() OVER (PARTITION BY order_id ORDER BY ingested_at_utc DESC)
Exercise 2 — Define watermark and allowed lateness
You stream page_view events with occasional 20-minute delays from mobile. Design a 5-minute tumbling window aggregate that is accurate within 20 minutes and stabilizes afterward. Specify: (a) watermark policy, (b) allowed lateness, (c) how late updates are applied.
Hint
- Watermark can lag event time by ~20–25 minutes.
- Use upserts keyed by (window_start, window_end).
Solution
(a) Watermark = max(event_time) - 25 minutes (buffer above typical 20-minute delay).
(b) Allowed lateness = 25 minutes; events older than the watermark go to a nightly reconciliation job.
(c) During allowed lateness, upsert aggregates for the affected 5-minute window. After watermark passes, keep aggregates stable and reconcile in batch.
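If your streaming engine exposes SQL, part (a) can be declared directly. The following is a hedged Flink-SQL-style sketch, not a prescription for any particular engine; the table, columns, and datagen placeholder source are all illustrative.
-- Sketch: declare event time with a watermark 25 minutes behind it, then aggregate in 5-minute tumbling windows.
CREATE TABLE page_views (
  user_id STRING,
  page_url STRING,
  event_time TIMESTAMP(3),
  WATERMARK FOR event_time AS event_time - INTERVAL '25' MINUTE
) WITH (
  'connector' = 'datagen'  -- placeholder source; point at your real stream in practice
);

SELECT
  TUMBLE_START(event_time, INTERVAL '5' MINUTE) AS window_start,
  COUNT(*) AS view_count
FROM page_views
GROUP BY TUMBLE(event_time, INTERVAL '5' MINUTE);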
Exercise checklist
- I grouped business metrics by event_time, not ingested_at.
- I included a deduplication strategy.
- I declared a watermark and allowed lateness clearly.
- I explained how updates or reconciliations occur.
Common mistakes and self-check
- Mistake: Grouping by processing time for business KPIs.
  Self-check: Does your query use event_time_utc for business summaries?
- Mistake: No dedup, leading to double counting when retries occur.
  Self-check: Do you enforce unique keys or use a windowed ROW_NUMBER?
- Mistake: Ignoring timezone alignment.
  Self-check: Are all comparisons done in UTC, and do you convert for display only?
- Mistake: A watermark that is too aggressive and drops legitimate late events.
  Self-check: Did you analyze the historical lateness distribution?
Practical projects
- Build a daily revenue model using event time with a 7-day reconciliation job that captures and backfills late events.
- Create a streaming page_view aggregate with a 10-minute watermark and allowed lateness; publish upserted metrics to a warehouse table.
- Implement a data quality monitor that flags events with event_time too far in the past/future and compares event and processing latency percentiles.
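For the monitoring project, a sketch of the latency comparison, assuming PostgreSQL-style PERCENTILE_CONT ... WITHIN GROUP support and the events table from the earlier examples; the percentiles shown are illustrative.
-- Sketch: daily lateness percentiles (processing time minus event time), useful for watermark tuning.
SELECT
  DATE_TRUNC('day', ingested_at_utc) AS processed_day,
  PERCENTILE_CONT(0.5)  WITHIN GROUP (ORDER BY EXTRACT(EPOCH FROM ingested_at_utc - event_time_utc)) AS p50_lateness_seconds,
  PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY EXTRACT(EPOCH FROM ingested_at_utc - event_time_utc)) AS p95_lateness_seconds
FROM events
GROUP BY 1;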
Who this is for
- Analytics Engineers and BI developers who own metrics and dashboards.
- Data Engineers building streaming/batch ingestion for analytics.
Prerequisites
- Comfort with SQL windowing and GROUP BY.
- Basic understanding of batch vs streaming pipelines.
- Working knowledge of timezones and UTC normalization.
Learning path
- Understand event vs processing time (this lesson).
- Practice windowing: tumbling, sliding, session windows.
- Implement dedup and idempotency in your models.
- Add watermarks and allowed lateness to streaming outputs.
- Build reconciliation and backfill routines.
Next steps
- Refactor one of your existing metrics to event time and measure the difference.
- Introduce a lateness report to inform watermark tuning.
- Document which models use which time basis and why.
Mini challenge
Your marketing events arrive hours late on weekends. Product managers want daily attribution accuracy, and leadership wants a near-real-time dashboard. Propose a design that satisfies both, listing: (1) event vs processing time usage, (2) watermark and allowed lateness, (3) how you reconcile.
Example approach
- Real-time dashboard: processing time with 30-minute freshness SLA; label as preliminary.
- Authoritative model: event time with 24-hour allowed lateness and nightly reconciliation job that upserts corrected windows.
- Publish a delta log describing late corrections for transparency.
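The delta log can be as simple as a small corrections table; a sketch with illustrative column names:
-- Sketch: a delta log recording late corrections applied by the reconciliation job.
CREATE TABLE metric_corrections (
  corrected_at_utc  TIMESTAMP,
  metric_name       TEXT,
  window_start_utc  TIMESTAMP,
  previous_value    NUMERIC,
  corrected_value   NUMERIC
);
Each reconciliation run inserts one row per window it changed, so consumers can see exactly what moved and when.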
Glossary quick ref
- Event time: when it happened.
- Processing time: when you saw it.
- Lateness: arrival delay.
- Watermark: boundary after which events are considered late.
- Allowed lateness: time you still accept updates.