Who this is for
This lesson is for data engineers who build or maintain streaming pipelines, real-time dashboards, or event-driven systems. If you work with clickstreams, sensors, logs, or payments, understanding event time vs processing time will save you from silent data errors.
Prerequisites
- Basic understanding of streams vs batches
- Familiarity with windows (tumbling, sliding, session) at a high level
- Comfort reading simple timelines and counts
Why this matters
- Accurate analytics: Business wants “What happened at 12:00–12:05?” not “When did our system see it?”.
- Late/out-of-order data: Mobile networks, retries, and clock skew are normal. Your pipeline must handle them.
- Stable backfills: Reprocessing yesterday should produce the same results as live processing.
Real tasks you will face
- Counting unique users per 5-minute window with late events
- Detecting fraud sessions even if transactions arrive out of order
- Running reliable hourly aggregates that match historical replays
Concept explained simply
Event time
The time an event actually happened at the source (e.g., device timestamp). Use it when your metric is about reality.
Processing time
The time your system saw the event. Use it when your metric is about the pipeline itself (throughput, lag) or when you need fast, approximate results.
Mental model
Think of events as letters with a date printed on them (event time). Your post office sorts them the day they arrive (processing time). If you group by the printed date, you need rules for late letters. Those rules are watermarks and allowed lateness.
Core terms (keep these handy)
- Watermark: The system’s guess of “we’ve probably seen all events up to time T.” Typically T = max(event_time_seen) − delay.
- Allowed lateness: How long after a window’s end you still accept late updates for that window.
- Triggers: When to emit results (on-time, early, late updates).
- Event-time windows: Group by when things happened. Robust to out-of-order events.
- Processing-time windows: Group by when data arrives. Simple and low-latency but order-sensitive.
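The two rules underlying these terms fit in a few lines. A minimal sketch in Python (the function names `watermark` and `window_start` are illustrative, not a specific framework's API):

```python
from typing import Iterable

def watermark(event_times_seen: Iterable[float], delay: float) -> float:
    """Heuristic watermark: max event time seen so far minus a fixed delay."""
    return max(event_times_seen) - delay

def window_start(event_time: float, size: float) -> float:
    """Assign an event to the start of its tumbling event-time window."""
    return event_time - (event_time % size)

# Illustrative numbers: 60s tumbling windows, 30s watermark delay.
seen = [10.0, 50.0, 80.0]              # event times in seconds
assert window_start(80.0, 60.0) == 60.0
assert watermark(seen, 30.0) == 50.0   # "probably complete" up to t=50
```

Real engines compute the watermark per partition and take the minimum across partitions, but the subtraction above is the core idea.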
Worked examples
Example 1: Clicks arriving late
1-minute tumbling event-time windows. Watermark = max(event_time) − 30s. Allowed lateness = 20s.
Events (ET = event time, AT = arrival time):
- e1: ET 12:00:10, AT 12:00:11 -> window 12:00:00–12:00:59
- e2: ET 12:00:50, AT 12:01:05 -> arrives late but lands in the same window
- e3: ET 12:01:20, AT 12:01:21 -> window 12:01:00–12:01:59
- e4: ET 12:01:40, AT 12:01:41 -> advances watermark to 12:01:10
- e5: ET 12:00:55, AT 12:01:45 -> late update for the 12:00 window (accepted)
- e6: ET 12:02:10, AT 12:02:11 -> watermark to 12:01:40 (finalizes the 12:00 window)
- e7: ET 12:00:20, AT 12:02:12 -> too late (dropped)

Emissions for the 12:00 window:
- On-time firing once the watermark passes 12:00:59 (after e4): count = 2 (e1, e2)
- Late update (e5): count = 3
- Final (after e6): count = 3; e7 is dropped
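The whole example can be replayed with a toy windowing loop. This is a sketch, not a production engine; times are seconds after 12:00:00, and e5's arrival is placed after e4's so the late update follows the on-time firing, matching the emission list:

```python
# Replay Example 1: 1-minute tumbling windows, watermark = max(ET) - 30s,
# allowed lateness = 20s. Times are seconds after 12:00:00.
events = [  # (name, event_time, arrival_time)
    ("e1", 10, 11), ("e2", 50, 65), ("e3", 80, 81), ("e4", 100, 101),
    ("e5", 55, 105), ("e6", 130, 131), ("e7", 20, 132),
]
WINDOW, DELAY, LATENESS = 60, 30, 20

counts, emissions = {}, []
fired, closed = set(), set()   # windows that fired on time / are finalized
max_et = float("-inf")

for name, et, at in sorted(events, key=lambda e: e[2]):  # arrival order
    w = et - et % WINDOW                 # tumbling-window start
    if w in closed:
        emissions.append((name, w, "dropped"))
        continue
    counts[w] = counts.get(w, 0) + 1
    if w in fired:                       # window already fired -> late update
        emissions.append((name, w, f"late update: {counts[w]}"))
    max_et = max(max_et, et)
    wm = max_et - DELAY                  # watermark = max(ET) - delay
    for win in list(counts):
        if win not in fired and wm > win + WINDOW - 1:
            fired.add(win)
            emissions.append(("on-time", win, counts[win]))
        if wm > win + WINDOW - 1 + LATENESS:
            closed.add(win)

# For the 12:00 window (w = 0): on-time count 2, late update to 3, e7 dropped.
```

Stepping through `emissions` reproduces exactly the three firings described above, which is a useful sanity check when you tune DELAY and LATENESS.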
Example 2: Sensor clocks skewed
Ten sensors report temperatures with up to 2 minutes skew. If you use processing-time windows, values cluster by arrival bursts and misrepresent reality. Use event-time windows with a watermark delay a bit larger than skew (e.g., 2m30s) so readings land in their correct time buckets, with late updates allowed briefly.
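The distortion is easy to see in code. A sketch with made-up readings (skewed event times, one arrival burst):

```python
# Bucket the same skewed readings by event time vs arrival time.
WINDOW = 60
readings = [(5, 125), (50, 130), (70, 131)]  # (event_time, arrival_time), ~2min skew

by_event, by_arrival = {}, {}
for et, at in readings:
    ew = et - et % WINDOW
    aw = at - at % WINDOW
    by_event[ew] = by_event.get(ew, 0) + 1
    by_arrival[aw] = by_arrival.get(aw, 0) + 1

# by_event:   {0: 2, 60: 1} -- readings land in their correct time buckets
# by_arrival: {120: 3}      -- one arrival burst, misrepresenting reality
```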
Example 3: Sessionization
Fraud detection uses session windows with 30s gap by event time. Late transactions should still attach to the correct session if they arrive within allowed lateness. Processing-time sessions would split a single real-world session into many fragments.
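A minimal sketch of event-time sessionization with gap-based merging (the `sessionize` helper is hypothetical; real engines keep per-key session state and merge incrementally in much the same way):

```python
GAP = 30  # inactivity gap in seconds

def sessionize(event_times, gap=GAP):
    """Merge events into sessions; order-insensitive, so late arrivals
    still attach to (or bridge) the correct session."""
    sessions = []  # list of [start, end], kept merged and sorted
    for t in event_times:
        merged = [t, t]
        keep = []
        for s in sessions:
            # Merge if the new span is within `gap` of an existing session.
            if merged[0] - gap <= s[1] and s[0] - gap <= merged[1]:
                merged = [min(merged[0], s[0]), max(merged[1], s[1])]
            else:
                keep.append(s)
        keep.append(merged)
        sessions = sorted(keep)
    return sessions

# A late event at t=70 bridges two fragments into one real session:
assert sessionize([40, 100, 70]) == [[40, 100]]
```

With processing-time sessions, the same late event would start a fresh fragment instead of repairing the session it belongs to.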
Step-by-step: choosing the right time domain
- Define the question — Are you measuring reality (use event time) or system behavior (use processing time)?
- Estimate disorder — What’s the typical max delay/clock skew? Start with p95–p99 delay as watermark delay.
- Pick windows — Tumbling/sliding for periodic metrics; sessions for user flows.
- Set watermark and allowed lateness — Watermark delay slightly above expected skew; allowed lateness small but non-zero for corrections.
- Decide triggers — On-time mandatory; add early firings for fast guesses; allow late firings for corrections.
- Plan idempotency — Make outputs upsertable (keys + versions) so late updates don’t double-count.
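The idempotency step can be sketched as an upsert table keyed by window and key, with a firing version so late corrections replace rather than add (the `UpsertSink` class is illustrative; in practice this is a keyed table plus merge/upsert logic in your sink):

```python
class UpsertSink:
    """Toy sink: late firings overwrite earlier results instead of double-counting."""
    def __init__(self):
        self.table = {}  # (window_start, key) -> (version, count)

    def upsert(self, window_start, key, count, version):
        cur = self.table.get((window_start, key))
        if cur is None or version > cur[0]:   # older/duplicate firings are no-ops
            self.table[(window_start, key)] = (version, count)

sink = UpsertSink()
sink.upsert(0, "DE", 2, version=1)   # on-time firing
sink.upsert(0, "DE", 3, version=2)   # late correction replaces, not adds
sink.upsert(0, "DE", 2, version=1)   # retry of the old firing is ignored
assert sink.table[(0, "DE")] == (2, 3)
```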
Quick checklist before you ship
- Time zone normalized (UTC)?
- Event timestamp extracted and validated?
- Watermark delay justified by data?
- Allowed lateness documented with consumers?
- Outputs are upsert-friendly?
Exercises (do these now)
Exercise 1: Compute event-time window results with lateness
Use the event stream and settings from Example 1 (1-minute tumbling windows, watermark = max(event_time) − 30s, allowed lateness = 20s).
What to produce
State the counts emitted for the 12:00:00–12:00:59 window at each firing and which events are dropped.
Exercise 2: Choose time domains per pipeline step
What to produce
For each step of a pipeline you maintain, pick event time or processing time and justify briefly.
- Self-check: Did you explicitly state watermark delay and allowed lateness?
- Self-check: Would backfill produce identical results?
- Self-check: Is your consumer OK with late updates?
Common mistakes and how to self-check
- Using processing-time windows for business KPIs — Self-check: If the network paused for 5 minutes, would your KPI spike or dip incorrectly? If yes, switch to event time.
- Watermark too aggressive — Self-check: Compare your late-drop rate against the p99 arrival delay (arrival time − event time). If many events dropped would have fallen within p99, increase the watermark delay.
- No allowed lateness — Self-check: Do you ever receive retries? If yes, allow a small lateness window.
- Non-idempotent sinks — Self-check: Can you upsert late corrections without duplicates? If not, redesign keys/merge logic.
- Ignoring timezones — Self-check: Are timestamps normalized to UTC before windowing?
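For the "watermark too aggressive" self-check, one hypothetical way to derive a data-justified delay is to take a high percentile of observed arrival delays (nearest-rank percentile shown here; a real pipeline would compute this over a sliding sample):

```python
def percentile(values, p):
    """Nearest-rank percentile over an unsorted list (0 <= p <= 100)."""
    s = sorted(values)
    idx = max(0, min(len(s) - 1, round(p / 100 * (len(s) - 1))))
    return s[idx]

# Arrival delay = arrival_time - event_time, from sampled (ET, AT) pairs.
samples = [(10, 11), (50, 65), (80, 81), (100, 101), (55, 105)]
delays = [at - et for et, at in samples]      # [1, 15, 1, 1, 50]
suggested_watermark_delay = percentile(delays, 99)
```

Setting the watermark delay near p99 keeps the late-drop rate around 1% by construction; pair it with a small allowed lateness for the tail.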
Practical projects
- Real-time signups: Build a 5-minute event-time aggregate of signups per country with a 2-minute watermark and 1-minute allowed lateness. Emit on-time and late updates to a table keyed by (window_start,country).
- Sessionized clicks: Session window by event time with 30s gap. Write each session’s duration and click count. Verify with synthetic out-of-order data.
- Lag dashboard: Processing-time metric: records processed per minute and end-to-end latency percentiles. This validates your watermark choice empirically.
Mini tasks to extend
- Add an early trigger every 10s for quick approximations, then reconcile with late updates.
- Measure and log the fraction of events arriving after on-time firing but within allowed lateness.
- Run a 24h backfill and compare outputs to live: counts and distinct keys must match.
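The backfill task above rests on a property worth proving to yourself: grouping by event time is insensitive to arrival order, so a replay can match live output (given the same lateness policy). A minimal demonstration:

```python
import random

WINDOW = 60
events = [(10, 1), (50, 1), (80, 1), (100, 1), (55, 1)]  # (event_time, count)

def aggregate(stream):
    """Per-window counts keyed by event time only."""
    out = {}
    for et, n in stream:
        w = et - et % WINDOW
        out[w] = out.get(w, 0) + n
    return out

live = aggregate(events)          # arrival order
shuffled = events[:]
random.shuffle(shuffled)
backfill = aggregate(shuffled)    # replayed in a different order
assert live == backfill           # identical final results
```

Processing-time aggregates have no such guarantee, which is why backfills of processing-time KPIs rarely reconcile with the live numbers.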
Mini challenge
You ingest mobile app events. 20% arrive up to 90s late; rare outliers up to 3 minutes late. Stakeholders want 1-minute accurate counts, visible quickly but correct eventually.
- Pick time domain for the metric
- Propose watermark delay
- Set allowed lateness
- Describe trigger strategy
Suggested answer
- Time domain: Event time (we care about when users acted).
- Watermark: 2 minutes (above 90s typical, below 3m outliers to limit latency).
- Allowed lateness: 1 minute to capture many outliers; accept tiny drop rate beyond that.
- Triggers: Early every 15s for fast UI; on-time at watermark; late updates until watermark passes window_end + 1m; final then.
Learning path
- Now: Event time vs processing time fundamentals (this lesson)
- Next: Watermarks and triggers in depth
- Then: Window types (tumbling, sliding, session) with trade-offs
- Finally: Exactly-once, idempotent sinks, and reprocessing
Next steps
- Write down your current pipeline’s watermark, allowed lateness, and trigger rules.
- Run a 1-hour experiment measuring late arrival distribution; adjust watermark.
- Take the Quick Test below to lock in concepts.