Why this matters
Client–server event consistency ensures the numbers you show the business are trustworthy. As a Product Analyst, you will compare funnels, conversion rates, and revenue across dashboards. If client- and server-side events disagree, decisions stall or go the wrong way. Consistent events mean faster insights, cleaner experiments, and fewer emergency data fixes.
- Real task: reconcile purchase counts between client analytics and backend orders after a release.
- Real task: design an event schema so A/B test conversion is consistent across web, app, and backend.
- Real task: create monitoring to catch event drift before stakeholders do.
Who this is for
- Product Analysts who use both client analytics and backend data.
- Analytics Engineers and Data Scientists partnering with product teams.
- PMs wanting to sanity-check funnel and revenue metrics.
Prerequisites
- Basic SQL (joins, window functions).
- Understanding of event tracking concepts (events, properties, sessions).
- Familiarity with client vs server data capture paths.
Concept explained simply
Client–server event consistency means both sides describe the same real-world action with the same identity, timing, and meaning. We accept that transport and timing differ, but we engineer the data so aggregates align within agreed tolerances.
Mental model
Think of each user action as a single fact. The client and server are two cameras filming the same moment. Your job is to label the clip with a shared ID, standard fields, and rules so the two angles can be merged into one reliable story.
Core principles of consistency
- One schema, two sources: Use a shared event schema. Required fields (see the table sketch after this list): event_name, event_id (UUID, idempotency key), user_id, device_id, session_id, client_ts, server_ts, ingestion_ts, source (client/server), schema_version, context (page, screen, campaign), and business fields (e.g., currency, revenue).
- Idempotency: The same action must produce the same event_id on both sides. Generate the ID once (client or server) and pass it along.
- Authoritative fields: Define a source of truth by field. Example: revenue and tax are authoritative from the server; UI language or screen path from the client.
- Time handling: Keep client_ts (device time) and server_ts (backend time). For aggregates, use server_ts by default to avoid clock skew; fall back to client_ts if missing.
- Ordering: Include a sequence_id or monotonic counter per session when ordering matters (e.g., checkout steps).
- De-duplication rules: Prefer event_id; if missing, dedupe by a composite key (user_id, event_name, server_ts minute, important properties) within a short window.
- Retry and loss: Implement retriable delivery and queueing on both sides; add retry_count and delivery_status for observability.
- Sampling and consent: Apply the same sampling decision (a deterministic hash on user_id) and include consent_state so pipelines can drop or keep events consistently.
- Versioning: Use schema_version. Breaking changes require a migration plan and a dual-write period.
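To make the shared schema concrete, here is a minimal landing-table sketch built from the required fields above. The column names come from this list; the types and the exact DDL are illustrative assumptions, not a prescribed design.

```sql
-- Illustrative raw_events landing table for both sources (types are assumptions).
CREATE TABLE raw_events (
    event_id        VARCHAR NOT NULL,    -- UUID shared by client and server (idempotency key)
    event_name      VARCHAR NOT NULL,
    source          VARCHAR NOT NULL,    -- 'client' or 'server'
    user_id         VARCHAR,
    device_id       VARCHAR,
    session_id      VARCHAR,
    sequence_id     INTEGER,             -- per-session ordering when it matters
    client_ts       TIMESTAMP,           -- device clock; may be skewed
    server_ts       TIMESTAMP,           -- backend clock; default for aggregates
    ingestion_ts    TIMESTAMP,           -- when the pipeline received the row
    schema_version  VARCHAR,
    consent_state   VARCHAR,
    retry_count     INTEGER,
    delivery_status VARCHAR,
    context         VARCHAR,             -- page, screen, campaign (often stored as JSON)
    currency        VARCHAR,
    revenue         DECIMAL(12, 2)       -- server-authoritative for purchases
);
```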
Worked examples
Example 1 — Double-fired Add to Cart
Issue: Client sends add_to_cart on button click; server also emits on API confirmation. Dashboard shows 120% of expected adds.
- Fix: Generate event_id on the client and include it in the API payload; the server uses the same event_id. Deduplicate in the warehouse with event_id, keeping server as primary (sketch below).
- Rule: If both sources exist, keep the server row; otherwise keep the lone row.
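A minimal dedupe sketch for that rule, assuming a raw_events table like the one sketched earlier (names are illustrative):

```sql
-- One canonical add_to_cart row per event_id, preferring the server-side row.
WITH ranked AS (
    SELECT
        *,
        ROW_NUMBER() OVER (
            PARTITION BY event_id
            ORDER BY CASE WHEN source = 'server' THEN 0 ELSE 1 END, ingestion_ts
        ) AS rn
    FROM raw_events
    WHERE event_name = 'add_to_cart'
)
SELECT *
FROM ranked
WHERE rn = 1;
```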
Example 2 — Revenue mismatch
Issue: Client purchase revenue = 10.00; server = 9.99 due to rounding and discounts.
- Fix: Declare server as authoritative for monetary fields. Client still sends cart display values for UX analysis, but metrics use server fields.
- Warehouse rule: For purchase, pick server revenue when present; otherwise fall back to the client value and flag it with is_server_authoritative.
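One way to express that warehouse rule, sketched against the assumed raw_events table (at most one client row and one server row per event_id; boolean syntax varies by warehouse):

```sql
-- Prefer server revenue for purchases; fall back to the client value and flag it.
SELECT
    event_id,
    COALESCE(
        MAX(CASE WHEN source = 'server' THEN revenue END),
        MAX(CASE WHEN source = 'client' THEN revenue END)
    ) AS revenue,
    CASE WHEN COUNT(CASE WHEN source = 'server' THEN 1 END) > 0
         THEN TRUE ELSE FALSE END AS is_server_authoritative
FROM raw_events
WHERE event_name = 'purchase'
GROUP BY event_id;
```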
Example 3 — Login success counts differ
Issue: Client shows more successes than server. Cause: client fires on UI state change even if backend later rejects token refresh.
- Fix: Mark login_success as server-authoritative only; the client emits login_attempt. Add a join key to connect attempt and result.
- Monitoring: Track the success/attempt ratio; alert if it moves outside 0.9–1.1 versus baseline.
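A sketch of that monitor, assuming a deduplicated clean_events table (the table name is an assumption; the baseline comparison and alerting live outside this query):

```sql
-- Daily login_success per login_attempt; compare against baseline in the alert layer.
SELECT
    CAST(server_ts AS DATE) AS event_date,
    COUNT(CASE WHEN event_name = 'login_success' THEN 1 END) * 1.0
        / NULLIF(COUNT(CASE WHEN event_name = 'login_attempt' THEN 1 END), 0)
        AS success_per_attempt
FROM clean_events
WHERE event_name IN ('login_attempt', 'login_success')
GROUP BY CAST(server_ts AS DATE)
ORDER BY event_date;
```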
Implement both sides: step-by-step
1. List required fields, decide authoritative ownership per field, and define the dedupe rule.
2. Choose the origin (client or server) and pass event_id through requests/responses.
3. Emit client events with event_id, client_ts, context, and consent. Retry on failure with backoff.
4. On API success, emit the same event_id from the server and compute authoritative fields (revenue, tax).
5. Use at-least-once delivery; include retry_count and idempotency keys in ingestion.
6. Build a clean_events table with dedupe logic and source preference rules (see the sketch after this list).
7. Create ratios (client/server per event_name), latency distributions, and schema drift checks.
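For step 6, a sketch of the clean_events model under the same assumed raw_events schema (dialect features such as QUALIFY vary, so this uses a plain subquery):

```sql
-- clean_events: one canonical row per event_id, preferring the server source.
CREATE TABLE clean_events AS
SELECT
    event_id, event_name, source, user_id, device_id, session_id,
    client_ts, server_ts, schema_version, currency, revenue
FROM (
    SELECT
        r.*,
        ROW_NUMBER() OVER (
            PARTITION BY event_id
            ORDER BY CASE WHEN source = 'server' THEN 0 ELSE 1 END, ingestion_ts
        ) AS rn
    FROM raw_events AS r
) AS deduped
WHERE rn = 1;
```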
QA and monitoring
- Staging tests: trigger actions once; assert one canonical row post-dedupe.
- Ratios: daily client_count/server_count by event_name; alert if outside thresholds (query sketch after this list).
- Latency: track server_ts - client_ts to detect network issues.
- Drift: compare distributions of key properties by source.
- Data contracts: reject events missing required fields or with invalid types.
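A sketch of the ratio and latency checks, run on the pre-dedupe raw_events table; the epoch extraction for the lag is Postgres-style and will differ by dialect.

```sql
-- Daily client/server row ratio and average client-to-server lag per event_name.
SELECT
    CAST(server_ts AS DATE) AS event_date,
    event_name,
    COUNT(CASE WHEN source = 'client' THEN 1 END) * 1.0
        / NULLIF(COUNT(CASE WHEN source = 'server' THEN 1 END), 0) AS client_server_ratio,
    AVG(EXTRACT(EPOCH FROM (server_ts - client_ts))) AS avg_lag_seconds
FROM raw_events
GROUP BY CAST(server_ts AS DATE), event_name
ORDER BY event_date, event_name;
```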
Learning path
- Review event schemas you already use; list gaps vs the core fields above.
- Add event_id and authoritative field ownership to one critical flow (signup or purchase).
- Implement warehouse dedupe + source preference logic.
- Set up ratio monitoring and a basic alert.
- Extend the approach to other top-5 events.
Exercises
Complete these to practice.
Exercise 1 — Design an idempotency plan
You own checkout_started and purchase. Define how the same user action will carry the same event_id across client and server, specify authoritative fields, and write the warehouse dedupe rule.
- Deliverables: list of required fields, source ownership, and a short dedupe priority (server vs client).
Exercise 2 — Write the dedupe SQL
Given table raw_events(event_name, event_id, source, user_id, client_ts, server_ts, revenue, schema_version), write SQL to:
- Keep one row per event_id, preferring source = 'server'.
- For purchase, take revenue from the server row when present; otherwise use the client row and flag it.
Exercise checklist
- Declared a single shared event_id lifecycle across client and server.
- Marked authoritative fields (e.g., revenue, tax) and non-authoritative fields (e.g., screen path).
- Wrote a deterministic dedupe rule.
- Handled missing server rows gracefully.
Common mistakes and self-check
- No shared ID: If you cannot join client and server by event_id, expect permanent drift. Self-check: pick 20 recent purchases and verify joinable IDs.
- Clock skew issues: Using client_ts for hourly metrics causes spikes. Self-check: compare hour buckets by client_ts vs server_ts (see the sketch after this list).
- Rounding differences: Client revenue rounding differs from server tax logic. Self-check: compare revenue deltas by order_id and currency.
- Partial retries: Retries on client only. Self-check: measure event loss rates by network error code.
- Sampling mismatch: Client sampled, server not. Self-check: ensure a shared sampling key and rate.
- Consent drift: Client drops events on opt-out but server keeps them. Self-check: add consent_state and unify the policy.
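For the clock-skew self-check, a sketch comparing hourly purchase counts bucketed by client_ts versus server_ts (assumes the clean_events table from earlier; DATE_TRUNC syntax varies by dialect):

```sql
-- Hourly purchase counts by client clock vs server clock; large gaps suggest skew.
SELECT
    COALESCE(c.bucket, s.bucket) AS hour_bucket,
    c.cnt AS count_by_client_ts,
    s.cnt AS count_by_server_ts
FROM (
    SELECT DATE_TRUNC('hour', client_ts) AS bucket, COUNT(*) AS cnt
    FROM clean_events
    WHERE event_name = 'purchase'
    GROUP BY DATE_TRUNC('hour', client_ts)
) AS c
FULL OUTER JOIN (
    SELECT DATE_TRUNC('hour', server_ts) AS bucket, COUNT(*) AS cnt
    FROM clean_events
    WHERE event_name = 'purchase'
    GROUP BY DATE_TRUNC('hour', server_ts)
) AS s
    ON c.bucket = s.bucket
ORDER BY hour_bucket;
```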
Practical projects
- Project 1: Build a clean_events model with dedupe and source preference, plus a dashboard showing client/server ratios for top events.
- Project 2: Instrument a full checkout flow using a shared event_id, with automated QA that diff-checks payloads across sources.
- Project 3: Create a consistency SLO (e.g., purchase count skew within 5%) and an alerting notebook that explains deviations.
Mini challenge
Yesterday, purchase client/server ratio jumped from 1.02 to 1.18. Server code deployed a new discount rule. In 15 minutes, outline the 3 fastest checks to isolate the cause and one rollback-safe metric to report meanwhile.
Next steps
- Roll the shared schema to one more event family (auth or onboarding).
- Automate drift checks and publish a weekly consistency report.
- Document authoritative fields and share with engineering and product teams.