Detecting Missing And Broken Events

Published: December 22, 2025 | Updated: December 22, 2025

Why this matters

As a Product Analyst, your insights are only as good as the data behind them. Missing or broken events quietly distort funnels, A/B tests, retention, and revenue attribution. Detecting issues early prevents weeks of misleading dashboards and wrong decisions.

  • Find the real cause of sudden funnel drops
  • Catch schema changes that break dashboards
  • Stop double-counting or sample bias before it spreads
  • Give engineering clear, actionable fixes

Who this is for

  • Product Analysts and Data Analysts working with event data
  • Product Managers who own KPI dashboards
  • Engineers instrumenting product analytics

Prerequisites

  • Basics of event tracking: event name, timestamp, user/session IDs, properties
  • Comfort with funnels and cohort definitions
  • Ability to run simple queries or use your analytics tool’s charts

Concept explained simply

Think of your event stream like a heartbeat monitor for your product. When events go missing, the heart skips beats. When events are broken, the signal is noisy or mislabeled.

  • Missing events: expected events stop arriving (volume collapses, gaps in time)
  • Broken events: events arrive but are wrong (name changed, properties missing, duplicates, wrong timestamp)

Mental model: Volume, Shape, Flow

  • Volume: how many events per time unit or per active user
  • Shape: schema correctness (properties exist, types/values valid)
  • Flow: relationships between events (ratios, funnels, cause–effect)

Detect by watching all three. If any axis changes suddenly without a plausible product reason, investigate.

Core detection signals

  • Volume anomalies: sharp drop or spike versus a 7-day or 28-day baseline
  • User-normalized ratios: events per active user/device
  • Schema validity: required properties present and typed correctly
  • Topology ratios: downstream-to-upstream event ratio (e.g., CheckoutCompleted / AddToCart)
  • Freshness: events arriving too late or with future timestamps
  • Duplicates: unusually high events per user per minute
Quick checks
  • Day-over-day change beyond threshold (e.g., 30–50%)
  • Property null rate drift (e.g., >5–10% increase)
  • Version change detected in payload (event_version, sdk_version)
  • Time skew: created_at vs received_at difference
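
The volume and user-normalized checks above can be sketched in a few lines of Python. This is a minimal illustration, not a production detector: the function names, the 30% default threshold, and the sample numbers are assumptions chosen to mirror the ranges mentioned in the quick checks.

```python
from statistics import median

def per_user_rate(event_count, active_users):
    """Events per active user; 0.0 when there are no active users."""
    return event_count / active_users if active_users else 0.0

def volume_anomaly(history, today, threshold=0.3):
    """Compare today's per-user rate with the median of the baseline days.

    history: list of (event_count, active_users) tuples, one per baseline day.
    today:   (event_count, active_users) for the day being checked.
    Returns (relative_change, is_anomaly).
    """
    baseline = median(per_user_rate(c, u) for c, u in history)
    if baseline == 0:
        return 0.0, False  # nothing to compare against
    change = (per_user_rate(*today) - baseline) / baseline
    return change, abs(change) > threshold

# Hypothetical numbers: seven stable baseline days, then a sharp drop
# in events while active users barely move.
history = [(9000, 10000)] * 7
change, flagged = volume_anomaly(history, (2940, 9800))
print(f"{change:+.0%} vs baseline, anomaly={flagged}")
```

Normalizing by active users before comparing keeps an ordinary traffic dip from triggering the check, so a flag points at tracking or product behavior rather than audience size.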

Worked examples

Example 1: Sudden drop in AddToCart

Signal: AddToCart events down 65% day-over-day. Active users only down 2%.

  • Check ratio: AddToCart per active user fell from 0.9 to 0.3
  • Upstream event (ProductViewed) steady; downstream (CheckoutStarted) also down
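  • Conclusion: Missing event, likely SDK or feature-gating change; because CheckoutStarted fell too, also confirm the add-to-cart action itself still works for users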
  • Action: Confirm recent release; verify SDK init on product pages; request hotfix
What would prove it?
  • Staging environment still emits AddToCart
  • Only Web platform affected; Mobile stable
  • Release notes show deprecated click handler
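
One way to test the "only Web affected" hypothesis is to compute the downstream-to-upstream ratio per platform. The sketch below assumes each event is a dict with hypothetical `name` and `platform` fields; the sample counts are invented for illustration.

```python
from collections import defaultdict

def ratio_by_segment(events, downstream, upstream, key="platform"):
    """Return {segment: downstream_count / upstream_count}.

    The ratio is None for a segment with no upstream events.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for e in events:
        counts[e[key]][e["name"]] += 1
    return {
        seg: (c[downstream] / c[upstream] if c[upstream] else None)
        for seg, c in counts.items()
    }

# Hypothetical sample: web emits far fewer AddToCart events per view than iOS.
events = (
    [{"name": "ProductViewed", "platform": "web"}] * 10
    + [{"name": "AddToCart", "platform": "web"}] * 1
    + [{"name": "ProductViewed", "platform": "ios"}] * 10
    + [{"name": "AddToCart", "platform": "ios"}] * 4
)
ratios = ratio_by_segment(events, "AddToCart", "ProductViewed")
print(ratios)
```

A ratio that collapses in one segment while the others hold steady points at a platform-specific release rather than a change in user behavior.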

Example 2: SignupCompleted missing plan_id

Signal: Revenue by plan dashboard flatlines for new users; total signups normal.

  • Null rate for plan_id jumped from 1% to 60%
  • event_version updated to v3 yesterday
  • Conclusion: Broken event schema (missing property), not missing event
  • Action: Ask engineering to re-add plan_id or map new property planTier to plan_id
Self-check
  • Verify plan distribution pre-change vs post-change
  • Ensure downstream billing table still has plan
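
A null-rate breakdown by event_version makes this kind of schema break visible quickly. The following is a rough sketch that assumes events are dicts with an `event_version` field and a nested `properties` dict; that layout and the sample data are assumptions, not a specific tool's format.

```python
from collections import defaultdict

def null_rate_by(events, prop, group_key="event_version"):
    """Share of events per group where `prop` is missing or None."""
    total = defaultdict(int)
    missing = defaultdict(int)
    for e in events:
        group = e.get(group_key)
        total[group] += 1
        if e.get("properties", {}).get(prop) is None:
            missing[group] += 1
    return {g: missing[g] / total[g] for g in total}

# Hypothetical sample: v3 renamed plan_id to planTier for most events.
events = (
    [{"event_version": "v2", "properties": {"plan_id": "pro"}}] * 4
    + [{"event_version": "v3", "properties": {"planTier": "pro"}}] * 3
    + [{"event_version": "v3", "properties": {"plan_id": "basic"}}] * 2
)
rates = null_rate_by(events, "plan_id")
print(rates)
```

Grouping by version separates "the property was always sparse" from "the property disappeared with a specific release".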

Example 3: Double-counted Purchase

Signal: Tracked revenue is 30% higher than the payment processor reports; Purchase events per user spiked.

  • Duplicate bursts within 2 seconds for same user/order_id
  • Autotrack + manual track both firing on same button
  • Conclusion: Broken event (duplicate semantics)
  • Action: Add idempotency key (order_id), dedupe rule, or disable autotrack on that element
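
A dedupe rule keyed on order_id could look like the sketch below. It assumes each Purchase event carries hypothetical `user_id`, `order_id`, `amount`, and `ts` (epoch seconds) fields, and it uses a 300-second window as an illustrative default.

```python
def dedupe_purchases(events, window_seconds=300):
    """Keep the first event per (user_id, order_id).

    A repeat is dropped when it arrives within `window_seconds` of the
    last kept event with the same key.
    """
    last_kept = {}
    kept = []
    for e in sorted(events, key=lambda ev: ev["ts"]):
        key = (e["user_id"], e["order_id"])
        prev_ts = last_kept.get(key)
        if prev_ts is not None and e["ts"] - prev_ts <= window_seconds:
            continue  # duplicate of an event we already kept
        last_kept[key] = e["ts"]
        kept.append(e)
    return kept

# Hypothetical burst: autotrack and manual track both fire for order o1.
events = [
    {"user_id": "u1", "order_id": "o1", "amount": 50, "ts": 100.0},
    {"user_id": "u1", "order_id": "o1", "amount": 50, "ts": 101.5},
    {"user_id": "u1", "order_id": "o2", "amount": 20, "ts": 102.0},
]
kept = dedupe_purchases(events)
print(len(kept), sum(e["amount"] for e in kept))
```

Deduplicating at query time only hides the symptom; the instrumentation fix (a single tracking call per purchase) is still the durable remedy.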

Step-by-step detection playbook

  1. Scan volume and ratios
    • Events per active user/device
    • Downstream/Upstream ratios (e.g., Purchase / CheckoutStarted)
  2. Check schema shape
    • Required properties present
    • Type/enum validity; null rate changes
  3. Segment by dimension
    • Platform, app version, region, release channel
  4. Time sanity
    • Late arrivals, future timestamps, timezone shifts
  5. Duplicates and spikes
    • Per-user-per-minute caps; identical payloads
  6. Trace to change
    • Recent deployments, feature flags, SDK updates
Fast 30-minute triage
  1. Compare today vs last 7-day median
  2. Normalize by active users
  3. Check the null rate of two key properties
  4. Split by platform/app version
  5. Sample 20 raw events for eyeballing
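
Step 4 of the playbook (time sanity) can be approximated by comparing the client timestamp with the server receipt time. In this sketch the `created_at` and `received_at` names follow the quick checks above, while the 24-hour lateness limit and 5-minute clock tolerance are assumed values to tune for your pipeline.

```python
from datetime import datetime, timedelta, timezone

def time_issue(created_at, received_at,
               max_lag=timedelta(hours=24),
               tolerance=timedelta(minutes=5)):
    """Classify an event as 'future', 'late', or 'ok' from its time skew."""
    skew = received_at - created_at
    if skew < -tolerance:
        return "future"  # client clock is ahead of the server
    if skew > max_lag:
        return "late"    # event arrived long after it was created
    return "ok"

# Hypothetical timestamps, all timezone-aware and in UTC.
received = datetime(2025, 12, 22, 12, 0, tzinfo=timezone.utc)
print(time_issue(received + timedelta(hours=1), received))
print(time_issue(received - timedelta(days=3), received))
print(time_issue(received - timedelta(seconds=2), received))
```

Using timezone-aware datetimes throughout avoids the silent offset errors that naive local times introduce.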

Instrumentation verification checklist

  • Event fires exactly once per user action
  • Required properties always present
  • Stable event name and casing
  • Event version tracked on schema change
  • User/session IDs present and consistent
  • Timestamps in UTC; no future times
  • Backoff/retry configured for network failures
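
Parts of this checklist can be automated with a small contract check. The sketch below covers only required properties, their types, and the presence of event_version; the contract uses the SignupCompleted properties from this lesson, with string types and the event dict layout as assumptions.

```python
# Required properties for SignupCompleted; the str types are assumed.
SIGNUP_CONTRACT = {"user_id": str, "plan_id": str, "source": str}

def validate_event(event, contract):
    """Return a list of contract violations; an empty list means the event passes."""
    problems = []
    props = event.get("properties", {})
    for name, expected_type in contract.items():
        value = props.get(name)
        if value is None:
            problems.append(f"missing required property: {name}")
        elif not isinstance(value, expected_type):
            problems.append(f"wrong type for {name}: {type(value).__name__}")
    if "event_version" not in event:
        problems.append("missing event_version")
    return problems

# Hypothetical payloads: one valid, one with a renamed and a mistyped property.
good = {"event_version": "v3",
        "properties": {"user_id": "u1", "plan_id": "pro", "source": "ads"}}
bad = {"properties": {"user_id": 42, "planTier": "pro", "source": "ads"}}
print(validate_event(good, SIGNUP_CONTRACT))
print(validate_event(bad, SIGNUP_CONTRACT))
```

Running a check like this against a sample of raw events after each release catches shape problems before they reach dashboards.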

Common mistakes and how to self-check

  • Watching raw volume only
    • Self-check: Always compute events per active user
  • Assuming marketing seasonality explains drops
    • Self-check: Compare the event's ratio to its upstream event; seasonality moves both together, so the ratio should stay roughly stable
  • Ignoring null rates
    • Self-check: Track property presence as a KPI
  • Not segmenting by app version
    • Self-check: Build a version breakdown chart
  • Fixing dashboards instead of data
    • Self-check: Validate raw payload before patching charts

Practical projects

  • Build an anomaly view: for 5 key events, chart 7-day rolling ratio to upstream and null rates for 3 required properties
  • Create a schema contract: define required properties and types; include event_version; mock alarms for >10% null increase
  • Implement dedupe logic: design an idempotency rule using order_id within a 5-minute window

Exercises (do these before the quick test)

  • Exercise 1: Find a missing event via ratio analysis (see below)
  • Exercise 2: Diagnose a broken schema via null rates (see below)
Exercise 1 data and hints

You observe the following yesterday vs 7-day median:

  • Active users: -3%
  • ProductViewed: -4%
  • AddToCart: -52%
  • CheckoutStarted: -49%

Task: Determine whether AddToCart is missing or broken. Specify two additional checks to confirm, and name one likely root cause.

Exercise 2 data and hints

SignupCompleted has required properties: user_id, plan_id, source. Yesterday property presence:

  • user_id: 100%
  • plan_id: 42% (was 99% last week)
  • source: 99%

Task: Is this missing or broken? Propose a remediation and an alert threshold.

Mini challenge

Your funnel is ProductViewed → AddToCart → CheckoutStarted → Purchase. Today, Purchase is flat, CheckoutStarted down 20%, AddToCart down 3%, ProductViewed flat. What’s your top hypothesis and first query?

Possible approach

Hypothesis: CheckoutStarted event broken or gated by a release. First query: CheckoutStarted per active user by app_version/platform; check null rate for cart_id on CheckoutStarted.

Learning path

  • Before: Event design and naming; consistent IDs
  • This lesson: Detect missing and broken events
  • Next: Alerting and data contracts; rollout validation and canary checks

Next steps

  • Set baselines and thresholds for your top 10 events
  • Add an event_version property where missing
  • Schedule a weekly 15-minute data health review
