Why this matters
BI Developers turn vague stakeholder requests into precise, testable logic. Filters and segments drive what rows enter a query and which audiences or cohorts are compared. If you misinterpret them, numbers become untrustworthy and decisions go the wrong way.
- Real tasks you will face: define "active user" for a weekly dashboard, exclude internal traffic, build a reusable "high-value customers" segment, and specify time windows like "MTD" or "last full quarter".
- Clear filter/segment specs reduce rework, align teams, and make dashboards consistent across tools.
Note: The quick test is available to everyone. Progress is saved only if you are logged in.
Concept explained simply
• Filter: a rule that includes or excludes records for a query or visualization (e.g., country = "DE", date between Jan 1–31).
• Segment: a labeled, reusable group defined by logic (e.g., "High LTV customers" or "New users last 30 days"). A segment can be applied as a filter later.
Mental model
Think of a funnel with three gates:
- Row-level gates (dimension filters): choose which raw rows enter (e.g., event_name = "purchase").
- Aggregation: compute metrics (e.g., SUM(revenue) by region).
- Post-aggregation gates (measure filters): keep or drop groups based on metrics (e.g., only regions where SUM(revenue) > 10,000).
Common clarifications you must confirm
- Time: fixed vs relative dates, time zone, calendar type (calendar month vs fiscal month).
- Logic: AND vs OR between rules, inclusion vs exclusion, null handling.
- Levels: row-level filter vs post-aggregation filter (HAVING), and which entity level (order, user, session).
- Definitions: exact meaning of "active", "new", "test traffic", and list membership (e.g., what counts as EU?).
- Reusability: one-off filter vs named segment others can reuse.
Worked examples
Example 1: "Active subscribers in EU last quarter"
Ambiguity: What is "active"? Which countries are in EU? What is "last quarter" and time zone?
Resulting spec (sample)
- Date filter: subscription_payment_date in last full calendar quarter (UTC).
- Geo filter: billing_country in {AT, BE, BG, HR, CY, CZ, DE, DK, EE, ES, FI, FR, GR, HU, IE, IT, LT, LU, LV, MT, NL, PL, PT, RO, SE, SI, SK}.
- Active definition: user_id with COUNT(successful_subscription_payments) >= 1 during the quarter.
- Aggregation level: distinct user_id count.
Example 2: "Exclude internal traffic"
Ambiguity: Which IPs/domains? Logged-in staff only? What about staging environments?
Resulting spec (sample)
- Filter: exclude where ip_in_internal_ranges = true OR email_domain = "company.com" OR environment = "staging".
- Apply at: row-level (events) before aggregations.
Example 3: "High-value customers" segment
Ambiguity: Threshold, currency, period, refunds handling.
Resulting spec (sample)
- Segment logic: include user_id where lifetime_net_revenue_usd >= 500.
- Refund handling: exclude refunded amounts from revenue at transaction date.
- Nulls: if no transactions, user not included.
Steps to clarify any filter/segment
- Pin the time window
- Name the entity and level: user, session, order, product
- Separate row-level vs post-aggregation filters
- Write inclusion/exclusion logic with AND/OR
- Specify null handling
- Lock the lists: regions, channels, statuses
- Decide reusability: ad-hoc filter or named segment
- Write acceptance checks: small samples you can verify
Exercises and practice checklist
Do these exercises here and in the Exercises panel below.
Exercise 1: Define filters + a segment
Request: "Show active subscribers in EU last quarter and create a 'high-risk churners' segment for future use." Produce a written, testable spec with all clarifications.
Exercise 2: Post-aggregation vs row filter
Request: "Top 10 products by revenue last 30 days, excluding refunds, but only count orders with AOV > 50." Specify exactly how to filter and rank.
Practice checklist
Common mistakes and self-check
- Mistake: Using "last month" but including the current partial month. Self-check: Did you say "last full calendar month"?
- Mistake: Mixing OR and AND implicitly. Self-check: Did you add parentheses to group logic?
- Mistake: Filtering on aggregated metrics as if they were row-level. Self-check: Does your BI tool support post-aggregation (HAVING) for that visual?
- Mistake: Ignoring time zone misalignment. Self-check: Are all timestamps converted to the reporting zone?
- Mistake: Forgetting nulls. Self-check: What happens to rows with null country/channel?
- Mistake: Unversioned reusable logic. Self-check: Did you add a version suffix to named segments?
Practical projects
- Define and implement a "New customers" segment with 3 alternative definitions; compare outputs and document the chosen one.
- Build a dashboard tile using a measure filter (post-aggregation) and document how it differs from a row-level equivalent.
- Create an "Internal traffic" exclusion rule that combines IP ranges, user roles, and environments; validate with sample data.
Mini challenge
Draft a one-paragraph spec for "Weekly active users by region — last 8 weeks, exclude bots" that includes: time window definition, entity level, bot exclusion rules, and whether regions with zero users should appear.
Who this is for
BI Developers, Analytics Engineers, and Data/BI Analysts who translate business questions into reliable dashboards and datasets.
Prerequisites
- Basic SQL (WHERE vs HAVING, GROUP BY)
- Familiarity with your BI tool's filter types and time functions
- Understanding of core business entities (users, orders, sessions)
Learning path
- Clarify time and entity levels consistently
- Master row-level vs post-aggregation filters
- Define and version reusable segments
- Practice acceptance checks and edge cases
Next steps
- Complete the exercises below
- Take the quick test to confirm understanding
- Apply the checklist to your next stakeholder request