Why this matters
Real BI Developer tasks that benefit from folding:
- Filtering fact tables to the last 12–24 months before any non-foldable steps.
- Merging large dimension tables on the source side instead of in memory.
- Aggregating (GROUP BY, SUM, COUNT) inside the database instead of on your desktop/server.
- Parameterizing date ranges so only needed rows are queried.
Concept explained simply
Query folding means your step-by-step transformations are translated into a single native query at the source (for example SQL). If a step can be translated, it "folds." If not, folding breaks and later steps typically run locally.
Mental model
Imagine a conveyor belt. Foldable steps push work to the factory (database) at the start of the belt. A non-foldable step places a roadblock: everything after it is done by hand on your side. Keep the roadblock as late as possible or avoid it.
How to tell if folding is happening (typical signs)
- Fold indicators or a "view native query" option usually appear when folding is intact. If indicators disappear or the native query is disabled, folding is likely broken.
- Moving a non-foldable step further down often restores folding above it.
Exact UI varies by tool, but the principle is the same.
What commonly folds vs breaks
Often folds (source/connector dependent)
- Select/Remove columns
- Filter rows (comparisons, ranges, In/Not In via joins)
- Joins/Merges on keys
- Aggregations (Group By with sum, count, min, max, average)
- Simple sorts (if supported by source)
- Type changes that map to source types
Often breaks folding
- Row-by-row custom functions or complex custom columns
- Index columns generated locally
- Buffering entire tables (e.g., buffering/force materialization)
- R/Python steps
- Operations not supported by the connector or source (some text parsing, complex date logic)
- Converting to list or performing steps that remove tabular context
Note: Exact folding capabilities depend on the data source and connector. When in doubt, push filters/joins/aggregations as early as possible and move non-foldable logic as late as possible.
Worked examples
Example 1: Filter early, compute late
Goal: Load Orders from a SQL source, last 12 months only, then add a complex custom column.
Better order (folds most work):
1) Source (Orders)
2) Changed Type (supported types)
3) Filter Rows: OrderDate >= Today - 365 ← folds
4) Remove Columns you don’t need ← folds
5) Group By to aggregate if needed ← folds
6) Add Custom Column (complex) ← non-foldable but now on far fewer rowsResult: The source returns only needed rows; the custom column runs on a small set.
Example 2: Join without breaking folding
Goal: Join Orders and Customers on CustomerID.
Do NOT add an Index Column before the join (often breaks folding).
Do this instead:
1) Source (Orders), Source (Customers)
2) Select only required columns
3) Filter rows to needed date range
4) Merge on CustomerID (folds)
5) Expand needed fields
6) Add Index Column (if really needed) lastResult: The join and filters happen in the database; only the final lightweight step is local.
Example 3: Text transformations on large tables
Goal: Compute Text.Length(ProductName) on Sales.
Approach A (slow): Add Text.Length early on millions of rows → breaks folding → large data pulled locally.
Approach B (better): Filter and aggregate first (fold), or join to Products first (fold), then compute Text.Length after reducing rows.Who this is for
- BI Developers who build refreshable models and reports.
- Anyone using Power Query–style transformations against SQL or similar sources.
Prerequisites
- Basic familiarity with importing data and applying query steps.
- Comfort with filtering, joining, and grouping data.
Learning path
- Understand what query folding is and why it matters.
- Learn which steps usually fold and which often break it.
- Practice ordering steps to keep folding intact.
- Add non-foldable steps only after reducing data size.
- Review performance and iterate.
Hands-on exercises
Do these now. They mirror the exercises below and prepare you for the quick test.
Exercise 1 (detect and fix)
Given the step order: Source → Changed Type → Added Index → Filtered Rows → Removed Columns → Grouped Rows. Rearrange to preserve folding for filter/join/aggregation operations and explain why.
- [ ] Write your optimized order.
- [ ] Mark which steps fold.
- [ ] Note why the index should be last.
Exercise 2 (fold-friendly join)
Tables: Sales and Products in a SQL source. You need: current-year SalesAmount by Category. Steps available: Filter Sales to current year, Select columns, Merge with Products on ProductID, Expand Category, Group and Sum, Add Text.Length(ProductName) as a helper (non-essential). Place the Text.Length step to avoid breaking folding prematurely.
- [ ] Write your step order.
- [ ] Identify the first non-foldable step, if any.
- [ ] Explain how your ordering reduces data early.
Folding-aware build checklist
- [ ] Push filters up: date/keys first.
- [ ] Select only necessary columns before joins.
- [ ] Do joins and group-bys before custom row-by-row logic.
- [ ] Add indexes or custom functions last.
- [ ] If folding breaks, try moving the breaking step later or branching from a staged, foldable query.
Common mistakes and how to self-check
- Mistake: Adding Index or complex custom columns early. Fix: Move them to the end.
- Mistake: Performing text parsing before filtering. Fix: Filter rows and columns first.
- Mistake: Assuming a step folds because it’s simple. Fix: Check folding indicators or try reordering to confirm.
- Mistake: Sorting huge tables when the source can’t push down sort. Fix: Sort only if needed and after reducing rows, or rely on source-side ordering if supported.
- Mistake: Buffering entire tables. Fix: Avoid buffering unless you have a specific reason and small data.
Practical projects
- Build a staging query that only filters and selects columns (all foldable), then reference it for final modeling steps.
- Parameterize a date range so refresh retrieves only recent data, and confirm folding remains intact after parameterization.
- Take a slow query, identify the first non-foldable step, move it later, and measure refresh improvement.
Next steps
- Apply folding-aware ordering in your current BI project.
- Review other subskills in Performance Optimization (caching, incremental refresh, model tuning).
- Keep a personal checklist of fold-friendly steps for your common sources.
Mini challenge
You have: Orders (50M rows), Products (100k rows). Need total SalesAmount for the last 6 months by Brand and Category, plus a custom SKU hash column. Write a step order that keeps folding for filters/joins/aggregations and adds the SKU hash only after aggregation reduces rows.
Quick Test
Note: The quick test is available to everyone; only logged-in users have their progress saved.