
Identifying Bottlenecks

Learn Identifying Bottlenecks for free with explanations, worked examples, exercises, and a quick test (for Business Analysts).

Published: December 20, 2025 | Updated: December 20, 2025

Who this is for

You are a Business Analyst mapping processes in BPMN and need to spot where work piles up, customers wait, and teams struggle. This subskill helps you find the true constraint and suggest practical fixes.

Prerequisites

  • Basic BPMN: tasks, events, gateways, lanes, message flows
  • Know cycle time, throughput, and work-in-progress (WIP)
  • Comfort with simple arithmetic for rates and capacities

Why this matters

Real tasks you will do:

  • Annotate BPMN with timings and volumes to find the slowest step
  • Explain why queues form and what to change first
  • Estimate impact (minutes saved, capacity gained) before stakeholders invest
  • Prioritize fixes across teams without local sub-optimizations

Concept explained simply

A bottleneck is the step where demand consistently exceeds capacity, so work waits. In BPMN, it often shows up at tasks after a merge, before a specialized role, or around rework loops and waiting events.

Mental model: Flow like water through pipes
  • Arrival rate: how fast water enters (items per hour)
  • Processing rate (capacity): how fast a pipe section passes water
  • If arrival rate ≥ capacity at a section, water backs up there (queue forms)
  • System speed is set by the narrowest pipe (the true bottleneck)

Helpful relationship (Little's Law): when the process is stable, WIP ≈ Throughput × Cycle Time. If WIP balloons at one step, either its throughput is too low or its variability is too high.
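A quick numeric sanity check of that relationship, as a minimal Python sketch (the numbers are illustrative, not taken from a real process):

  # Little's Law: WIP ≈ throughput × cycle time (holds for a stable process)
  throughput = 12     # items completed per hour at the step
  cycle_time = 0.75   # hours an average item spends at the step (waiting + handling)
  wip = throughput * cycle_time
  print(f"Expected WIP at this step: {wip:.1f} items")  # 9.0 items

If you routinely observe far more than about 9 items parked at that step, either throughput is lower than you think or variability is inflating the cycle time.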

How to spot bottlenecks in BPMN

  1. Scope clearly: Mark start and end events you are analyzing.
  2. Annotate the map: For each task, note average handle time, resources, hours of availability, arrival volume.
  3. Check capacity vs demand: Capacity ≈ (available hours × number of resources) ÷ handle time per item. Compare it to the arrival rate (see the sketch after this list).
  4. Check gateways: After a parallel split, each branch must have sufficient capacity. After merges, watch for queues before the next single lane.
  5. Find waiting and rework: Timers, message waits, and error loops often hide delays.
  6. Look at variability: High variability steps cause intermittent queues even if averages look okay. Prefer percentiles (e.g., P80) over only averages.
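Here is a minimal Python sketch of the capacity and utilization arithmetic from step 3; the task names, resource counts, and rates are hypothetical placeholders, not data from a real process:

  # Hourly capacity ≈ resources ÷ handle time; utilization ≈ demand ÷ capacity.
  tasks = {
      # name: (resources, handle_time_hours_per_item, demand_per_hour)
      "Intake":  (2, 0.10, 15),
      "Review":  (1, 0.25, 15),
      "Approve": (1, 0.50,  9),
  }

  for name, (resources, handle_time, demand) in tasks.items():
      capacity = resources / handle_time   # items the task can finish per hour
      utilization = demand / capacity      # above ~0.85 is risky; above 1.0 a queue must grow
      flag = "bottleneck risk" if utilization > 0.85 else "ok"
      print(f"{name:8} capacity {capacity:5.1f}/hr  utilization {utilization:4.2f}  {flag}")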
Clues in the diagram
  • Long queues before a single specialized task/lane
  • Rework loop used frequently (error boundary events)
  • Timer events that pause flow without parallelization
  • Multiple handoffs across lanes with no buffer policy

Worked examples

Example 1: Loan approval

Data per hour: 30 applications arrive. Tasks:

  • Data entry: 3 clerks × 0.08 h/app ≈ 37.5 cap/hr
  • Credit check (specialist): 1 analyst × 0.25 h/app = 4 cap/hr
  • Auto-decision gateway: 40% auto-approve; 60% to manual review
  • Manual review: 2 analysts × 0.5 h/app = 4 cap/hr

Flow: 30/hr arrive, but the credit check can process only 4/hr, so the queue builds in front of it and it throttles everything downstream. Manual review would see 60% of 30 = 18/hr if flow were unconstrained; its 4/hr capacity covers 4 ÷ 0.6 ≈ 6.7 total arrivals/hr, so it is the next constraint once the credit check is relieved. Primary bottleneck: credit check. Fixes: add credit-check analysts or automate low-risk checks; then tighten auto-decision rules to reduce manual load, add review analysts, or triage simple cases to a faster lane.
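To double-check that comparison, here is a minimal Python sketch using the figures above: each step can support at most its capacity divided by the share of arrivals that reach it.

  # Example 1: which step caps the whole process?
  steps = {
      # name: (capacity_per_hour, share_of_arrivals_reaching_it)
      "Data entry":    (37.5, 1.0),
      "Credit check":  (4.0,  1.0),
      "Manual review": (4.0,  0.6),   # only 60% of items need manual review
  }
  for name, (capacity, share) in steps.items():
      print(f"{name:14} supports up to {capacity / share:.1f} arrivals/hr")
  # Credit check supports only 4.0 arrivals/hr (the lowest), so it binds first;
  # manual review (about 6.7 arrivals/hr) becomes the constraint once credit check is fixed.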

Example 2: E-commerce returns

20 returns/hour. Steps:

  • Receive package: 2 workers × 0.1 h = 20 cap/hr
  • Quality inspection: 1 worker × 0.3 h ≈ 3.3 cap/hr
  • Refund processing (system): 60 cap/hr

Queue before inspection explodes because 20 > 3.3. Bottleneck: inspection. Fixes: parallelize inspection stations; pre-triage low-risk items to spot-check; standardize to cut handle time to 0.2 h (raising capacity).
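A small what-if sketch (Python) for the standardization fix, using the same figures:

  # Example 2: how much does cutting inspection handle time (0.3 h -> 0.2 h) buy?
  arrivals = 20                        # returns per hour
  for handle_time in (0.3, 0.2):
      capacity = 1 / handle_time       # a single inspector
      print(f"handle time {handle_time} h -> capacity {capacity:.1f}/hr, "
            f"utilization {arrivals / capacity:.1f}")
  # Capacity rises from ~3.3/hr to 5.0/hr, but utilization is still 4.0,
  # so faster handling alone is not enough -- combine it with triage or parallel stations.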

Example 3: IT service desk

Tickets: 12/hr. Steps:

  • Triage: 1 agent × 0.1 h = 10 cap/hr
  • Auto-solve (script): 5/hr siphoned off
  • Specialist queue: 2 specialists × 0.4 h = 5 cap/hr

Effective demand for specialists ≈ 12 − 5 = 7/hr against 5/hr of capacity → the queue grows there. (Triage is also slightly over capacity at 12/hr vs 10/hr, so it is worth watching next.) Bottleneck: specialists. Fixes: expand automation to 7/hr, or add 1 specialist (capacity rises to 7.5/hr), or introduce a swarming policy for spikes.
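A minimal what-if sketch (Python) comparing those two fixes with the figures above:

  # Example 3: compare the fixes for the specialist queue.
  tickets = 12                                   # arriving per hour
  scenarios = {
      "today":             {"auto_solved": 5, "specialists": 2},
      "expand automation": {"auto_solved": 7, "specialists": 2},
      "add a specialist":  {"auto_solved": 5, "specialists": 3},
  }
  for name, s in scenarios.items():
      demand = tickets - s["auto_solved"]        # tickets reaching specialists
      capacity = s["specialists"] / 0.4          # 0.4 h per ticket
      print(f"{name:17} demand {demand}/hr  capacity {capacity:.1f}/hr  "
            f"utilization {demand / capacity:.2f}")
  # Both options only just balance the step (utilization 1.00 and 0.93),
  # which is why a swarming policy for spikes is still worth having.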

Step-by-step playbook

  1. Map the current state BPMN with lanes and gateways.
  2. Attach data labels: arrival rate, handle time, % split at gateways, rework rate, resource counts, availability hours.
  3. Compute each task’s hourly capacity and utilization (util ≈ demand ÷ capacity). Flag util > 85% as risky.
  4. Trace token flow through splits/merges to estimate branch demand.
  5. Validate with observed queues and SLA breaches.
  6. Prioritize bottlenecks by impact (delay, cost) vs effort (people, policy, tooling).
  7. Draft countermeasures: eliminate, simplify, parallelize, automate, add capacity, reduce variability, change batch size.
  8. Simulate small changes on the map (what-if) and communicate expected gains (see the sketch below).
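A minimal Python sketch of that what-if step (it also previews practical project 3 below); the task names, handle times, and routing shares are hypothetical:

  # What-if: add one resource at the bottleneck and see where the constraint moves.
  tasks = {
      # name: [resources, handle_time_hours_per_item, share_of_arrivals_reaching_it]
      "Intake":  [2, 0.10, 1.0],
      "Review":  [1, 0.25, 1.0],
      "Approve": [1, 0.50, 0.6],
  }

  def bottleneck(tasks):
      # Max sustainable arrivals/hr per task = capacity / share of flow reaching it.
      limits = {name: (r / h) / share for name, (r, h, share) in tasks.items()}
      name = min(limits, key=limits.get)
      return name, limits[name]

  name, limit = bottleneck(tasks)
  print(f"Current bottleneck: {name} (system limited to {limit:.1f} arrivals/hr)")

  tasks[name][0] += 1   # what-if: one extra resource at the bottleneck
  name, limit = bottleneck(tasks)
  print(f"After the change:   {name} (system limited to {limit:.1f} arrivals/hr)")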
Countermeasure cheat sheet
  • Eliminate: remove non-value steps, redundant approvals
  • Simplify: templates, checklists, standard work
  • Parallelize: perform waits in parallel with prep tasks
  • Automate: scripts, rules engines, integrations
  • Add capacity: cross-train, temporary staffing, shift tweaks
  • Reduce variability: appointment windows, triage, SLAs
  • Batch smartly: small batches reduce waiting for joins

Common mistakes

  • Optimizing non-bottlenecks: speeding a fast step doesn’t fix the queue.
  • Using only averages: ignores spikes; check P80/P90 times when possible.
  • Ignoring merges: branches may be fine alone but jam at a merge before a single resource.
  • Misreading rework: frequent loops mean hidden demand; include it in capacity math.
  • Forgetting availability: lunch, meetings, outages reduce real capacity.
  • Counting touches not waits: the pain is usually in queues, not handle time.
Self-check
  • Can you point to the exact BPMN element where WIP accumulates?
  • Can you show demand vs capacity numbers for it?
  • If you remove that constraint, where would the next one appear?

Exercises

Work through each exercise here, then open the solution when you are ready.

Exercise 1: Spot the bottleneck from data

Given: 24 items/hour arrive.

  • Task A (2 clerks, 0.15 h each)
  • Exclusive gateway: 50% to Task B, 50% to Task C
  • Task B (1 specialist, 0.4 h)
  • Task C (1 specialist, 0.3 h)
  • Merge then Final Approval (1 manager, 0.5 h)

Question: Which element is the bottleneck and why? Suggest one fix.

Show solution

Capacities:

  • A: 2 × (1/0.15) ≈ 13.3/hr against 24/hr of demand → A queues unless arrivals are throttled upstream, so work already piles up at the entry point.
  • B demand: 12/hr, capacity 1/0.4 = 2.5/hr → queues badly.
  • C demand: 12/hr, capacity 1/0.3 ≈ 3.3/hr → also queues.
  • Final Approval demand: 24/hr, capacity 1/0.5 = 2/hr → absolute bottleneck.

Primary bottleneck: Final Approval (2/hr) since the system cannot exceed this rate. Fix: delegate standard cases, add another manager, or automate approvals under a threshold.

Exercise 2: Gateway-induced congestion

An exclusive gateway routes work evenly (about one third each) to three tasks D, E, F. Capacities are: D = 8/hr, E = 8/hr, F = 4/hr. Arrival is 12/hr. After the split, where does WIP build and what BPMN tweak helps without adding people?

Show solution

Branch demand ≈ 4/hr each. D and E handle it comfortably; F runs right at its limit, so any variability creates a queue in front of F, and items routed through F reach the downstream merge late. Tweak without adding people: change the gateway's routing conditions so F receives fewer items (e.g., ≈2/hr to F and ≈5/hr each to D and E), or standardize F's work to cut its handle time.


Practical projects

  1. Choose one live process (e.g., onboarding). Build a BPMN current state with data annotations. Identify the bottleneck and propose three fixes with rough ROI.
  2. Take a process with a timer wait. Redesign to parallelize prep work during the wait. Estimate cycle time reduction.
  3. Create a simple what-if analysis: show how adding 1 resource at the bottleneck changes throughput and where the next bottleneck appears.

Learning path

  • Before: BPMN basics → Data collection for processes
  • Now: Identifying Bottlenecks
  • Next: Prioritizing Improvements, Simulation/What-if, Designing Future State BPMN

Next steps

  • Run the quick test below to check understanding. Everyone can take it; only logged-in learners will see saved progress.
  • Apply one countermeasure in a small pilot and observe changes in WIP and cycle time.
  • Update your BPMN with learned parameters and note the new constraint.

Mini challenge

You have a process where arrival is 10/hr. After a gateway, 70% go to a task with 1 person at 0.5 h per item; 30% go to a task with 1 person at 0.25 h per item. There is a merge and then a final task at 0.33 h per item with 1 person. Identify the bottleneck and one no-cost change to ease it.

Suggested answer

Capacities: Branch A faces 7/hr of demand against 1/0.5 = 2/hr of capacity → it queues. Branch B faces 3/hr against 1/0.25 = 4/hr → fine. The final task has ≈3/hr of capacity against 10/hr of nominal demand, but Branch A is the tighter constraint: it caps sustainable arrivals at 2 ÷ 0.7 ≈ 2.9/hr versus ≈3/hr for the final task. Primary bottleneck: Branch A, with the final task next in line. No-cost change: route more items to Branch B if the business rules allow, or tighten the routing criteria so fewer items need Branch A.
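A minimal sketch of the arithmetic behind that answer (Python, using the figures from the challenge):

  # Mini challenge: which element limits sustainable arrivals per hour?
  elements = {
      # name: (capacity_per_hour, share_of_arrivals_reaching_it)
      "Branch A":   (1 / 0.50, 0.7),
      "Branch B":   (1 / 0.25, 0.3),
      "Final task": (1 / 0.33, 1.0),
  }
  for name, (capacity, share) in elements.items():
      print(f"{name:10} supports up to {capacity / share:.1f} arrivals/hr")
  # Branch A supports about 2.9 arrivals/hr, the final task about 3.0, Branch B about 13.3,
  # so Branch A is the primary bottleneck and the final task is next in line.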


Identifying Bottlenecks — Quick Test

Test your knowledge with 8 questions. Pass with 70% or higher.

