Hands-on: minimal step plan
- Define target: SLOs, expected/peak RPS, endpoints mix.
- Choose model: Open (arrival rate) or Closed (users + think time).
- Prepare data: fixtures, idempotency, warm-up strategy.
- Set profile: ramp-up, steady duration, ramp-down.
- Define pass/fail criteria: percentiles, error rate, resource headroom.
- Enable observability: request traces, metrics, logs by test label.
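The steps above can be captured as a single plan object that travels with the test run. This is a sketch in plain Python; the field names and numbers are illustrative, not tied to any particular load-testing tool:

```python
# A minimal load-test plan as data. All values here are example placeholders.
load_test_plan = {
    "target": {"slo_p95_ms": 250, "steady_rps": 300, "peak_rps": 450},
    "workload_model": "open",  # arrival-rate driven, not user-count driven
    "profile": {"ramp_up_s": 180, "steady_s": 600, "ramp_down_s": 120},
    "pass_fail": {"max_p95_ms": 250, "max_error_rate": 0.01},
    "abort": {"p99_ms": 2000, "error_rate": 0.05},  # stop the run if exceeded
    "observability": {"run_label": "loadtest-baseline"},  # tag traces/metrics/logs
}

total_s = sum(load_test_plan["profile"].values())
print(total_s)  # 900 seconds = 15 minutes end to end
```

Keeping the plan as data makes it easy to version-control alongside the test scripts and to diff between runs.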
Exercises
Exercise 1: Concurrency and RPS sanity check
Service SLO: p95 ≤ 250 ms at 300 RPS. The recent average latency is 120 ms. Calculate the expected concurrency and propose a ramp-up profile for a 15-minute load test.
When done, compare with the solution below.
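One way to sanity-check your Exercise 1 answer is Little's Law, L = λ × W (in-flight requests = arrival rate × time in system). The ramp split below is one reasonable choice for a 15-minute run, not the only valid one:

```python
# Little's Law: in-flight requests L = arrival rate (req/s) x time in system (s).
target_rps = 300
avg_latency_s = 0.120  # 120 ms recent average
expected_concurrency = target_rps * avg_latency_s
print(expected_concurrency)  # 36.0 requests in flight on average

# One possible 15-minute profile (an assumed split, not the only valid one):
profile = [
    ("ramp_up",   180, "0 -> 300 RPS"),  # 3 min: warm caches, JITs, pools
    ("steady",    600, "300 RPS"),       # 10 min: the measurement window
    ("ramp_down", 120, "300 -> 0 RPS"),  # 2 min: graceful drain
]
print(sum(seconds for _, seconds, _ in profile))  # 900 seconds total
```

Note that 36 in-flight requests is an average; provision virtual users or connections with headroom above it to absorb latency spikes.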
Exercise 2: Draft a minimal load test plan
CRUD API with endpoints: GET 70%, POST 20%, PUT 5%, DELETE 5%. Expected steady load 400 RPS, peak 650 RPS. Draft: test type, workload model, target RPS, ramp-up, steady time, data setup, success/abort criteria.
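For the endpoint mix in Exercise 2, weighted random selection keeps the generated traffic close to the stated proportions. A stdlib-only sketch, with no specific load tool assumed:

```python
import random

# Endpoint mix from the exercise: GET 70%, POST 20%, PUT 5%, DELETE 5%.
MIX = [("GET", 70), ("POST", 20), ("PUT", 5), ("DELETE", 5)]

def pick_method(rng: random.Random) -> str:
    """Pick an HTTP method according to the weighted mix."""
    methods, weights = zip(*MIX)
    return rng.choices(methods, weights=weights, k=1)[0]

rng = random.Random(42)  # fixed seed so runs are reproducible
sample = [pick_method(rng) for _ in range(10_000)]
print(sample.count("GET") / len(sample))  # roughly 0.70
```

Most load tools express the same idea as per-scenario weights; the point is that the mix is declared once and checked, not hard-coded into individual requests.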
Checklist before running
- Test data prepared, idempotent where needed
- Observability labels/tags set for this run
- Warm-up period planned
- Pass/fail and abort criteria written
- Environment close to production
- Dependencies (DB/cache) capacity and limits known
Common mistakes and self-checks
- Mistake: Using only average latency. Self-check: Always record p95/p99 and percentile histograms.
- Mistake: Unrealistic data (tiny payloads, no cache churn). Self-check: Match payload sizes and request mix to reality.
- Mistake: No warm-up. Self-check: Add 2–5 minutes ramp and discard warm-up metrics.
- Mistake: Ignoring dependencies. Self-check: Monitor DB/cache/queue metrics alongside service.
- Mistake: Closed model when traffic is open. Self-check: Choose model that reflects real arrivals.
- Mistake: No abort criteria. Self-check: Define error/latency thresholds to stop runs safely.
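The first two self-checks above can be automated. This sketch discards warm-up samples and reports p95/p99 using only the standard library; the latency data is synthetic and purely illustrative:

```python
from statistics import quantiles

def summarize(latencies_ms, warmup_count):
    """Drop warm-up samples, then report p95/p99 from the steady-state remainder."""
    steady = latencies_ms[warmup_count:]
    cuts = quantiles(steady, n=100)  # 99 cut points: cuts[94] ~ p95, cuts[98] ~ p99
    return {"p95": cuts[94], "p99": cuts[98], "samples": len(steady)}

# Synthetic data: 50 slow warm-up requests, then steady traffic around 100-149 ms.
data = [400.0] * 50 + [100.0 + (i % 50) for i in range(1000)]
result = summarize(data, warmup_count=50)
print(result)
```

Including the warm-up samples here would inflate the percentiles and mask the steady-state behavior, which is exactly the mistake the self-check guards against.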
Practical projects
- Capacity card: Document safe RPS, p95, and resource headroom for one service. Update after each release.
- Spike safety net: Create a repeatable spike test profile for a key endpoint with clear pass/fail conditions.
- Soak sentinel: Run a 4-hour soak in staging and chart memory/FD growth and GC behavior.
Learning path
- Start: This lesson — concepts, models, and plans.
- Next: Advanced workload modeling (think times, distributions, burstiness).
- Later: Automated performance gates in CI and regression detection.
Mini challenge
Your service meets SLO at 400 RPS, but at 500 RPS p99 latency spikes and DB CPU hits 90%. Propose two changes: one at app level (e.g., connection pooling, caching, batching) and one at DB level (e.g., index, read-replica, pool size). State how you’d validate with a test plan.
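However you answer the mini challenge, the validation step can be made mechanical: re-run the same profile after each change and compare against explicit targets. All numbers below are hypothetical:

```python
# Hypothetical before/after results for the mini challenge at 500 RPS.
baseline = {"rps": 500, "p99_ms": 1800, "db_cpu_pct": 90}
target   = {"rps": 500, "p99_ms": 400,  "db_cpu_pct": 75}

def change_validated(result: dict, target: dict) -> bool:
    """A change passes only if it meets both the latency and DB headroom targets."""
    return (result["p99_ms"] <= target["p99_ms"]
            and result["db_cpu_pct"] <= target["db_cpu_pct"])

after_change = {"rps": 500, "p99_ms": 350, "db_cpu_pct": 70}  # hypothetical re-run
print(change_validated(after_change, target))  # True
```

Applying one change at a time and re-running the identical profile is what lets you attribute the improvement to the app-level or DB-level change rather than to noise.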
Next steps
- Pick one project above and run a small load test against a staging-like environment.
- Record a baseline: p50/p95/p99, RPS, error rate, CPU, memory, DB latency.
- Share findings with your team and agree on pass/fail criteria for future releases.