Hands-on: minimal step plan
- Define target: SLOs, expected/peak RPS, endpoints mix.
- Choose model: Open (arrival rate) or Closed (users + think time).
- Prepare data: fixtures, idempotency, warm-up strategy.
- Set profile: ramp-up, steady duration, ramp-down.
- Define pass/fail criteria: percentiles, error rate, resource headroom.
- Enable observability: request traces, metrics, logs by test label.
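The steps above can be captured as a single plan object that travels with the test run. This is a sketch in plain Python; the field names and numbers are illustrative, not tied to any particular load-testing tool:

```python
# A minimal load-test plan as data. All values here are example placeholders.
load_test_plan = {
    "target": {"slo_p95_ms": 250, "steady_rps": 300, "peak_rps": 450},
    "workload_model": "open",  # arrival-rate driven, not user-count driven
    "profile": {"ramp_up_s": 180, "steady_s": 600, "ramp_down_s": 120},
    "pass_fail": {"max_p95_ms": 250, "max_error_rate": 0.01},
    "abort": {"p99_ms": 2000, "error_rate": 0.05},  # stop the run if exceeded
    "observability": {"run_label": "loadtest-baseline"},  # tag traces/metrics/logs
}

total_s = sum(load_test_plan["profile"].values())
print(total_s)  # 900 seconds = 15 minutes end to end
```

Keeping the plan as data makes it easy to version-control alongside the test scripts and to diff between runs.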
Exercises
Exercise 1: Concurrency and RPS sanity check
Service SLO: p95 ≤ 250 ms at 300 RPS. The recent average latency is 120 ms. Calculate the expected concurrency and propose a ramp-up profile for a 15-minute load test.
When done, compare with the solution below.
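One way to sanity-check your Exercise 1 answer is Little's Law, L = λ × W (in-flight requests = arrival rate × time in system). The ramp split below is one reasonable choice for a 15-minute run, not the only valid one:

```python
# Little's Law: in-flight requests L = arrival rate (req/s) x time in system (s).
target_rps = 300
avg_latency_s = 0.120  # 120 ms recent average
expected_concurrency = target_rps * avg_latency_s
print(expected_concurrency)  # 36.0 requests in flight on average

# One possible 15-minute profile (an assumed split, not the only valid one):
profile = [
    ("ramp_up",   180, "0 -> 300 RPS"),  # 3 min: warm caches, JITs, pools
    ("steady",    600, "300 RPS"),       # 10 min: the measurement window
    ("ramp_down", 120, "300 -> 0 RPS"),  # 2 min: graceful drain
]
print(sum(seconds for _, seconds, _ in profile))  # 900 seconds total
```

Note that 36 in-flight requests is an average; provision virtual users or connections with headroom above it to absorb latency spikes.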
Exercise 2: Draft a minimal load test plan
CRUD API with endpoints: GET 70%, POST 20%, PUT 5%, DELETE 5%. Expected steady load 400 RPS, peak 650 RPS. Draft: test type, workload model, target RPS, ramp-up, steady time, data setup, success/abort criteria.
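For the endpoint mix in Exercise 2, weighted random selection keeps the generated traffic close to the stated proportions. A stdlib-only sketch, with no specific load tool assumed:

```python
import random

# Endpoint mix from the exercise: GET 70%, POST 20%, PUT 5%, DELETE 5%.
MIX = [("GET", 70), ("POST", 20), ("PUT", 5), ("DELETE", 5)]

def pick_method(rng: random.Random) -> str:
    """Pick an HTTP method according to the weighted mix."""
    methods, weights = zip(*MIX)
    return rng.choices(methods, weights=weights, k=1)[0]

rng = random.Random(42)  # fixed seed so runs are reproducible
sample = [pick_method(rng) for _ in range(10_000)]
print(sample.count("GET") / len(sample))  # roughly 0.70
```

Most load tools express the same idea as per-scenario weights; the point is that the mix is declared once and checked, not hard-coded into individual requests.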
Checklist before running
- Test data prepared, idempotent where needed
- Observability labels/tags set for this run
- Warm-up period planned
- Pass/fail and abort criteria written
- Environment close to production
- Dependencies (DB/cache) capacity and limits known
Common mistakes and self-checks
- Mistake: Using only average latency. Self-check: Always record p95/p99 and percentile histograms.
- Mistake: Unrealistic data (tiny payloads, no cache churn). Self-check: Match payload sizes and request mix to reality.
- Mistake: No warm-up. Self-check: Add 2–5 minutes ramp and discard warm-up metrics.
- Mistake: Ignoring dependencies. Self-check: Monitor DB/cache/queue metrics alongside service.
- Mistake: Closed model when traffic is open. Self-check: Choose model that reflects real arrivals.
- Mistake: No abort criteria. Self-check: Define error/latency thresholds to stop runs safely.
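The first two self-checks above can be automated. This sketch discards warm-up samples and reports p95/p99 using only the standard library; the latency data is synthetic and purely illustrative:

```python
from statistics import quantiles

def summarize(latencies_ms, warmup_count):
    """Drop warm-up samples, then report p95/p99 from the steady-state remainder."""
    steady = latencies_ms[warmup_count:]
    cuts = quantiles(steady, n=100)  # 99 cut points: cuts[94] ~ p95, cuts[98] ~ p99
    return {"p95": cuts[94], "p99": cuts[98], "samples": len(steady)}

# Synthetic data: 50 slow warm-up requests, then steady traffic around 100-149 ms.
data = [400.0] * 50 + [100.0 + (i % 50) for i in range(1000)]
result = summarize(data, warmup_count=50)
print(result)
```

Including the warm-up samples here would inflate the percentiles and mask the steady-state behavior, which is exactly the mistake the self-check guards against.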
Practical projects
- Capacity card: Document safe RPS, p95, and resource headroom for one service. Update after each release.
- Spike safety net: Create a repeatable spike test profile for a key endpoint with clear pass/fail conditions.
- Soak sentinel: Run a 4-hour soak in staging and chart memory/FD growth and GC behavior.
Learning path
- Start: This lesson — concepts, models, and plans.
- Next: Advanced workload modeling (think times, distributions, burstiness).
- Later: Automated performance gates in CI and regression detection.
Mini challenge
Your service meets SLO at 400 RPS, but at 500 RPS p99 latency spikes and DB CPU hits 90%. Propose two changes: one at app level (e.g., connection pooling, caching, batching) and one at DB level (e.g., index, read-replica, pool size). State how you’d validate with a test plan.
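However you answer the mini challenge, the validation step can be made mechanical: re-run the same profile after each change and compare against explicit targets. All numbers below are hypothetical:

```python
# Hypothetical before/after results for the mini challenge at 500 RPS.
baseline = {"rps": 500, "p99_ms": 1800, "db_cpu_pct": 90}
target   = {"rps": 500, "p99_ms": 400,  "db_cpu_pct": 75}

def change_validated(result: dict, target: dict) -> bool:
    """A change passes only if it meets both the latency and DB headroom targets."""
    return (result["p99_ms"] <= target["p99_ms"]
            and result["db_cpu_pct"] <= target["db_cpu_pct"])

after_change = {"rps": 500, "p99_ms": 350, "db_cpu_pct": 70}  # hypothetical re-run
print(change_validated(after_change, target))  # True
```

Applying one change at a time and re-running the identical profile is what lets you attribute the improvement to the app-level or DB-level change rather than to noise.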
Next steps
- Pick one project above and run a small load test against a staging-like environment.
- Record a baseline: p50/p95/p99, RPS, error rate, CPU, memory, DB latency.
- Share findings with your team and agree on pass/fail criteria for future releases.