Who this is for
Backend and platform engineers who want to ship reliable services, write meaningful tests, and diagnose bugs quickly without guesswork.
- Junior engineers learning solid habits.
- Mid-level engineers aiming to reduce incident time-to-resolution.
- Anyone who wants fewer flaky tests and faster feedback loops.
Prerequisites
- Comfort writing basic functions, handling HTTP requests, and using a database.
- Ability to run a local service and unit tests (any language/framework).
- Basic understanding of logs and environment configuration.
Why this matters
Real backend tasks depend on a testing mindset and clear debugging:
- Reproducing production bugs locally to create a regression test and a safe fix.
- Triaging incidents: narrowing scope fast, rolling out minimal changes, and verifying with checks.
- Designing tests that prevent costly regressions in auth, billing, caching, and concurrency.
- Keeping CI fast and helpful so teams trust it.
Concept explained simply
Testing mindset: you design code and tests together so failures are informative, quick, and local. You aim to disprove assumptions early, not prove perfection.
Debugging: a disciplined loop — Observe → Hypothesize → Test → Narrow. You change only one variable per loop to learn quickly.
Mental model
- Test pyramid: many small unit tests at the base, fewer integration tests, very few end-to-end tests. Keep most feedback fast.
- Golden path + edges: test the most common path and the riskiest edges (nulls, time, concurrency, I/O errors).
- Debugging funnel: start broad (symptoms), funnel down (smallest failing case), fix, then add a regression test.
Quick heuristics you can use today
- Freeze time and randomness in tests (inject clock/seed).
- Prefer Arrange–Act–Assert structure and Given–When–Then naming.
- Binary search the fault: disable halves of code/inputs to isolate.
- Check last change first (deploys, config, data shape).
- Log at boundaries with structured fields, not free text.
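The "binary search the fault" heuristic can be sketched on the input side: given a batch that makes a check fail, keep halving it until one item remains. This is a minimal sketch, not a general tool. It assumes the failure is deterministic and caused by a single item, and findCulprit, UserRecord, and importFails are hypothetical names used only for illustration.

```typescript
// Hypothetical helper: narrows a failing batch down to one offending item.
// Assumes the failure is deterministic and caused by a single item.
function findCulprit<T>(items: T[], fails: (batch: T[]) => boolean): T | undefined {
  if (items.length === 0 || !fails(items)) return undefined; // nothing to find
  let candidates = items;
  while (candidates.length > 1) {
    const mid = Math.floor(candidates.length / 2);
    const left = candidates.slice(0, mid);
    // Keep whichever half still reproduces the failure.
    candidates = fails(left) ? left : candidates.slice(mid);
  }
  return candidates[0];
}

// Example: a batch import that breaks on a record with a missing email.
type UserRecord = { id: number; email?: string };
const batch: UserRecord[] = [
  { id: 1, email: "a@example.com" },
  { id: 2, email: "b@example.com" },
  { id: 3 },
  { id: 4, email: "d@example.com" },
];
const importFails = (rows: UserRecord[]): boolean =>
  rows.some((r) => r.email === undefined);

const culprit = findCulprit(batch, importFails);
console.log(culprit); // the record with id 3
```

The same halving idea applies to commits (bisecting) and to code paths behind feature flags; only the definition of "half" changes.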
Core techniques (fast reference)
- Test types: Unit (pure logic), Integration (DB/queue), Contract (consumer/provider expectations), E2E (full flow).
- Designing tests: small, independent, deterministic; one reason to fail; name describes behavior not implementation.
- Oracles: exact output, properties (idempotent, monotonic), invariants (balances never negative), and state transitions.
- Debugging tools: logs, metrics, traces, local reproduction, feature flags, bisecting commits, and CPU/memory profiles for performance bugs.
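The property and invariant oracles listed above can be checked without knowing the exact expected output. The sketch below is one way to do it, under assumptions: normalizeEmail and applyDebit are hypothetical functions under test, and the random inputs come from a small seeded generator so the check stays deterministic.

```typescript
// Deterministic pseudo-random source so the check is repeatable (simple LCG).
function seededRandom(seed: number): () => number {
  let state = seed % 4294967296;
  return () => {
    state = (state * 1664525 + 1013904223) % 4294967296; // product stays below 2^53
    return state / 4294967296; // value in [0, 1)
  };
}

// Hypothetical function under test: canonical form of an email address.
function normalizeEmail(raw: string): string {
  return raw.trim().toLowerCase();
}

// Hypothetical function under test: a debit that is rejected if it would overdraw.
function applyDebit(balance: number, amount: number): number {
  if (amount < 0) throw new Error("amount must be non-negative");
  return amount > balance ? balance : balance - amount; // rejected debit leaves balance unchanged
}

// Property (idempotent): normalizing twice equals normalizing once.
function checkIdempotent(samples: string[]): boolean {
  return samples.every((s) => normalizeEmail(normalizeEmail(s)) === normalizeEmail(s));
}

// Invariant: no sequence of debits drives the balance below zero.
function checkBalanceNeverNegative(runs: number, seed: number): boolean {
  const rand = seededRandom(seed);
  let balance = 100;
  for (let i = 0; i < runs; i++) {
    balance = applyDebit(balance, Math.floor(rand() * 50));
    if (balance < 0) return false;
  }
  return true;
}

console.log(checkIdempotent(["  A@B.com ", "x@y.org", "\tMixed@Case.IO\n"])); // true
console.log(checkBalanceNeverNegative(1000, 42)); // true
```

Because the seed is fixed, a failure found this way can be replayed exactly, which turns it into an ordinary regression test.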
Worked examples
1) Flaky time-based test
Symptom: a test passes locally, fails in CI around midnight UTC.
// Before (pseudocode)
assert(formatInvoiceDate(now()) == "2026-01-20")
Fix: inject a clock and freeze it.
// After
const fixedClock = Clock.fixed("2026-01-20T10:00:00Z");
assert(formatInvoiceDate(fixedClock.now()) == "2026-01-20")
Lesson: never read the system clock directly in tests; pass the time in.
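A runnable version of the clock injection above might look like the following. It is a sketch under assumptions: the Clock interface, fixedClock, and invoiceDateFor are illustrative names (the pseudocode only shows Clock.fixed), and formatInvoiceDate is assumed to format in UTC.

```typescript
// Minimal clock abstraction (an assumption for this sketch).
// In production you would pass something like { now: () => new Date() }.
interface Clock {
  now(): Date;
}

// Returns a clock that always reports the same instant.
function fixedClock(iso: string): Clock {
  const instant = new Date(iso);
  return { now: () => new Date(instant.getTime()) }; // fresh copy on each call
}

// Formats the date as YYYY-MM-DD in UTC, independent of the machine's time zone.
function formatInvoiceDate(date: Date): string {
  return date.toISOString().slice(0, 10);
}

// Code under test receives the clock instead of reading the system time itself.
function invoiceDateFor(clock: Clock): string {
  return formatInvoiceDate(clock.now());
}

const testClock = fixedClock("2026-01-20T10:00:00Z");
console.log(invoiceDateFor(testClock)); // 2026-01-20
console.log(invoiceDateFor(fixedClock("2026-01-20T23:59:59Z"))); // 2026-01-20
console.log(invoiceDateFor(fixedClock("2026-01-21T00:00:00Z"))); // 2026-01-21
```

The last two calls pin the midnight UTC boundary explicitly, which is the edge the flaky test was hitting by accident.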
2) 500 on PATCH /users/:id with optional name
Logs show a null dereference at name.trim() when name is omitted from the request body.
// Before
if (body.name.trim().length > 0) updateName(body.name)
Fix and test:
// After
const name = body.name ?? null;
if (name != null && name.trim().length > 0) updateName(name)
- Integration test: missing name returns 200 and keeps existing name.
- Unit test: update logic no longer throws when name is null.
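The unit-level part of this fix can be sketched as a pure function that is easy to test without an HTTP layer. StoredUser, UserPatch, and applyUserPatch are hypothetical names for illustration; the guard itself mirrors the "After" snippet.

```typescript
type StoredUser = { id: number; name: string; email: string };
type UserPatch = { name?: string | null; email?: string | null };

// Applies only the fields that are present and non-blank; returns a new object.
function applyUserPatch(user: StoredUser, patch: UserPatch): StoredUser {
  const updated = { ...user };
  const newName = patch.name ?? null; // omitted and null are treated the same
  if (newName !== null && newName.trim().length > 0) updated.name = newName;
  const newEmail = patch.email ?? null;
  if (newEmail !== null && newEmail.trim().length > 0) updated.email = newEmail;
  return updated;
}

const existing: StoredUser = { id: 42, name: "Ada", email: "old@example.com" };
const patched = applyUserPatch(existing, { email: "a@b.com" }); // name omitted
console.log(patched.name);  // Ada
console.log(patched.email); // a@b.com
```

Keeping the patch logic free of request and database objects means the missing-name case needs no server to reproduce.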
3) Deadlock in double update
Two transactions update rows in different order, causing a deadlock.
Approach: reproduce with a small concurrency test, then enforce a consistent lock order (e.g., lock the lower id first). Add a stress test that runs N parallel updates to check that no deadlock occurs.
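The ordering rule itself is small enough to test in isolation. The sketch below covers only that step and makes no claim about any particular database: LockTable is an in-memory stand-in that records the order in which ids are locked, and lockOrder and updateBoth are hypothetical names.

```typescript
// In-memory stand-in for row locks; records the order in which ids are locked.
class LockTable {
  readonly acquired: number[] = [];
  lock(id: number): void {
    this.acquired.push(id);
  }
}

// Returns the ids in the order every transaction must lock them:
// ascending and deduplicated, regardless of the order the caller passed.
function lockOrder(ids: number[]): number[] {
  return ids
    .filter((id, index) => ids.indexOf(id) === index)
    .sort((a, b) => a - b);
}

// Hypothetical double update: locks both rows in canonical order before touching either.
function updateBoth(table: LockTable, firstId: number, secondId: number): void {
  for (const id of lockOrder([firstId, secondId])) {
    table.lock(id);
  }
  // ... perform both updates here while holding the locks ...
}

const txA = new LockTable();
const txB = new LockTable();
updateBoth(txA, 7, 3); // caller order: 7 then 3
updateBoth(txB, 3, 7); // caller order: 3 then 7
console.log(txA.acquired); // [ 3, 7 ]
console.log(txB.acquired); // [ 3, 7 ]
```

Since both transactions now request the lower id first, neither can hold one row while waiting on a transaction that holds the other. The stress test with real parallel updates is still needed to confirm this against the actual database.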
4) Stale cache after update
Symptom: GET returns old data after a successful PUT.
Fix: use write-through caching or explicit invalidation on write, then add a contract test to ensure that every update operation invalidates the cache keys for that entity.
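Explicit invalidation on write can be sketched with an in-memory store. Everything here is an assumption made for illustration: ProfileStore, its key format, and the plain objects standing in for the database and the cache.

```typescript
type Profile = { id: number; name: string };

// Hypothetical read-through cache in front of an in-memory "database".
class ProfileStore {
  private db: { [id: number]: Profile } = {};
  private cache: { [key: string]: Profile } = {};

  private key(id: number): string {
    return "profile:" + id;
  }

  // Serves from the cache when possible, otherwise loads and caches the row.
  get(id: number): Profile | undefined {
    const cached = this.cache[this.key(id)];
    if (cached !== undefined) return cached;
    const row = this.db[id];
    if (row !== undefined) this.cache[this.key(id)] = row;
    return row;
  }

  // Writes the row, then drops the cached copy so the next read reloads it.
  put(profile: Profile): void {
    this.db[profile.id] = { ...profile };
    delete this.cache[this.key(profile.id)]; // explicit invalidation on write
  }
}

const store = new ProfileStore();
store.put({ id: 1, name: "Ada" });
console.log(store.get(1)?.name); // Ada (now cached)
store.put({ id: 1, name: "Grace" });
console.log(store.get(1)?.name); // Grace, not the stale "Ada"
```

Removing the delete line makes the second read return "Ada", which is a quick way to confirm the test actually catches the stale-cache bug.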
Exercises
Do these now. They mirror the graded exercises below and build core habits.
Exercise 1 — Reproduce and pin a boundary bug with a unit test
Context: A pagination helper computes an offset from page and pageSize.
// Expected behavior (pseudocode)
// offset = (page - 1) * pageSize
// page must be >= 1; pageSize > 0
- Write a unit test named returns_zero_offset_for_page_1 that asserts offset is 0 when page=1.
- Add a test named rejects_page_zero that expects an error for page=0.
- Run tests. If the second test fails because no error is thrown, you have reproduced the bug.
- Implement the minimal fix and re-run tests.
Hints
- Keep tests independent: no shared globals.
- Use Arrange–Act–Assert to structure each test.
- Consider a property: for page >= 1, offset should be >= 0.
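If you want to compare your attempt afterwards, here is one possible shape for the tests, including the property from the hints. It is a sketch only: computeOffset is a hypothetical name for the helper (shown with the minimal fix applied), and runTest and check form a tiny stand-in for whatever test framework you use.

```typescript
// Hypothetical helper under test, with the minimal fix for invalid input applied.
function computeOffset(page: number, pageSize: number): number {
  if (page < 1) throw new RangeError("page must be >= 1");
  if (pageSize <= 0) throw new RangeError("pageSize must be > 0");
  return (page - 1) * pageSize;
}

// Tiny runner so the sketch needs no test framework; returns true on pass.
function runTest(testName: string, body: () => void): boolean {
  try {
    body();
    console.log("ok   " + testName);
    return true;
  } catch {
    console.log("FAIL " + testName);
    return false;
  }
}

function check(condition: boolean, message: string): void {
  if (!condition) throw new Error(message);
}

const results = [
  runTest("returns_zero_offset_for_page_1", () => {
    const page = 1, pageSize = 20;                  // Arrange
    const offset = computeOffset(page, pageSize);   // Act
    check(offset === 0, "offset should be 0");      // Assert
  }),
  runTest("rejects_page_zero", () => {
    let threw = false;
    try {
      computeOffset(0, 20);
    } catch {
      threw = true;
    }
    check(threw, "page=0 should throw");
  }),
  runTest("offset_is_non_negative_for_valid_pages", () => {
    for (let page = 1; page <= 100; page++) {
      check(computeOffset(page, 20) >= 0, "negative offset for page " + page);
    }
  }),
];
```

Each test builds its own inputs and shares no state, so they can run in any order.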
Exercise 2 — Debug a 500 caused by optional fields
You see this log:
PATCH /users/42 500
error: NullPointerException at updateUser
at trim() on undefined
request body: {"email":"a@b.com"} // name omitted
- Write a hypothesis for the root cause in one sentence.
- Propose the smallest safe code change to prevent the 500 while preserving behavior.
- Write:
- A unit test for the update logic when name is missing.
- An integration test ensuring status 200, email updates, and name remains unchanged.
Hints
- Guard optional fields at the boundary (controller/DTO).
- Prefer null-safe reads over defaulting to empty string unless required.
- Name your tests using Given–When–Then for clarity.
Checklist
- [ ] Tests are deterministic (no real time/randomness).
- [ ] Each test has one clear reason to fail.
- [ ] You can explain what the failure teaches you.
Common mistakes and self-check
- Testing implementation, not behavior. Self-check: if you refactor internals, do your tests still pass? They should.
- Overusing E2E tests. Self-check: does your suite run in under 10 minutes? If not, the pyramid might be inverted.
- Ignoring nondeterminism. Self-check: do tests use real time, sleep, or network? Remove or fake them.
- Changing multiple variables while debugging. Self-check: did you change exactly one thing between observations?
- Missing regression tests after a fix. Self-check: can this bug reappear silently? Add a targeted test.
Practical projects
- Project A: Introduce a clock interface into a service and refactor all time-based logic to use it. Add 3 unit tests that freeze time.
- Project B: Add consumer–provider contract tests between two services (e.g., user and billing). Break the contract intentionally, watch the consumer test fail, then fix.
- Project C: Build a minimal reproduction kit for one known incident: dockerized DB with seed, one script to run the failing scenario, and a regression test.
Learning path
- Start with unit tests for core business rules.
- Add integration tests for DB/queue boundaries.
- Introduce contract tests to stabilize service interactions.
- Reserve a small number of E2E tests for critical flows.
- Continuously refactor tests for clarity and speed.
Next steps
- Pick one flaky test in your codebase and make it deterministic.
- Add one regression test for a past production bug.
- Document your debugging loop for the team (steps and tools).
Mini challenge
Your CI suite sometimes fails with a DB unique constraint error in a test that creates a user. You cannot reproduce locally. In your next attempt, what is the most effective single change?
Suggested approach
- Seed the random generator with a fixed value and include a unique suffix in test data, or roll back a DB transaction after each test.
- Add structured logs around user creation including the username used.
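The unique-suffix and structured-log suggestions can be combined in a few lines. This is a sketch under assumptions: makeUsernameFactory and logEvent are hypothetical helpers, runId stands for something like a CI job id, and a counter replaces random suffixes so the generated names are reproducible.

```typescript
// Builds usernames that are unique per run and per call.
// runId is an assumption: for example a CI job id or worker index.
function makeUsernameFactory(runId: string): (base: string) => string {
  let counter = 0;
  return (base) => {
    counter += 1;
    return base + "-" + runId + "-" + counter;
  };
}

// Emits one JSON object per line so CI logs can be filtered by field.
function logEvent(event: string, fields: { [key: string]: string | number }): string {
  const line = JSON.stringify({ event, ...fields });
  console.log(line);
  return line;
}

const nextUsername = makeUsernameFactory("run42");
const first = nextUsername("alice");
const second = nextUsername("alice");
logEvent("user.create.attempt", { username: first, test: "creates_user" });
logEvent("user.create.attempt", { username: second, test: "creates_user_again" });
```

If the constraint error still appears, the logged username and test name show which two tests collided, which is the information the local runs were missing.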