Why this matters
Real systems face timeouts, dropped connections, and duplicate requests. As a Backend Engineer, you must make write operations safe to retry and ensure clients can recover from failures without corrupting data.
- Payments: avoid charging the customer twice if the client retries.
- Orders/Inventory: prevent double reservation on network glitches.
- Email/Notifications: deduplicate to avoid spam bursts.
- External APIs: handle transient 5xx, 429, and timeouts gracefully.
Concept explained simply
An operation is idempotent if performing it multiple times leaves the system in the same state as performing it once. Retrying means automatically repeating a call that may have failed transiently.
Mental model
Think of each write request as having a fingerprint (an idempotency key). Your service keeps a short-term ledger of fingerprints and outcomes. If the same fingerprint shows up again, the ledger returns the original outcome instead of performing the action twice.
Translate to practice
- Client generates an idempotency key for risky writes (e.g., POST /payments).
- Server stores the key, request hash, and the first successful response for a retention window (e.g., 24–72 hours).
- On duplicates: return the saved response if the request matches; otherwise return a 409-style conflict.
- Retries use exponential backoff + jitter; only retry safe conditions.
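The ledger idea above can be sketched in a few lines. This is a minimal in-memory illustration, not a production design: the class name IdempotencyLedger, the execute method, and the shape of the 409 payload are all assumptions made for this example, and the code is not thread-safe.

```python
import hashlib
import json
import time


class IdempotencyLedger:
    """In-memory ledger mapping idempotency keys to the first outcome (hypothetical)."""

    def __init__(self, ttl_seconds=24 * 3600, clock=time.time):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (request_hash, response, expires_at)

    @staticmethod
    def fingerprint(body):
        # Canonical JSON so that key order in the body does not change the hash.
        canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def execute(self, key, body, action):
        """Run action(body) once per key; replay the saved response on duplicates."""
        now = self._clock()
        request_hash = self.fingerprint(body)
        entry = self._entries.get(key)
        if entry is not None and entry[2] > now:
            saved_hash, saved_response, _ = entry
            if saved_hash != request_hash:
                # Same key, different request: refuse instead of guessing.
                return {"status": 409, "error": "idempotency_key_reused"}
            return saved_response
        response = action(body)
        self._entries[key] = (request_hash, response, now + self._ttl)
        return response
```

Calling execute twice with the same key and an equivalent body runs the action once and replays the stored response; the same key with a different body yields the 409-style result.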
Core principles you will use
- Use idempotency keys for non-idempotent writes (typically POST).
- Prefer naturally idempotent methods when possible (PUT, DELETE).
- Design a deduplication store (key, request hash, response, status, expiry).
- Set client and server timeouts; ensure server can finish work once accepted.
- Retry with exponential backoff and jitter; cap attempts; use a retry budget.
- Retry only on transient errors (5xx, 408, connection resets, and 429 while honoring Retry-After).
- Make side effects atomic or compensatable (e.g., outbox pattern, unique constraints).
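Two of these principles, retrying only transient errors and capping retries with a budget, can be sketched as follows. The names is_retryable and RetryBudget are hypothetical, and the budget rule (retries limited to a fraction of first attempts plus a small floor, with no time window) is a simplifying assumption rather than a standard algorithm.

```python
def is_retryable(status_code):
    """Treat 408, 429, and all 5xx responses as transient."""
    return status_code in (408, 429) or 500 <= status_code <= 599


class RetryBudget:
    """Allow retries only while they stay under a fraction of first attempts."""

    def __init__(self, ratio=0.1, min_retries=3):
        self.ratio = ratio              # retries allowed per first attempt
        self.min_retries = min_retries  # floor so low-traffic clients can still retry
        self.requests = 0
        self.retries = 0

    def record_request(self):
        # Call once for every first attempt (not for retries).
        self.requests += 1

    def try_spend(self):
        """Return True and count a retry if the budget allows one more."""
        allowed = self.min_retries + self.ratio * self.requests
        if self.retries + 1 > allowed:
            return False
        self.retries += 1
        return True
```

A client would call record_request for each new request and try_spend before each retry; when try_spend returns False, the error is surfaced instead of retried, which keeps a struggling downstream from being flooded.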
Worked examples
Example 1: Idempotent payment creation with POST
Goal: Ensure repeated POST /payments does not double-charge.
- Client sends POST /payments with header Idempotency-Key: k-123 and body {amount: 4200, currency: "USD", source: "tok_x"}.
- Server creates a row in idempotency_keys with key=k-123 and a hash of the normalized request body.
- Server processes payment once; stores final response payload and status in the same row; marks as completed.
- If a duplicate arrives with key=k-123 and the same request hash: return the stored response (e.g., 201 with payment_id=pmt_1).
- If a duplicate arrives with a different request hash: return a 409 conflict indicating the key was used with different parameters.
Table idempotency_keys
- key (PK)
- request_hash
- status_code
- response_body (json)
- created_at
- expires_at
- processing_state (pending|completed|failed)
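The flow and table above might look like this with SQLite. It is a sketch under stated assumptions: handle_payment and the charge callback are hypothetical names, the column types are guesses, and expiry cleanup and recovery from a failed charge are left out. The primary key on the key column is what makes claiming a key atomic.

```python
import hashlib
import json
import sqlite3
import time

SCHEMA = """
CREATE TABLE IF NOT EXISTS idempotency_keys (
    "key"            TEXT PRIMARY KEY,
    request_hash     TEXT NOT NULL,
    status_code      INTEGER,
    response_body    TEXT,
    created_at       REAL NOT NULL,
    expires_at       REAL NOT NULL,
    processing_state TEXT NOT NULL
        CHECK (processing_state IN ('pending', 'completed', 'failed'))
)
"""


def request_hash(body):
    # Normalize the body so equivalent JSON produces the same hash.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def handle_payment(conn, key, body, charge, ttl=24 * 3600):
    """Return (status_code, response_dict) for a POST /payments request."""
    digest = request_hash(body)
    now = time.time()
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                """INSERT INTO idempotency_keys
                   ("key", request_hash, created_at, expires_at, processing_state)
                   VALUES (?, ?, ?, ?, 'pending')""",
                (key, digest, now, now + ttl),
            )
    except sqlite3.IntegrityError:
        # The key already exists: this request is a duplicate.
        row = conn.execute(
            """SELECT request_hash, status_code, response_body, processing_state
               FROM idempotency_keys WHERE "key" = ?""",
            (key,),
        ).fetchone()
        if row[0] != digest:
            return 409, {"error": "key_reused_with_different_parameters"}
        if row[3] != "completed":
            # Another request holds the key and has not finished yet.
            return 409, {"error": "request_not_completed"}
        return row[1], json.loads(row[2])
    status, response = charge(body)  # the side effect runs only for the winner
    with conn:
        conn.execute(
            """UPDATE idempotency_keys
               SET status_code = ?, response_body = ?, processing_state = 'completed'
               WHERE "key" = ?""",
            (status, json.dumps(response), key),
        )
    return status, response
```

Inserting the row before charging, rather than after, is the design choice that matters: two concurrent requests with the same key cannot both pass the insert.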
Example 2: Safe retries for a flaky downstream
Scenario: POST /invoices calls an external tax service that occasionally times out.
- Server calls tax service with a client timeout of 2s.
- On timeout or 5xx, retry the downstream call with exponential backoff + jitter (e.g., base delays of 200ms, 400ms, 800ms, each randomized), for at most 4 attempts in total.
- Protect the whole invoice creation with an idempotency key so that if the client retries the original API, the invoice is not duplicated.
- If downstream still fails, return a 503 with a stable error code; client may retry its original request using the same key.
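The downstream retry loop in this example could be sketched as below. The function call_with_retries and the exception DownstreamUnavailable are hypothetical names, and the sketch assumes the wrapped call raises TimeoutError or ConnectionError on transient failure (mapping a 5xx response to one of those is left to the caller). The sleep and rng parameters exist only so the behavior can be tested without waiting.

```python
import random
import time


class DownstreamUnavailable(Exception):
    """Raised when every attempt against the downstream has failed."""


def call_with_retries(call, max_attempts=4, base_delay=0.2,
                      sleep=time.sleep, rng=random.random):
    """Invoke call() up to max_attempts times with exponential backoff + full jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except (TimeoutError, ConnectionError) as exc:
            if attempt == max_attempts - 1:
                # Out of attempts: the API layer can map this to a 503.
                raise DownstreamUnavailable("downstream unavailable") from exc
            ceiling = base_delay * (2 ** attempt)  # 0.2s, 0.4s, 0.8s
            sleep(rng() * ceiling)  # full jitter: uniform in [0, ceiling)
```

Any other exception type propagates immediately, so non-transient failures are not retried.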
Example 3: Using PUT to create-or-replace
Scenario: You need a stable invoice ID from the client side.
- Client calls PUT /invoices/INV-2025-0001 with a full representation of the invoice.
- PUT is idempotent: repeating the same request sets the resource to the same state.
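- Enforce a unique constraint on id and treat repeated PUTs as upserts that converge to the same state. If you need strict create-once semantics instead, reject a PUT to an existing ID (for example, by requiring the conditional header If-None-Match: * and answering 412 when the resource already exists).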
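A create-or-replace handler for this scenario might look like the following sketch. The invoices table layout and the put_invoice function are assumptions for illustration; the primary key on id provides the unique constraint, and INSERT OR REPLACE is SQLite-specific.

```python
import json
import sqlite3

INVOICE_SCHEMA = """
CREATE TABLE IF NOT EXISTS invoices (
    id   TEXT PRIMARY KEY,
    body TEXT NOT NULL
)
"""


def put_invoice(conn, invoice_id, representation):
    """Create or fully replace an invoice; return 201 on create, 200 on replace."""
    with conn:  # one transaction for the existence check and the write
        existed = conn.execute(
            "SELECT 1 FROM invoices WHERE id = ?", (invoice_id,)
        ).fetchone() is not None
        conn.execute(
            "INSERT OR REPLACE INTO invoices (id, body) VALUES (?, ?)",
            (invoice_id, json.dumps(representation, sort_keys=True)),
        )
    return 200 if existed else 201
```

Repeating the same PUT leaves exactly one row with the same body; only the status code differs between the first call and the repeats.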
Example 4: Deleting safely
DELETE /carts/abc should be idempotent. First call removes the cart and returns 204. Subsequent calls return 204 again (no error) and do nothing. This makes client retries safe.
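As a tiny illustration of that behavior, assuming carts live in a plain dictionary and delete_cart is a hypothetical handler:

```python
def delete_cart(carts, cart_id):
    """Remove the cart if present; always answer 204 so retries are harmless."""
    carts.pop(cart_id, None)  # no error when the cart is already gone
    return 204
```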
Design checklist
- Do I have an idempotency strategy for every write endpoint?
- Where is the dedup store, and how long do keys live?
- What errors are retried, with what backoff and jitter?
- Are timeouts set so the server can finish once started?
- Are side effects atomic or protected with unique constraints/outbox?
- What do I return for duplicate-with-different-params? (409 conflict)
Your turn — exercises
Write your answers to the exercises below, then compare them with the solutions.
Exercise 1: Idempotent payment endpoint
Design POST /payments to be idempotent.
- Propose an Idempotency-Key format and retention window.
- Draft a table schema for the idempotency store.
- Describe request flow: first request, duplicate same params, duplicate different params.
- Describe error handling for concurrent requests with same key.
- State what status codes you return in each case.
Exercise 2: Retry policy blueprint
Specify a retry policy for your HTTP client calling an external service.
- List which status codes and network errors to retry.
- Define base delay, backoff factor, jitter, max attempts, and total budget.
- State idempotency requirements for retried requests.
- Explain how you honor 429 Retry-After.
- Self-check: Can a client safely resend the same write without duplication?
- Self-check: Do retries avoid thundering herds (jitter) and respect server limits?
Common mistakes and how to self-check
- Using POST without idempotency keys for sensitive operations. Self-check: Can I safely hit refresh on the client and get the same outcome?
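- Retrying all 4xx responses. Self-check: Among 4xx, do I retry only 408 and 429 (honoring Retry-After), plus 409 only when the conflict is known to be transient? Do not retry 400/401/403/404. Gateway errors (502/503/504) are 5xx and are handled as transient server errors.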
- No jitter in backoff. Self-check: Are retries aligned at the same timestamps? If yes, add jitter.
- Ignoring concurrency on the same key. Self-check: Do I lock or serialize processing per key to avoid double work?
- Storing idempotency keys without request hash. Self-check: Can I detect conflicting uses of the same key?
- Letting keys expire too soon. Self-check: Does expiration cover the longest realistic client retry window?
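For the concurrency mistake above, one way to serialize work per key inside a single process is a lock per key. This is a sketch only: PerKeySerializer and run_once are hypothetical names, locks and results are never evicted, and a multi-instance service would need a shared mechanism such as a database unique constraint or row lock instead.

```python
import threading
from collections import defaultdict


class PerKeySerializer:
    """Run an action at most once per key, even under concurrent callers."""

    def __init__(self):
        self._guard = threading.Lock()            # protects the lock table itself
        self._locks = defaultdict(threading.Lock)
        self._results = {}

    def run_once(self, key, action):
        with self._guard:
            lock = self._locks[key]
        with lock:  # callers with the same key queue up here
            if key not in self._results:
                self._results[key] = action()
            return self._results[key]
```

Requests with different keys take different locks, so they do not block each other; only duplicates of the same key wait and then receive the first result.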
Practical projects
- Build an idempotent POST /payments mock service with an in-memory dedup store, then swap to a real database.
- Add a retrying HTTP client wrapper to call a flaky sandbox API; implement exponential backoff with full jitter.
- Convert a POST create endpoint to PUT with client-provided IDs and compare operational behavior under load.
Who this is for
- Backend Engineers building HTTP/JSON APIs.
- Developers integrating with third-party payment, messaging, or tax services.
- Engineers responsible for reliability and correctness in distributed systems.
Prerequisites
- HTTP methods and status codes (2xx, 4xx, 5xx basics).
- Familiarity with RESTful resource design.
- Basic database schema design and unique constraints.
- Comfort with timeouts and retry logic in your programming language.
Learning path
- Identify write endpoints that can cause double effects (payments, orders, emails).
- Add idempotency keys and a dedup store for the riskiest one.
- Implement exponential backoff + jitter and a retry budget in your HTTP client.
- Harden concurrency handling (locking/serialization per key).
- Instrument and log duplicates, conflicts, and retries; adjust retention and policies.
Next steps
- Apply the same patterns to message queues (deduplication and exactly-once-processing illusions).
- Introduce the outbox pattern for reliable side effects.
- Run chaos tests: inject timeouts and 5xx to validate your behavior.
Mini challenge
Your inventory service uses POST /reserve. Under load tests with induced timeouts, some items get double-reserved. In one paragraph, describe how you would add idempotency keys, locking per key, and retry rules to eliminate double reservations while keeping throughput high.