Who this is for
You design or integrate APIs that must notify other systems or handle long-running work. This lesson is aimed at API engineers building event-driven features, partner or webhook platforms, or background-processing pipelines.
Prerequisites
- HTTP basics (methods, headers, status codes)
- REST/JSON familiarity
- Security basics (HMAC, TLS, secrets management)
- Optional: message queues and background workers
Why this matters
Real API ecosystems need to notify clients about changes (payments, shipments, document updates) and run jobs that take time (image/video processing, ML inference). Asynchronous APIs and webhooks let you:
- Push events to consumers reliably without polling
- Scale long-running work without blocking client requests
- Increase resilience with retries, idempotency, and dead-letter handling
Typical tasks in the role:
- Design a POST that immediately returns 202 Accepted plus a status URL
- Publish webhooks with signatures and a well-defined retry policy
- Make event schemas versioned and idempotent
- Protect receivers against replay and duplicate deliveries
Concept explained simply
Asynchronous APIs decouple request and work. A client asks for work; the server accepts the request, queues it, and processes it later. The client checks status or receives a webhook when done.
Webhooks are HTTP callbacks. The producer sends an HTTP POST to a consumer's URL when something happens. Because the network is unreliable, producers retry on non-2xx responses. Consumers must handle duplicates safely.
Mental model
- Mailroom: A clerk (API) accepts packages (requests) and gives you a receipt (request_id + status URL). Processing happens in the back. Later, you get a message (webhook) or you check the receipt (status endpoint).
- Assume the messenger may deliver the same letter twice or out of order. Your system should still behave correctly.
Key design patterns and decisions
- Accept-work pattern: POST returns 202 Accepted, a request_id, and a status_url.
- Status resource: GET /requests/{id} returns state machine: queued -> processing -> succeeded/failed, with progress and result URLs.
- Webhooks: POST JSON event with event_id, type, occurred_at, data. Sign with HMAC over the raw body, include Signature header and timestamp.
- Retries: Exponential backoff with jitter. Stop after a max window; send to a dead-letter queue for manual review.
- Idempotency:
  - Clients: send Idempotency-Key on POST. Server stores request hash and returns the same response on retries.
  - Webhook receivers: deduplicate using event_id with a TTL cache or durable store.
- Security: verify signatures, rotate secrets, least privilege, short time window for replay.
- Versioning: include spec_version or event_version in payloads; avoid breaking changes.
- Observability: correlation IDs, structured logs, metrics on latency, retries, failure rates.
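The status resource's state machine (queued -> processing -> succeeded/failed) can be sketched as a small transition table. This is only an illustration in Python; the names TRANSITIONS, advance, and is_terminal are hypothetical, and a real service would persist the state rather than pass it around as a string.

```python
# Allowed transitions for the status resource; terminal states have no exits.
TRANSITIONS = {
    "queued": {"processing"},
    "processing": {"succeeded", "failed"},
    "succeeded": set(),
    "failed": set(),
}


def advance(current: str, new: str) -> str:
    """Return the new state, or raise if the transition is not allowed."""
    if new not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition: {current} -> {new}")
    return new


def is_terminal(state: str) -> bool:
    """A state is terminal when no further transitions leave it."""
    return not TRANSITIONS[state]
```

Rejecting illegal transitions in one place keeps a late or duplicated worker update from moving a finished job back to processing.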
When to use webhooks vs polling
- Webhooks: the producer can reach the consumer reliably, and events are infrequent relative to the rate at which consumers would otherwise have to poll.
- Polling: the consumer cannot expose an endpoint, firewall rules block inbound traffic, or the consumer needs tight control over timing.
- Hybrid: webhook for push + polling status as fallback.
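The polling half of the hybrid approach might look like the sketch below. It assumes a caller-supplied fetch_status function (hypothetical) that performs the GET on the status URL and returns the parsed JSON; the 120-second timeout and 2-second interval are illustrative defaults, not recommendations from this lesson.

```python
import time


def wait_for_result(fetch_status, timeout_seconds=120.0, interval_seconds=2.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll until the job reaches a terminal state or the timeout elapses.

    Returns the final status document, or None if the deadline passes first.
    """
    deadline = clock() + timeout_seconds
    while True:
        status = fetch_status()
        if status["status"] in ("succeeded", "failed"):
            return status
        # Give up if another wait would carry us past the deadline.
        if clock() + interval_seconds > deadline:
            return None
        sleep(interval_seconds)
```

A consumer that normally relies on the webhook can call this only when no event has arrived within the expected time.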
Worked examples
Example 1: 202 + status URL for long-running job
POST /v1/images:process
Request headers: Idempotency-Key: 9f5a...
Body: {"image_url":"...","operations":[{"resize":{"w":800,"h":600}}]}
Response 202
{
"request_id": "req_123",
"status_url": "/v1/requests/req_123",
"estimated_seconds": 20
}
GET /v1/requests/req_123
Response 200
{
"status": "processing",
"progress": 0.6,
"result_url": null,
"error": null
}
Idempotency ensures that repeated POSTs with the same Idempotency-Key and body return the same request_id and response instead of starting a second job.
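One way the server side of this example could handle the Idempotency-Key is sketched below. It is a simplified illustration: the in-memory dictionary stands in for a durable store, accept_job is a hypothetical name, and hashing a key-sorted JSON serialization is just one possible way to fingerprint the request.

```python
import hashlib
import json
import uuid

# In-memory stand-in for a durable store: Idempotency-Key -> (body hash, response).
_idempotency_store = {}


def accept_job(idempotency_key: str, body: dict) -> tuple[int, dict]:
    """Return (status_code, response_body) for POST /v1/images:process."""
    # Hash a canonical serialization so key order does not change the fingerprint.
    body_hash = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode("utf-8")
    ).hexdigest()

    stored = _idempotency_store.get(idempotency_key)
    if stored is not None:
        stored_hash, stored_response = stored
        if stored_hash != body_hash:
            return 409, {"error": "idempotency key reused with different payload"}
        return 202, stored_response  # replay the original response

    request_id = f"req_{uuid.uuid4().hex[:12]}"
    response = {"request_id": request_id, "status_url": f"/v1/requests/{request_id}"}
    _idempotency_store[idempotency_key] = (body_hash, response)
    # A real service would enqueue the job here, before returning.
    return 202, response
```

Calling accept_job twice with the same key and body yields the same request_id; the same key with a different body yields 409.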
Example 2: Signed webhook delivery with retries
POST https://consumer.example.com/webhooks
Headers:
X-Event-Id: evt_789
X-Event-Type: image.processed
X-Signature: t=1712345678,v1=hex(hmac_sha256(secret, timestamp + "." + raw_body))
Body:
{
"event_id":"evt_789",
"type":"image.processed",
"occurred_at":"2026-01-20T12:00:01Z",
"data":{"request_id":"req_123","result_url":"/v1/results/r_456"}
}
Receiver verifies timestamp freshness and HMAC over the raw body. If verification fails, respond 4xx (Example 3 uses 400); if processing fails for a transient reason, respond 5xx. Producer retries with backoff on non-2xx responses.
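The signing scheme in the X-Signature header above can be sketched in Python with the standard hmac module. The function names sign and verify and the 300-second tolerance are assumptions for illustration; the header layout (t=...,v1=...) and the timestamp + "." + raw_body payload follow the example.

```python
import hashlib
import hmac
import time


def sign(secret: bytes, raw_body: bytes, timestamp: int) -> str:
    """Build the X-Signature value: t=<timestamp>,v1=<hex HMAC-SHA256>."""
    signed_payload = str(timestamp).encode("ascii") + b"." + raw_body
    digest = hmac.new(secret, signed_payload, hashlib.sha256).hexdigest()
    return f"t={timestamp},v1={digest}"


def verify(secret: bytes, raw_body: bytes, header: str,
           tolerance_seconds: int = 300, now=None) -> bool:
    """Check timestamp freshness and the HMAC over the raw body bytes."""
    try:
        parts = dict(item.split("=", 1) for item in header.split(","))
        timestamp = int(parts["t"])
        received = parts["v1"]
    except (KeyError, ValueError):
        return False  # malformed header
    current = time.time() if now is None else now
    if abs(current - timestamp) > tolerance_seconds:
        return False  # outside the freshness window: possible replay
    expected = sign(secret, raw_body, timestamp).split("v1=", 1)[1]
    # Constant-time comparison avoids leaking how much of the digest matched.
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))
```

The receiver must pass the exact bytes it read from the request; re-serializing parsed JSON can change whitespace or key order and break the comparison.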
Example 3: Idempotent webhook handler
if (!verifySignature(headers, rawBody, secret)) return 400;
if (isReplayed(headers["X-Event-Id"])) return 200; // Already processed
// Perform durable write BEFORE acknowledging
saveEvent(rawBody);
markProcessed(headers["X-Event-Id"]);
return 200;
This ensures duplicates do not trigger repeated side effects.
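The isReplayed and markProcessed helpers above are left undefined; a minimal Python sketch of the same flow with a TTL-based dedup store follows. SeenEvents and handle_event are hypothetical names, the dictionary stands in for Redis or a database table, and the sink list stands in for durable storage.

```python
import time


class SeenEvents:
    """In-memory TTL set of processed event IDs (stand-in for a shared store)."""

    def __init__(self, ttl_seconds: float):
        self.ttl_seconds = ttl_seconds
        self._expiry = {}  # event_id -> time at which the entry expires

    def contains(self, event_id, now=None):
        now = time.monotonic() if now is None else now
        expires_at = self._expiry.get(event_id)
        if expires_at is None:
            return False
        if expires_at <= now:
            del self._expiry[event_id]  # lazily evict expired entries
            return False
        return True

    def add(self, event_id, now=None):
        now = time.monotonic() if now is None else now
        self._expiry[event_id] = now + self.ttl_seconds


def handle_event(event: dict, seen: SeenEvents, sink: list, now=None) -> int:
    """Return the HTTP status for one delivery of an already-verified event."""
    event_id = event["event_id"]
    if seen.contains(event_id, now):
        return 200  # duplicate: acknowledge without repeating side effects
    sink.append(event)       # durable write first
    seen.add(event_id, now)  # then mark as processed
    return 200
```

The TTL should be at least as long as the producer's retry window (24h in Example 4), otherwise a late retry is treated as new. The check-then-write here is not atomic; with concurrent workers, a unique constraint on event_id in the durable store is a safer guard.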
Example 4: Failure visibility with dead letter
// Producer retry policy
attempts: up to 10
backoff: exponential with jitter (initial 2s, cap 10m)
stop: after 24h or max attempts, whichever comes first
on stop: move payload to DLQ with last error and delivery history
Operators can inspect DLQ and re-drive safely.
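The policy above might be implemented roughly as follows. This is a sketch under assumptions: send is a caller-supplied function (hypothetical) that returns the receiver's HTTP status, the dead-letter queue is a plain list, "full jitter" (a uniform random delay up to the exponential ceiling) is one of several jitter variants, and the 24h window is approximated by summing sleep time rather than measuring wall-clock time.

```python
import random
import time


def backoff_ceiling(attempt: int, initial: float = 2.0, cap: float = 600.0) -> float:
    """Upper bound on the delay after failed attempt number `attempt` (1-based)."""
    return min(cap, initial * 2 ** (attempt - 1))


def deliver(send, payload, dead_letters, max_attempts=10, max_window=24 * 3600,
            sleep=time.sleep, rng=random.random):
    """Call send(payload) until it returns 2xx or the retry policy says stop."""
    history = []
    waited = 0.0
    for attempt in range(1, max_attempts + 1):
        status = send(payload)
        history.append(status)
        if 200 <= status < 300:
            return True
        delay = rng() * backoff_ceiling(attempt)  # full jitter
        if attempt == max_attempts or waited + delay > max_window:
            break
        sleep(delay)
        waited += delay
    # Retries exhausted: park the payload with its delivery history for operators.
    dead_letters.append(
        {"payload": payload, "last_status": history[-1], "history": history}
    )
    return False
```

With an initial delay of 2s and a 10m cap, the ceilings run 2, 4, 8, ... up to 600 seconds, so ten attempts finish well inside the 24h window and the attempt limit is the condition that normally triggers the move to the DLQ.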
How to design an async API and webhooks (step-by-step)
- Define the work unit: inputs, validation rules, and maximum processing time.
- Design the accept endpoint: POST returns 202 + request_id + status_url. Support Idempotency-Key.
- Define a status resource with clear states and terminal outcomes. Include result references.
- Design event schema: event_id, type, occurred_at, data, version.
- Choose signature scheme: HMAC SHA-256 over raw body with timestamp and versioned header.
- Document retry policy and what receiver status codes mean.
- Plan idempotency and deduplication for both producer and consumer.
- Add observability: correlation IDs, metrics, and logs.
- Test with chaos: duplicate events, out-of-order delivery, timeouts, and partial outages.
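The event-schema step can be made concrete with a small envelope builder. This sketch assumes the field name event_version (one of the two names suggested earlier) and a hypothetical build_event helper; the important detail is that the event is serialized once and those exact bytes are both signed and sent.

```python
import json
import uuid
from datetime import datetime, timezone


def build_event(event_type: str, data: dict, event_version: str = "1") -> bytes:
    """Serialize an event envelope once; sign and send these exact bytes."""
    envelope = {
        "event_id": f"evt_{uuid.uuid4().hex}",
        "type": event_type,
        "occurred_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "event_version": event_version,
        "data": data,
    }
    # Compact separators keep the payload small; any stable encoding works
    # as long as the signed bytes and the transmitted bytes are identical.
    return json.dumps(envelope, separators=(",", ":")).encode("utf-8")
```

On redelivery, the producer should resend the stored bytes (including the same event_id) rather than rebuilding the envelope, so receivers can deduplicate.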
Recommended status codes
- 202 Accepted: work queued
- 200 OK: status fetch or successful webhook handling
- 409 Conflict: idempotency key reused with different payload
- 410 Gone: webhook endpoint disabled or subscription cancelled
- 429 Too Many Requests: rate limiting; include Retry-After
- 5xx: transient error to trigger producer retry
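From the producer's side, these codes can be turned into a small decision function. The sketch below is one possible policy, not a standard: next_action and its action labels are hypothetical, and it assumes Retry-After has already been parsed into seconds (the header may also carry an HTTP date).

```python
def next_action(status: int, retry_after_seconds=None):
    """Map a receiver's HTTP status to (action, delay_override_in_seconds)."""
    if 200 <= status < 300:
        return ("done", None)
    if status == 410:
        # Endpoint disabled or subscription cancelled: stop sending.
        return ("disable", None)
    if status == 429 and retry_after_seconds is not None:
        # Honor the receiver's requested wait instead of the normal backoff.
        return ("retry", retry_after_seconds)
    # Any other non-2xx response: retry with the normal backoff schedule.
    return ("retry", None)
```

Some producers choose to stop retrying on other 4xx responses as well, since resending an unchanged payload rarely fixes a client error; that is a policy decision to document alongside the retry schedule.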
Exercises
Exercise 1 — Design an async image processing API
Goal: Specify endpoints, payloads, status states, and idempotency for an image processing service that resizes images.
- Define the POST to start processing. Include headers, body, and a 202 response shape.
- Define the status endpoint with states and fields.
- Define idempotency behavior for the POST.
- List possible errors and status codes.
Hints
- Return a status URL; do not block the POST.
- Store a hash of request body per Idempotency-Key.
- Include a result_url only when done.
Exercise 2 — Build a safe webhook receiver
Goal: Outline a handler that verifies signatures, deduplicates, and writes durably before acknowledging.
- Describe how you verify HMAC signature over the raw body with a timestamp.
- Show how you detect duplicate event_id and make the operation idempotent.
- State which status code to return on temporary failures vs duplicates.
- Add a retry/backoff policy recommendation for the sender.
Hints
- Store processed event IDs with a TTL.
- Respond 2xx only after the durable step completes.
- Use 5xx to trigger retries on transient errors.
Exercise checklist
- You used 202 + status URL for long work
- Idempotency-Key and conflict detection are defined
- Webhook signature verification uses raw body and timestamp
- Deduplication on event_id with TTL/durable store
- Clear retry policy with backoff and stop conditions
Common mistakes and self-check
- No idempotency on POST or webhooks – Self-check: Can the same request/event be processed twice without harm?
- Verifying signatures over parsed JSON – Self-check: Are you using the raw body bytes?
- Acknowledging before durable write – Self-check: Do you return 2xx only after the critical side effect?
- Infinite retries – Self-check: Do you have max attempts and DLQ?
- Assuming event order – Self-check: Does each event carry enough state to be processed independently?
- Leaking secrets in logs – Self-check: Are headers like Signature redacted?
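For the event-order self-check, one common approach is to keep the timestamp of the last applied event per resource and ignore anything older. The sketch below assumes events shaped like Example 2 (occurred_at plus data.request_id); apply_if_newer is a hypothetical helper and the state dictionary stands in for a database row.

```python
from datetime import datetime


def parse_ts(value: str) -> datetime:
    """Parse an RFC 3339 UTC timestamp such as 2026-01-20T12:00:01Z."""
    # Replace the trailing Z so older Python versions of fromisoformat accept it.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def apply_if_newer(state: dict, event: dict) -> bool:
    """Update per-request state only when the event is newer than what is stored."""
    key = event["data"]["request_id"]
    occurred_at = parse_ts(event["occurred_at"])
    current = state.get(key)
    if current is not None and current["occurred_at"] >= occurred_at:
        return False  # stale or duplicate event: ignore
    state[key] = {"occurred_at": occurred_at, "type": event["type"]}
    return True
```

Timestamps with one-second resolution can tie; a per-resource sequence number in the payload, if the producer offers one, gives a stricter ordering.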
Practical projects
- Webhook sender and receiver: Build a toy service that emits events and a receiver that validates HMAC, dedups, and stores results.
- Async job pipeline: Accept image jobs, enqueue them, simulate processing, and expose a status endpoint.
- Chaos tests: Randomly duplicate, reorder, and delay events; your receiver should still be correct.
Learning path
- Design the accept-work + status pattern
- Add idempotency keys and conflict rules
- Introduce webhook delivery with signatures and retries
- Harden with observability, DLQ, and versioning
- Explore streaming (SSE/WebSockets) and AsyncAPI specs
Next steps
- Adopt a consistent event schema with versioning and correlation IDs
- Experiment with backoff strategies and jitter
- Evaluate message brokers for scale (e.g., queues and topics)
- Study distributed transaction patterns (outbox, saga)
Mini challenge
You run a document-signing API. A signing operation can take up to 2 minutes. Design the minimal set of endpoints and events so clients can: start signing, check status, and be notified on completion. Include: status states, 202 response, webhook event shape, and failure handling.