Why this matters
Webhooks and callbacks let services notify each other without polling. As a Backend Engineer, you will:
- Receive external event notifications (payments, shipments, auth events) reliably and securely.
- Send callbacks to clients or partner systems when your long-running jobs finish.
- Design retry, idempotency, and monitoring so events are not lost and duplicate deliveries do not cause duplicate side effects.
- Debug real-world delivery issues (timeouts, 4xx/5xx, replays) under production load.
Who this is for
- Backend Engineers building integrations between internal microservices or third-party systems.
- Developers migrating from synchronous APIs to event-driven patterns.
Prerequisites
- HTTP basics (methods, status codes, headers, timeouts).
- JSON serialization and schema design.
- One server framework in any language (e.g., Node, Python, Go).
- Basic hashing/HMAC knowledge for signing.
Concept explained simply
Webhook: The sender pushes an HTTP request to your endpoint when an event happens. You acknowledge fast (2xx), then process asynchronously.
Callback: You accept a job request that includes a callback URL. When you finish the job, you call back that URL with the result.
Mental model
- Think "postal mail with receipts": sender drops a letter (request), you quickly sign the receipt (2xx), then sort and handle the mail internally (queue/worker). If the sender doesn’t get a receipt, it resends later.
- Every letter has a unique ID (idempotency key) so duplicates don’t create duplicate effects.
- Letters are sealed and stamped (HMAC signatures + timestamps) so you can trust origin and freshness.
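The "unique ID per letter" idea can be sketched in a few lines. This is a minimal illustration, not a production design: `handle_event`, `processed_ids`, and `balances` are hypothetical names, and the in-memory set stands in for a durable store.

```python
# Hypothetical names throughout; a real receiver would use a durable store
# (for example a database table with a unique constraint) instead of a set.
processed_ids = set()
balances = {"acct_1": 0}

def handle_event(event_id: str, account: str, amount: int) -> bool:
    """Apply the event once; return False if it was a duplicate delivery."""
    if event_id in processed_ids:
        return False  # duplicate: still acknowledge it, but do nothing
    processed_ids.add(event_id)
    balances[account] += amount
    return True

first = handle_event("evt-1", "acct_1", 50)
second = handle_event("evt-1", "acct_1", 50)  # the sender resent the same letter
print(first, second, balances["acct_1"])      # True False 50
```

The second delivery is acknowledged like the first, so the sender stops retrying, but the balance changes only once.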
Key design choices
- Reliability: return 2xx fast, enqueue work for later processing, and retry with exponential backoff and jitter.
- Idempotency and ordering: store a dedupe key and handle out-of-order events safely.
- Security: verify HMAC signatures; check timestamps; use allowlists or mTLS where available.
- Observability: log event IDs, delivery attempts, latencies; emit metrics (success/failure rates).
- Versioning: version payloads; support schema evolution with additive changes.
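One common way to handle out-of-order events safely is to keep the latest version seen per entity and ignore anything older. The sketch below assumes each event carries a per-order, monotonically increasing `version` field; that field and the names `apply_update` and `latest` are illustrative assumptions, not part of any specific provider's payload.

```python
# Assumption: the sender stamps every event with a per-order "version" that only grows.
latest = {}  # order_id -> (version, status)

def apply_update(order_id: str, version: int, status: str) -> bool:
    """Apply the update only if it is newer than what is already stored."""
    current = latest.get(order_id)
    if current is not None and version <= current[0]:
        return False  # stale or duplicate event: ignore it
    latest[order_id] = (version, status)
    return True

apply_update("o-42", 1, "created")
apply_update("o-42", 3, "shipped")
stale = apply_update("o-42", 2, "paid")  # arrives late
print(stale, latest["o-42"])             # False (3, 'shipped')
```

If events carry no version or sequence number, a timestamp set by the sender can play a similar role, with the usual caveats about clock skew.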
Worked examples
Example 1: Webhook receiver with HMAC verification (Node.js)
// Concept: verify signature, ack fast, enqueue later
const express = require('express');
const crypto = require('crypto');

const app = express();

// Capture the raw body: the signature covers the exact bytes that were sent
app.use(express.json({ verify: (req, res, buf) => { req.rawBody = buf; } }));

const SECRET = process.env.WEBHOOK_SECRET || 'change-me'; // set a real secret in production

function verifySignature(raw, signature, timestamp) {
  if (!raw || !signature || !timestamp) return false;

  // Prevent replay: reject non-numeric or old timestamps (e.g., > 5 minutes)
  const now = Math.floor(Date.now() / 1000);
  const ts = Number(timestamp);
  if (!Number.isFinite(ts) || Math.abs(now - ts) > 300) return false;

  const expected = crypto.createHmac('sha256', SECRET)
    .update(`${timestamp}.`)
    .update(raw)
    .digest('hex');

  // timingSafeEqual throws if the lengths differ, so compare lengths first
  const a = Buffer.from(expected);
  const b = Buffer.from(signature);
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}

app.post('/webhooks/orders', (req, res) => {
  const sig = req.header('X-Signature');
  const ts = req.header('X-Timestamp');
  if (!verifySignature(req.rawBody, sig, ts)) {
    return res.status(400).json({ error: 'invalid signature' });
  }

  // The fallback ID is unique per request, so it cannot dedupe retries;
  // senders should always supply Idempotency-Key
  const eventId = req.header('Idempotency-Key') || `evt_${Date.now()}`;

  // Deduplicate: insert eventId into a store; skip if already processed
  // Enqueue for async processing
  console.log('Accepted event', { eventId, type: req.body.type });

  // Acknowledge fast
  return res.status(202).json({ status: 'accepted', eventId });
});

app.listen(3000, () => console.log('Receiver listening on 3000'));
- Why it works: signature and timestamp block tampering/replays; 202 Accepted signals async processing; idempotency key prevents duplicates.
Example 2: Webhook sender with retries and idempotency (Python)
# Concept: exponential backoff + jitter, include Idempotency-Key
import hashlib
import hmac
import json
import os
import random
import time
import urllib.error
import urllib.request

SECRET = os.getenv('WEBHOOK_SECRET', 'change-me')

def sign(body_bytes, timestamp):
    payload = f"{timestamp}.".encode() + body_bytes
    return hmac.new(SECRET.encode(), payload, hashlib.sha256).hexdigest()

def post_with_retries(url, body):
    body_bytes = json.dumps(body).encode()
    idem = body.get('idempotency_key', f"evt_{int(time.time() * 1000)}")
    t = str(int(time.time()))
    sig = sign(body_bytes, t)
    base = 0.5
    max_attempts = 5
    for attempt in range(1, max_attempts + 1):
        req = urllib.request.Request(url, data=body_bytes, method='POST')
        req.add_header('Content-Type', 'application/json')
        req.add_header('X-Timestamp', t)
        req.add_header('X-Signature', sig)
        req.add_header('Idempotency-Key', idem)
        try:
            # urlopen only returns normally for 2xx; 4xx/5xx raise HTTPError
            with urllib.request.urlopen(req, timeout=5) as resp:
                print('Delivered', resp.getcode())
                return True
        except urllib.error.HTTPError as e:
            if 400 <= e.code < 500:
                print('Permanent failure', e.code)
                return False
            reason = f"retryable status {e.code}"
        except OSError as e:
            # Covers URLError, connection errors, and socket timeouts
            reason = f"network error: {e}"
        if attempt < max_attempts:
            sleep_s = min(base * (2 ** (attempt - 1)) + random.random(), 10)
            print(f"Attempt {attempt} failed: {reason}. Retrying in {sleep_s:.2f}s")
            time.sleep(sleep_s)
        else:
            print(f"Attempt {attempt} failed: {reason}. Giving up")
    return False

# Example usage
post_with_retries('http://localhost:3000/webhooks/orders', {
    'idempotency_key': 'evt-12345',
    'type': 'order.created',
    'data': {'order_id': 42}
})
- Why it works: backoff controls load; jitter reduces thundering herd; 4xx stops retries; 5xx/timeouts retry.
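To make the retry timing concrete, the schedule for a base delay of 0.5 s and 5 attempts can be computed directly. This is a small sketch under stated assumptions: `backoff_delays` is a hypothetical helper, the jitter source is injectable so the arithmetic can be checked without randomness, and no wait is scheduled after the final attempt.

```python
import random

def backoff_delays(base=0.5, max_attempts=5, cap=10.0, rng=random.random):
    """Delay before each retry: base * 2^(n-1) plus jitter in [0, 1), capped."""
    delays = []
    for attempt in range(1, max_attempts):  # no wait after the final attempt
        delays.append(min(base * 2 ** (attempt - 1) + rng(), cap))
    return delays

no_jitter = backoff_delays(rng=lambda: 0.0)
print(no_jitter)       # [0.5, 1.0, 2.0, 4.0]
print(sum(no_jitter))  # 7.5
```

Without jitter the sender waits 7.5 s in total across four retries; with up to 1 s of jitter per retry, the total stays below 11.5 s, plus whatever time the requests themselves take.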
Example 3: Callback pattern for long-running jobs (Go)
// Concept: accept a task and later POST to provided callback URL
package main

import (
	"bytes"
	"encoding/json"
	"log"
	"net/http"
	"time"
)

type JobRequest struct {
	Input       string `json:"input"`
	CallbackURL string `json:"callback_url"`
}

type JobResult struct {
	Status string `json:"status"`
	Output string `json:"output"`
}

func main() {
	http.HandleFunc("/jobs", func(w http.ResponseWriter, r *http.Request) {
		var req JobRequest
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil || req.CallbackURL == "" {
			http.Error(w, "bad request", http.StatusBadRequest)
			return
		}
		// In production, validate the callback URL (e.g., against an allowlist) first
		go func(callback string, input string) {
			// Simulate work
			time.Sleep(2 * time.Second)
			result := JobResult{Status: "done", Output: input + "_processed"}
			body, err := json.Marshal(result)
			if err != nil {
				log.Printf("marshal result: %v", err)
				return
			}
			// Best practice: include idempotency key and signature headers here
			client := &http.Client{Timeout: 5 * time.Second}
			resp, err := client.Post(callback, "application/json", bytes.NewReader(body))
			if err != nil {
				log.Printf("callback to %s failed: %v", callback, err)
				return
			}
			defer resp.Body.Close()
			log.Printf("callback to %s returned %d", callback, resp.StatusCode)
		}(req.CallbackURL, req.Input)
		w.Header().Set("Content-Type", "application/json")
		w.WriteHeader(http.StatusAccepted)
		w.Write([]byte(`{"status":"queued"}`))
	})
	log.Println("Job API on :8080")
	log.Fatal(http.ListenAndServe(":8080", nil))
}
- Why it works: the client doesn’t wait for completion; you call back with the result when the job finishes. Add signatures, idempotency keys, and retries for production.
Security and reliability checklist
- Verify signature and timestamp for every request.
- Respond quickly (2xx/202) and process asynchronously.
- Use idempotency keys to deduplicate.
- Implement retries with exponential backoff + jitter.
- Reject old timestamps to prevent replay attacks.
- Log event IDs, attempt counts, and outcomes.
- Use queues or durable storage to avoid losing events.
- Prefer additive schema changes; version payloads when needed.
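The last checklist item, additive schema changes, works best when consumers read only the fields they need and tolerate the rest. A minimal sketch, assuming a hypothetical `version` field and a hypothetical `currency` field added in a later payload version (the `"USD"` default is likewise only an assumption for illustration):

```python
import json

def parse_order_event(raw: str) -> dict:
    """Read only the fields this consumer needs; ignore unknown ones."""
    doc = json.loads(raw)
    data = doc["data"]
    return {
        "version": doc.get("version", 1),          # older senders omit it
        "order_id": data["order_id"],              # required in every version
        "currency": data.get("currency", "USD"),   # hypothetical field added later
    }

v1 = '{"type": "order.created", "data": {"order_id": 42}}'
v2 = ('{"version": 2, "type": "order.created", '
      '"data": {"order_id": 42, "currency": "EUR", "gift": true}}')
print(parse_order_event(v1))
print(parse_order_event(v2))
```

Both payloads parse with the same code: the old one gets defaults for fields it lacks, and the new one's extra `gift` field is simply ignored.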
Common mistakes and self-check
- Doing heavy work before responding 2xx. Self-check: can your handler return in <100ms under load?
- Skipping signature verification. Self-check: try altering the payload—does your server reject it?
- No idempotency. Self-check: replay the same event—does it create duplicates?
- Unbounded retries. Self-check: simulate a persistent 400 and a persistent 500. Do you stop at once on the first and after the attempt cap on the second?
- Ignoring timeouts. Self-check: does your sender limit request time and log timeouts?
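The signature self-check can be automated as a small test. The sketch below assumes the same scheme as the worked examples (HMAC-SHA256 over "timestamp.rawBody" with a 5-minute window); `SECRET`, `sign`, and `verify` are illustrative names, and the secret is a throwaway value for local testing only.

```python
import hashlib
import hmac
import time

SECRET = b"test-secret"  # assumption: throwaway secret for local testing
MAX_SKEW = 300           # seconds, matching the 5-minute window

def sign(body: bytes, timestamp: str) -> str:
    return hmac.new(SECRET, timestamp.encode() + b"." + body, hashlib.sha256).hexdigest()

def verify(body: bytes, timestamp: str, signature: str, now: float) -> bool:
    if not timestamp.isdigit() or abs(now - int(timestamp)) > MAX_SKEW:
        return False
    # compare_digest runs in constant time for equal-length inputs
    return hmac.compare_digest(sign(body, timestamp).encode(), signature.encode())

now = time.time()
ts = str(int(now))
body = b'{"type":"order.created"}'
sig = sign(body, ts)

print(verify(body, ts, sig, now))                         # True: untouched
print(verify(b'{"type":"order.deleted"}', ts, sig, now))  # False: tampered body
old_ts = str(int(now) - 3600)
print(verify(body, old_ts, sign(body, old_ts), now))      # False: stale timestamp
```

The third case is the replay scenario: the signature is valid for its timestamp, but the timestamp is an hour old, so the request is rejected anyway.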
Learning path
- Before: HTTP/REST, JSON schemas, authentication basics.
- Now: Webhooks and callbacks (this lesson).
- Next: Message queues, event buses, outbox pattern, circuit breakers.
Hands-on steps
- Create a /webhooks endpoint that returns 202 and logs the event ID.
- Add HMAC verification with timestamp and constant-time compare.
- Store idempotency keys in a persistent store to dedupe.
- Move processing to a background worker.
- Add metrics: deliveries, failures, average latency.
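The "acknowledge, enqueue, process in the background, count outcomes" steps can be prototyped with the standard library before introducing a real queue. This is a sketch only: `receive`, `worker`, and `metrics` are hypothetical names, and an in-memory `queue.Queue` loses events if the process crashes, so it stands in for a durable queue.

```python
import queue
import threading
from collections import Counter

events = queue.Queue()  # stand-in for a durable queue
metrics = Counter()
processed = []

def worker():
    while True:
        event = events.get()
        if event is None:  # sentinel: shut down
            events.task_done()
            break
        try:
            processed.append(event["id"])  # stand-in for real work
            metrics["success"] += 1
        except Exception:
            metrics["failure"] += 1
        finally:
            events.task_done()

def receive(event: dict) -> int:
    """What the HTTP handler would do: enqueue and acknowledge."""
    events.put(event)
    metrics["accepted"] += 1
    return 202

threading.Thread(target=worker, daemon=True).start()
codes = [receive({"id": f"evt-{i}"}) for i in range(3)]
events.put(None)
events.join()  # wait until the worker has drained the queue
print(codes, processed, metrics["accepted"], metrics["success"])
```

The handler's cost is one `put` call, which is what keeps the acknowledgement fast regardless of how long the real work takes.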
Exercises
Do these in order. Compare your answers with the solutions below each exercise.
Exercise 1 — Build a secure webhook receiver
- Create POST /webhooks/orders.
- Verify X-Signature (HMAC-SHA256 over "timestamp.rawBody") and X-Timestamp (±5 minutes).
- If valid, return 202 with {"status":"accepted"} and log Idempotency-Key.
- Do not perform heavy work in the request thread.
Hints
- Read raw body for signature verification.
- Use constant-time comparison for signatures.
- Use a queue or an in-memory channel to simulate async processing.
// Pseudocode outline
on POST /webhooks/orders:
    raw = getRawBody()
    ts = header('X-Timestamp')
    sig = header('X-Signature')
    assert abs(now - ts) <= 300
    expected = HMAC_SHA256(secret, ts + '.' + raw)
    assert timingSafeEqual(expected, sig)
    eventId = header('Idempotency-Key')
    log('accepted', eventId)
    enqueue(raw)
    return 202, { status: 'accepted', eventId }
Exercise 2 — Implement a resilient webhook sender
- Write a script that POSTs JSON to /webhooks/orders with Idempotency-Key, X-Timestamp, X-Signature.
- Retry on timeouts and 5xx with exponential backoff + jitter (max 5 attempts).
- Stop on 2xx or any 4xx.
Hints
- Base delay 0.5s; multiply by 2 each retry, add random(0-1)s jitter.
- Log attempt number and result for visibility.
# Python-like pseudocode
for attempt in 1..5:
    try:
        POST
        if 2xx: done
        if 4xx: abort
        else: raise
    catch:
        sleep = base * 2^(attempt-1) + rand(0..1)
        wait(sleep)
Self-checklist
- Receiver verifies signatures and timestamps.
- Receiver responds < 100ms and enqueues work.
- Sender retries with backoff and jitter.
- Idempotency keys are stored and deduplicated.
- Logs include event IDs and outcomes.
Practical projects
- Delivery tracker: Receiver accepts delivery updates; dashboard shows latest event per package with dedup and retry counts.
- Image processing job: Accept uploads, return 202, callback with processed URL. Include signature verification on callback.
- Audit pipeline: Receive business events, validate schema, write to append-only store; expose replay and dead-letter viewing.
Next steps
- Add dead-letter queues for events that fail after max retries.
- Introduce the outbox pattern to ensure atomic publish with database changes.
- Expose a retry endpoint to re-drive specific event IDs safely.
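The outbox pattern mentioned above can be sketched with SQLite from the standard library. The idea is that the business row and the outgoing event are written in one transaction, and a separate relay delivers unsent rows. Table names, `create_order`, and `relay` are illustrative assumptions, not a prescribed schema.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
db.execute("CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT, "
           "payload TEXT, sent INTEGER DEFAULT 0)")

def create_order(order_id: int) -> None:
    """Write the business row and the outgoing event in one transaction."""
    with db:  # commits on success, rolls back if either insert raises
        db.execute("INSERT INTO orders VALUES (?, 'created')", (order_id,))
        db.execute("INSERT INTO outbox (payload) VALUES (?)",
                   (json.dumps({"type": "order.created", "order_id": order_id}),))

def relay(send) -> int:
    """Deliver unsent rows; mark each as sent only after send() returns."""
    rows = db.execute(
        "SELECT id, payload FROM outbox WHERE sent = 0 ORDER BY id").fetchall()
    for row_id, payload in rows:
        send(json.loads(payload))
        with db:
            db.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (row_id,))
    return len(rows)

create_order(42)
delivered = []
first_pass = relay(delivered.append)
second_pass = relay(delivered.append)  # nothing left to send
print(first_pass, second_pass, delivered)
```

If the relay crashes between `send` and the `UPDATE`, the row is sent again on the next pass, which is why the receiver still needs idempotency keys.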
Mini challenge
Simulate a burst of 100 events where 20% randomly fail with 5xx. Instrument your sender to achieve at least 98% final delivery with backoff+jitter and no duplicates on the receiver. Tip: use idempotency and logging to verify.
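Before wiring up real HTTP, the challenge can be rehearsed in memory. With independent 20% failures and 5 attempts, the chance that a single event exhausts every attempt is 0.2^5 = 0.032%, so the 98% target is reachable by retries alone. The sketch below is a simulation under assumptions: the receiver fails after applying the effect (the worst case for duplicates), sleeping is omitted, and all names are hypothetical.

```python
import random

rng = random.Random(7)  # fixed seed so runs are repeatable
FAIL_RATE, MAX_ATTEMPTS = 0.20, 5
effects = {}  # receiver side: event_id -> times the effect ran
seen = set()

def receiver(event_id: str) -> int:
    """Apply the effect once per ID, then sometimes fail with a 5xx."""
    if event_id not in seen:
        seen.add(event_id)
        effects[event_id] = effects.get(event_id, 0) + 1
    return 503 if rng.random() < FAIL_RATE else 202

def deliver(event_id: str) -> bool:
    for attempt in range(1, MAX_ATTEMPTS + 1):
        if receiver(event_id) == 202:
            return True
        # a real sender would sleep base * 2^(attempt-1) + jitter here
    return False  # candidate for a dead-letter queue

results = [deliver(f"evt-{i}") for i in range(100)]
print("confirmed:", sum(results), "of 100")
print("max effects per event:", max(effects.values()))
```

Removing the `seen` check shows the duplicate problem directly: every retried event then applies its effect more than once.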