What you’ll learn
You will learn how to design structured JSON logs and use correlation IDs to tie together all log entries for a single operation across services, queues, and databases. You’ll implement correlation ID handling as middleware, propagate the ID to downstream calls, and verify that it works end-to-end.
Who this is for
- API engineers shipping REST/GraphQL/gRPC services
- Developers responsible for incident response and on-call
- Engineers adding observability without heavy vendor tools
Prerequisites
- HTTP basics (requests, headers, status codes)
- Ability to write a simple API service (any language)
- Basic JSON understanding
Why this matters
- Real on-call tasks: find all logs related to a failing request across 3 services, 1 queue worker, and a DB transaction within minutes.
- SLI/SLO investigations: understand latency and errors along a single path.
- Compliance and audit: show what happened for a given customer action without exposing sensitive data.
Concept explained simply
Structured logs are logs written as consistent JSON objects (not free-form text). A correlation ID (also called request ID or trace ID) is a unique value you attach to all logs belonging to the same operation. When a service calls another service, the ID is passed along so every log in the chain carries the same ID.
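To make the contrast concrete, here is a minimal sketch in Python. The field names, the sample ID, and the helper `matches` are illustrative assumptions, not a required schema.

```python
import json

# Free-form text: readable by a human, but a machine needs a fragile regex.
text_line = "2024-05-01 12:00:00 INFO order 42 charged for request abc123"

# Structured: the same event as one JSON object per line.
# Field names and values here are illustrative assumptions.
structured_line = json.dumps({
    "ts": "2024-05-01T12:00:00Z",
    "level": "info",
    "message": "order_charged",
    "correlation_id": "abc123",
    "order_id": 42,
})


def matches(line: str, correlation_id: str) -> bool:
    """Return True if a JSON log line belongs to the given operation."""
    try:
        entry = json.loads(line)
    except json.JSONDecodeError:
        return False  # not structured, so it cannot be filtered reliably
    return isinstance(entry, dict) and entry.get("correlation_id") == correlation_id
```

Filtering the structured line is an exact key lookup; the free-form line can only be matched by guessing where the ID sits in the text.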
Mental model
Imagine a shipping label stuck to a package at the first warehouse. No matter how many trucks, warehouses, or workers handle it, the label stays on. The correlation ID is that label for a request. Structured logs are the standardized form you fill out at each step, so machines can search and filter them reliably.
Key design choices
- ID format: Use a UUID v4 or 16–32 random bytes encoded as hex. Keep it lowercase and URL-safe.
- Header name: Accept common incoming headers such as 'traceparent' (W3C Trace Context), 'x-request-id', or 'x-correlation-id'. Propagate in 'traceparent' if you use W3C Trace Context (its second dash-separated field is the trace ID); otherwise use 'x-request-id'.
- Generation rule: If no ID is present on ingress, generate one at the edge (API Gateway or the first service) and reuse it everywhere. Do not regenerate mid-flow.
- Structured JSON schema: At minimum include: timestamp (ISO 8601; abbreviated to 'ts' in the examples below), level, message, correlation_id, service, environment, path/method (if HTTP), duration_ms (for spans), and safe context fields (user_id hash, order_id).
- Redaction: Never log secrets, tokens, or PII in raw form. Hash or drop sensitive fields.
- Volume control: Use appropriate levels (debug/info/warn/error). Sample debug logs in high-traffic paths.
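The ID format, header, and generation rules above can be sketched as one small resolver. This is a sketch under stated assumptions: the function names are hypothetical, header names are assumed to be lowercased by the caller, and the precedence (x-request-id, then x-correlation-id, then traceparent) mirrors the Node.js example rather than any standard.

```python
import re
import uuid
from typing import Mapping

# W3C traceparent layout: version-traceid-parentid-flags, all lowercase hex.
_TRACEPARENT = re.compile(r"^[0-9a-f]{2}-([0-9a-f]{32})-[0-9a-f]{16}-[0-9a-f]{2}$")


def new_correlation_id() -> str:
    """UUID v4 rendered as 32 lowercase hex characters (URL-safe)."""
    return uuid.uuid4().hex


def resolve_correlation_id(headers: Mapping[str, str]) -> str:
    """Reuse an incoming ID if present; otherwise generate one at the edge.

    Assumes header names were already lowercased by the caller.
    """
    for name in ("x-request-id", "x-correlation-id"):
        value = headers.get(name, "").strip()
        if value:
            return value
    match = _TRACEPARENT.match(headers.get("traceparent", ""))
    if match and match.group(1) != "0" * 32:  # an all-zero trace ID is invalid
        return match.group(1)
    return new_correlation_id()
```

Call `resolve_correlation_id` exactly once per request at ingress and pass the result along; calling it again deeper in the flow is how IDs get regenerated by accident.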
Worked examples
Example 1: Node.js (Express) middleware with JSON logs
// Generate/propagate correlation ID and log per request
const express = require('express');
const { randomUUID } = require('crypto');

const app = express();

function getCorrelationId(req) {
  return (
    req.headers['x-request-id'] ||
    req.headers['x-correlation-id'] ||
    (req.headers['traceparent'] ? req.headers['traceparent'].split('-')[1] : null) ||
    randomUUID()
  );
}

app.use((req, res, next) => {
  const cid = getCorrelationId(req);
  req.correlationId = cid;
  res.setHeader('x-request-id', cid);
  const start = Date.now();
  res.on('finish', () => {
    const log = {
      ts: new Date().toISOString(),
      level: 'info',
      message: 'request_completed',
      correlation_id: cid,
      service: 'orders-api',
      method: req.method,
      path: req.path,
      status: res.statusCode,
      duration_ms: Date.now() - start
    };
    console.log(JSON.stringify(log));
  });
  next();
});

app.get('/health', (req, res) => {
  console.log(JSON.stringify({
    ts: new Date().toISOString(),
    level: 'debug',
    message: 'health_check',
    correlation_id: req.correlationId,
    service: 'orders-api'
  }));
  res.json({ ok: true });
});

app.listen(3000);
Example 2: Python (FastAPI) with contextvar
from fastapi import FastAPI, Request, Response
from uuid import uuid4
from contextvars import ContextVar
import json, time, uvicorn

# Holds the correlation ID for the request currently being handled
corr_id: ContextVar[str] = ContextVar('corr_id', default='')

app = FastAPI()

@app.middleware('http')
async def add_correlation_id(request: Request, call_next):
    incoming = (
        request.headers.get('x-request-id') or
        request.headers.get('x-correlation-id')
    )
    cid = incoming or str(uuid4())
    token = corr_id.set(cid)
    start = time.perf_counter()  # monotonic clock, safe for measuring durations
    try:
        response: Response = await call_next(request)
    finally:
        # Always restore the previous value, even if the handler raised
        corr_id.reset(token)
    response.headers['x-request-id'] = cid
    log = {
        'ts': time.strftime('%Y-%m-%dT%H:%M:%SZ', time.gmtime()),
        'level': 'info',
        'message': 'request_completed',
        'correlation_id': cid,
        'service': 'payments-api',
        'method': request.method,
        'path': request.url.path,
        'status': response.status_code,
        'duration_ms': int((time.perf_counter() - start) * 1000)
    }
    print(json.dumps(log))
    return response

@app.get('/charge')
async def charge():
    print(json.dumps({'ts': '...', 'level': 'info', 'message': 'start_charge', 'correlation_id': corr_id.get()}))
    return {'ok': True}

# if __name__ == '__main__':
#     uvicorn.run(app, host='0.0.0.0', port=8000)
Example 3: Go (net/http) propagation to downstream
package main

import (
	"crypto/rand"
	"encoding/hex"
	"encoding/json"
	"log"
	"net/http"
	"time"
)

type Log struct {
	Ts            string `json:"ts"`
	Level         string `json:"level"`
	Message       string `json:"message"`
	CorrelationID string `json:"correlation_id"`
	Service       string `json:"service"`
	Method        string `json:"method"`
	Path          string `json:"path"`
	Status        int    `json:"status"`
	DurationMs    int    `json:"duration_ms"`
}

// getCID reuses an incoming ID or generates 16 random bytes as lowercase hex.
func getCID(r *http.Request) string {
	if v := r.Header.Get("x-request-id"); v != "" {
		return v
	}
	if v := r.Header.Get("x-correlation-id"); v != "" {
		return v
	}
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		// Demo-only fallback if the system random source is unavailable
		return time.Now().UTC().Format("20060102T150405.000000000")
	}
	return hex.EncodeToString(b)
}

func main() {
	log.SetFlags(0) // no timestamp prefix, so every line is pure JSON

	http.HandleFunc("/orders", func(w http.ResponseWriter, r *http.Request) {
		start := time.Now()
		cid := getCID(r)
		w.Header().Set("x-request-id", cid) // echo the ID to the caller

		// Outbound call with propagation
		req, err := http.NewRequestWithContext(r.Context(), http.MethodGet, "http://inventory:8080/stock", nil)
		if err == nil {
			req.Header.Set("x-request-id", cid)
			if resp, err := http.DefaultClient.Do(req); err == nil {
				resp.Body.Close() // release the connection
			}
		}

		w.WriteHeader(http.StatusOK)
		l := Log{
			Ts:            time.Now().UTC().Format(time.RFC3339),
			Level:         "info",
			Message:       "request_completed",
			CorrelationID: cid,
			Service:       "orders-api",
			Method:        r.Method,
			Path:          r.URL.Path,
			Status:        http.StatusOK,
			DurationMs:    int(time.Since(start).Milliseconds()),
		}
		b, _ := json.Marshal(l)
		log.Println(string(b))
	})

	log.Fatal(http.ListenAndServe(":8080", nil))
}
Step-by-step implementation
Step 1 — Decide your ID and header
Choose UUID v4 and 'x-request-id' unless you already use W3C 'traceparent'. Document it for your team.
Step 2 — Ingress middleware
On every request: read incoming ID from accepted headers; if missing, generate; attach to request context; add response header with the same ID.
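For stacks without a framework hook, the same ingress steps can be sketched as a plain WSGI middleware using only the standard library. The class name and the `correlation_id` environ key are hypothetical choices for illustration.

```python
import uuid
from contextvars import ContextVar

# Request-scoped storage that loggers and HTTP clients can read later.
correlation_id: ContextVar[str] = ContextVar("correlation_id", default="")


class CorrelationIdMiddleware:
    """WSGI middleware: read or generate the ID, expose it, echo it back."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # WSGI exposes the x-request-id header as HTTP_X_REQUEST_ID.
        cid = (
            environ.get("HTTP_X_REQUEST_ID")
            or environ.get("HTTP_X_CORRELATION_ID")
            or uuid.uuid4().hex
        )
        environ["correlation_id"] = cid  # hypothetical key for handlers
        token = correlation_id.set(cid)

        def start_with_id(status, headers, exc_info=None):
            # Echo the same ID in the response headers.
            return start_response(status, list(headers) + [("x-request-id", cid)], exc_info)

        try:
            return self.app(environ, start_with_id)
        finally:
            correlation_id.reset(token)
```

One limitation of this sketch: the context variable is reset as soon as the wrapped app returns, so an app that streams its body lazily should read the ID from `environ` instead.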
Step 3 — Outbound propagation
When calling other services, copy the ID into the outgoing headers. Use HTTP client interceptors or wrappers.
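A wrapper around request construction is one simple way to do this. The sketch below uses the standard library's `urllib.request`; the helper name `build_request` is hypothetical, and it assumes the ingress middleware stored the ID in a context variable.

```python
import urllib.request
from contextvars import ContextVar

# Assumed to be set by the ingress middleware for the current request.
correlation_id: ContextVar[str] = ContextVar("correlation_id", default="")


def build_request(url: str, method: str = "GET") -> urllib.request.Request:
    """Create an outbound request that carries the current correlation ID."""
    req = urllib.request.Request(url, method=method)
    cid = correlation_id.get()
    if cid:
        # Header names are case-insensitive on the wire.
        req.add_header("x-request-id", cid)
    return req
```

A call site would then look like `urllib.request.urlopen(build_request("http://inventory:8080/stock"))`. Routing every outbound call through one helper keeps propagation from depending on each developer remembering the header.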
Step 4 — Structured JSON logger
Create a helper that outputs JSON with consistent keys: ts, level, message, correlation_id, service, environment, and context.
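One possible shape for such a helper is a formatter for Python's standard `logging` module. This is a sketch: the class name `JsonFormatter` and the convention of passing safe fields through `extra={"context": {...}}` are assumptions made for illustration.

```python
import json
import logging
from contextvars import ContextVar
from datetime import datetime, timezone

# Assumed to be set by the ingress middleware for the current request.
correlation_id: ContextVar[str] = ContextVar("correlation_id", default="")


class JsonFormatter(logging.Formatter):
    """Render every record as one JSON object with consistent keys."""

    def __init__(self, service: str, environment: str):
        super().__init__()
        self.service = service
        self.environment = environment

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "ts": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname.lower(),
            "message": record.getMessage(),
            "correlation_id": correlation_id.get(),
            "service": self.service,
            "environment": self.environment,
        }
        # Safe context fields, passed as logger.info(..., extra={"context": {...}})
        context = getattr(record, "context", None)
        if context:
            entry["context"] = context
        return json.dumps(entry)
```

Attach it with `handler.setFormatter(JsonFormatter("orders-api", "prod"))`. Because the formatter reads the context variable itself, call sites never pass the ID by hand and cannot forget it.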
Step 5 — Sensitive data policy
Redact secrets and PII. Hash user identifiers if needed. Make this a code review checkbox.
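A redaction step can run just before fields reach the logger. In the sketch below, the key lists and the helper name `redact` are assumptions to adapt to your own data; identifiers are hashed with a keyed HMAC rather than a bare hash, since low-entropy values such as numeric user IDs could otherwise be recovered by enumeration.

```python
import hashlib
import hmac

# Assumed policy: adjust both sets to match your own fields.
DROPPED_KEYS = {"password", "authorization", "token", "secret", "api_key"}
HASHED_KEYS = {"user_id", "email"}


def redact(fields: dict, hash_key: bytes) -> dict:
    """Drop secrets and replace identifiers with a keyed hash."""
    safe = {}
    for key, value in fields.items():
        name = key.lower()
        if name in DROPPED_KEYS:
            continue  # never log secrets, not even hashed
        if name in HASHED_KEYS:
            digest = hmac.new(hash_key, str(value).encode(), hashlib.sha256).hexdigest()
            safe[key + "_hash"] = digest[:16]  # truncated for readability
        else:
            safe[key] = value
    return safe
```

The same input and key always produce the same hash, so you can still group log entries by user without storing the raw identifier.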
Step 6 — Verify end-to-end
Trigger a single request and ensure all participating services print logs with the same correlation_id.
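This check can be automated over collected log lines. The sketch below assumes each line is a JSON object with `correlation_id` and `service` keys, as in the examples in this lesson; the function name is hypothetical.

```python
import json
from typing import Iterable, Set


def services_seen(log_lines: Iterable[str], correlation_id: str) -> Set[str]:
    """Return the set of services that logged the given correlation ID."""
    seen = set()
    for line in log_lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip unstructured lines
        if isinstance(entry, dict) and entry.get("correlation_id") == correlation_id:
            seen.add(entry.get("service", "unknown"))
    return seen
```

Send one request with a known `x-request-id`, gather the logs from every service, and compare the result with the set of services you expected the request to touch. A missing service usually points to a gap in outbound propagation.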
Common mistakes and self-checks
- Regenerating IDs mid-flow: Symptom: multiple IDs for one user action. Self-check: grep your logs; count distinct IDs for one request.
- Unstructured text logs: Symptom: hard-to-parse lines. Self-check: can you parse logs as JSON without errors?
- Missing outbound propagation: Symptom: upstream logs have ID, downstream do not. Self-check: inspect outgoing headers in a test call.
- Logging secrets/PII: Symptom: tokens, emails, or full addresses in logs. Self-check: search for keywords like 'Authorization', '@', 'password'. Implement redaction.
- Too chatty logs in hot paths: Symptom: log volume spikes and costs. Self-check: measure logs/request by endpoint; sample debug.
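For the last item, one way to sample debug logs is to decide per request rather than per line, so a sampled request keeps its complete debug trail. The sketch below hashes the correlation ID into a bucket; the function name and the bucket count are illustrative assumptions.

```python
import zlib


def keep_debug(correlation_id: str, sample_rate: float) -> bool:
    """Decide once per correlation ID whether its debug logs are kept.

    The decision is deterministic, so every service that sees the same ID
    and uses the same rate reaches the same answer.
    """
    bucket = zlib.crc32(correlation_id.encode()) % 10_000
    return bucket < sample_rate * 10_000
```

With `sample_rate=0.05`, roughly one request in twenty keeps its debug logs, and those requests keep them in every service on the path.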
Practical projects
- Build a two-service demo (API -> Worker via queue). Propagate correlation_id from API request to queue message to worker logs.
- Add an HTTP client interceptor library in your stack that auto-injects 'x-request-id' and logs request/response metadata (without bodies).
- Write a small CLI that filters a log file by correlation_id and summarizes timings per hop.
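As a starting point for the first project, queue propagation can be sketched with an in-process `queue.Queue` standing in for a real broker. The envelope layout and the function names are assumptions; a real broker may offer message headers or attributes that serve the same purpose.

```python
import json
import queue
from contextvars import ContextVar

# Assumed to be set by the API's ingress middleware.
correlation_id: ContextVar[str] = ContextVar("correlation_id", default="")


def publish(q: queue.Queue, payload: dict) -> None:
    """Wrap the payload in an envelope that carries the current correlation ID."""
    q.put(json.dumps({"correlation_id": correlation_id.get(), "payload": payload}))


def consume_one(q: queue.Queue, handler) -> None:
    """Restore the correlation ID before running the handler, then clean up."""
    envelope = json.loads(q.get())
    token = correlation_id.set(envelope["correlation_id"])
    try:
        handler(envelope["payload"])
    finally:
        correlation_id.reset(token)
```

The worker does not generate a new ID: it restores the one from the envelope, so its logs join the same chain as the API request that enqueued the message.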
Exercises
Do these now.
- Exercise 1: Add correlation ID middleware to your service. Ensure response echoes 'x-request-id'.
Acceptance criteria
- Incoming or generated ID stored in request context
- Every log line includes 'correlation_id'
- Response header 'x-request-id' equals the logged ID
- Exercise 2: Propagate the ID to a downstream call and confirm both services log the same value.
Acceptance criteria
- Outgoing request sets 'x-request-id'
- Downstream logs include the same 'correlation_id'
- One end-to-end request shows the same ID across both services
Implementation checklist
- [ ] Middleware reads/generates correlation ID
- [ ] Response header echoes the ID
- [ ] Logger outputs consistent JSON schema
- [ ] Outbound HTTP client propagates the ID
- [ ] Redaction policy in place
- [ ] Single request shows same ID across services
Mini challenge
Create a synthetic load test for one endpoint (10 requests). Ensure exactly 10 distinct correlation IDs appear and no request produces more than one ID. Provide a short note on what you fixed if counts do not match.
Learning path
- Start: Structured JSON logging + correlation IDs (this lesson)
- Next: Distributed tracing (trace/span IDs) built on top of correlation
- Then: Metrics for latency/error budgets; log-based alerts
Next steps
- Run the Quick Test below to check your understanding.
- Integrate the middleware into at least one production-like service behind a feature flag.
- Add a log filter in your tooling to search by 'correlation_id'.