Why this matters
Auditing and access logs answer three critical questions: who did what, when, and from where. In Backend & Platform work, they help you:
- Investigate incidents (unauthorized access, data leaks, privilege abuse).
- Prove compliance (e.g., demonstrate controls and accountability).
- Debug safely in production without exposing sensitive data.
- Detect anomalies and trigger alerts based on patterns.
Who this is for
- Backend Engineers implementing secure features and APIs.
- Platform/SRE engineers building logging pipelines.
- Developers preparing for regulated environments (finance, healthcare, enterprise SaaS).
Prerequisites
- Familiarity with HTTP APIs, databases, and identity/auth basics (tokens, sessions, service accounts).
- Comfort with JSON and log aggregation concepts (structured logs).
- Basic understanding of hashing and time synchronization (e.g., NTP).
Concept explained simply
Think of auditing as a tamper-evident diary of security-relevant events: every entry notes who acted, what they did, when, where they came from, and why (if known). Access logs are the broader record of interactions with your system (e.g., every API call). Audit logs are a curated subset focused on sensitive actions (e.g., role changes, data exports, login failures, consent changes).
Mental model
- Camera: Access logs continuously record traffic.
- Notary: Audit logs sign and preserve important security events.
- Map: Correlation IDs connect entries across services for a single request.
What counts as an audit event?
- Authentication attempts and results (success/failure, MFA challenges).
- Authorization decisions (denied/allowed for sensitive resources).
- Privilege changes (role updates, policy edits, API key creation/revocation).
- Data lifecycle events (exports, deletions, retention overrides).
- Configuration changes affecting security (cipher suites, firewall rules, SSO settings).
Key components of trustworthy logs
- Structured format: JSON lines with consistent schema.
- Identity: actor.id, actor.type (user/service), auth_method, auth_scope.
- Time: ISO-8601 UTC timestamp; keep clocks in sync.
- Context: request_id, session_id, ip, user_agent, resource, action, outcome.
- Integrity: append-only storage; optional hash-chaining per stream.
- Privacy: redact secrets and PII; log references, not raw content.
- Retention: keep long enough for investigations/compliance; auto-expire afterward.
- Access control: restrict who can read logs; access to audit logs should itself be audited.
Field naming mini-guide
{
"ts":"2026-01-20T12:34:56.789Z",
"stream":"audit",
"request_id":"1-abc...",
"actor":{ "type":"user", "id":"u_123", "auth_method":"password+mfa" },
"target":{ "type":"role", "id":"admin" },
"action":"role.assign",
"outcome":"success",
"ip":"203.0.113.10",
"user_agent":"Mozilla/5.0 ...",
"reason":"admin_request_456",
"meta":{ "ticket":"INC-42" },
"hash":"...",
"prev_hash":"..."
}
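A schema like the one above tends to stay consistent only when something checks it. The following is a minimal sketch of such a check, not a definitive implementation: the function name `validateAuditEvent`, the list of required fields, and the two-value outcome vocabulary are assumptions made for illustration.

```javascript
// Hypothetical validator for the audit schema sketched above.
const REQUIRED = ["ts", "stream", "actor", "action", "outcome"]; // assumed minimum
const OUTCOMES = new Set(["success", "failure"]); // assumed vocabulary

// Returns a list of problems; an empty list means the event looks well-formed.
function validateAuditEvent(event) {
  const errors = [];
  for (const field of REQUIRED) {
    if (event[field] === undefined) errors.push(`missing ${field}`);
  }
  // Accept only UTC ISO-8601 strings that round-trip through Date unchanged.
  if (event.ts !== undefined) {
    const parsed = new Date(event.ts);
    const ok =
      typeof event.ts === "string" &&
      !Number.isNaN(parsed.getTime()) &&
      parsed.toISOString() === event.ts;
    if (!ok) errors.push("ts must be ISO-8601 UTC with millisecond precision");
  }
  if (event.actor && (!event.actor.type || !event.actor.id)) {
    errors.push("actor needs type and id");
  }
  if (event.outcome !== undefined && !OUTCOMES.has(event.outcome)) {
    errors.push(`unknown outcome ${event.outcome}`);
  }
  return errors;
}
```

Running a check like this in the logging wrapper, before an event is written, catches schema drift early instead of at query time.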
Worked examples
Example 1: Minimal HTTP access log (structured)
{
"ts":"2026-01-20T09:00:02.101Z",
"stream":"access",
"request_id":"req_9f2d",
"method":"POST",
"path":"/v1/payments",
"status":201,
"duration_ms":123,
"ip":"198.51.100.7",
"user_agent":"curl/8.0.1",
"actor":{ "type":"service", "id":"svc_billing" }
}
Tip: avoid logging full request/response bodies. Log sizes, IDs, and outcomes. Redact tokens.
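The tip above can be sketched as a small builder that records sizes and IDs but never bodies, and masks sensitive query parameters. This is an illustration under assumptions: the shape of `req` and `res` (including `req.requestId`, `req.actor`, and a string `req.body`), the helper names, and the list of sensitive parameter names are all hypothetical.

```javascript
// Hypothetical helper: builds an access-log entry without bodies or secrets.
const SENSITIVE_PARAMS = new Set(["token", "api_key", "password"]); // assumed list

function redactQuery(pathWithQuery) {
  // The base URL is only a placeholder so a relative path can be parsed.
  const url = new URL(pathWithQuery, "http://placeholder.invalid");
  for (const key of [...url.searchParams.keys()]) {
    if (SENSITIVE_PARAMS.has(key.toLowerCase())) url.searchParams.set(key, "REDACTED");
  }
  return url.pathname + url.search;
}

function toAccessLogEntry(req, res, durationMs) {
  return {
    ts: new Date().toISOString(),
    stream: "access",
    request_id: req.requestId,
    method: req.method,
    path: redactQuery(req.url),
    status: res.statusCode,
    duration_ms: durationMs,
    request_bytes: Buffer.byteLength(req.body ?? ""), // size only, never the body
    ip: req.ip,
    user_agent: req.headers["user-agent"], // other headers are deliberately dropped
    actor: req.actor,
  };
}
```

Copying fields one by one, rather than spreading the whole request into the entry, means a new header or body field cannot leak into logs by accident.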
Example 2: Audit event for privilege change with hash chain
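// Node.js sketch; an in-memory array stands in for the append-only sink
const { createHash } = require("node:crypto");
// Deterministic JSON: sort object keys recursively before serializing
function canonical(v) {
  if (v === null || typeof v !== "object") return JSON.stringify(v);
  if (Array.isArray(v)) return `[${v.map(canonical).join(",")}]`;
  return `{${Object.keys(v).sort().map((k) => `${JSON.stringify(k)}:${canonical(v[k])}`).join(",")}}`;
}
const log = [];
const prevHash = log.length ? log.at(-1).hash : ""; // empty for the first entry
const entry = {
  ts: new Date().toISOString(), stream: "audit", action: "role.assign",
  actor: { type: "user", id: "u_123" },
  target: { type: "user", id: "u_987" },
  meta: { role: "support" },
  outcome: "success",
  prev_hash: prevHash,
};
// The hash covers the entry's canonical form plus the previous hash
entry.hash = createHash("sha256").update(canonical(entry) + prevHash).digest("hex");
log.push(entry);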
Because each hash covers the previous one, editing or deleting an earlier entry invalidates every hash after it, which makes tampering detectable.
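Detection only happens if something actually re-computes the chain. Below is a minimal verifier sketch. It assumes the writer hashes the canonical entry (including `prev_hash`, excluding `hash`) concatenated with the previous hash, as in the example above. The names `verifyChain`, `hashEntry`, and `append`, and the simplified key-sorting canonical form, are illustrative choices rather than a standard.

```javascript
const { createHash } = require("node:crypto");

// Same canonical form the writer is assumed to use: recursively sorted keys.
function canonical(v) {
  if (v === null || typeof v !== "object") return JSON.stringify(v);
  if (Array.isArray(v)) return `[${v.map(canonical).join(",")}]`;
  const body = Object.keys(v).sort().map((k) => `${JSON.stringify(k)}:${canonical(v[k])}`);
  return `{${body.join(",")}}`;
}

function hashEntry(entryWithoutHash, prevHash) {
  return createHash("sha256").update(canonical(entryWithoutHash) + prevHash).digest("hex");
}

// Returns the index of the first broken entry, or -1 if the chain is intact.
function verifyChain(entries) {
  let prevHash = "";
  for (let i = 0; i < entries.length; i++) {
    const { hash, ...rest } = entries[i];
    if (rest.prev_hash !== prevHash || hash !== hashEntry(rest, prevHash)) return i;
    prevHash = hash;
  }
  return -1;
}

// Tiny in-memory writer, used only to produce sample data for the verifier.
function append(entries, event) {
  const prevHash = entries.length ? entries[entries.length - 1].hash : "";
  const entry = { ...event, prev_hash: prevHash };
  entries.push({ ...entry, hash: hashEntry(entry, prevHash) });
}
```

One caveat worth keeping in mind: a chain stored in a single place can be rewritten end to end by someone with write access, so the latest hash is usually also recorded somewhere the log writer cannot modify.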
Example 3: Authentication failure with safe details
{
"ts":"2026-01-20T10:11:22.333Z",
"stream":"audit",
"action":"auth.login",
"actor":{ "type":"user", "id":"u_unknown" },
"outcome":"failure",
"failure_reason":"bad_credentials",
"ip":"203.0.113.77",
"mfa_challenged":false
}
Do not include the provided password or raw token. Use reason codes.
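One way to make this hard to get wrong is to build failure events through a function that only accepts known reason codes and ignores everything else it is handed. The sketch below assumes a hypothetical `authFailureEvent` helper and an invented set of reason codes; adapt both to your own auth flow.

```javascript
// Reason codes are an assumed vocabulary; extend them to match your auth flow.
const FAILURE_REASONS = new Set(["bad_credentials", "account_locked", "mfa_failed", "token_expired"]);

// Only the named fields are read, so a password passed by mistake is never copied.
function authFailureEvent({ userId, ip, reason, mfaChallenged = false }) {
  return {
    ts: new Date().toISOString(),
    stream: "audit",
    action: "auth.login",
    actor: { type: "user", id: userId ?? "u_unknown" },
    outcome: "failure",
    // Unknown reasons collapse to a generic code instead of leaking free text.
    failure_reason: FAILURE_REASONS.has(reason) ? reason : "other",
    ip,
    mfa_challenged: mfaChallenged,
  };
}
```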
Example 4: Correlation IDs across services
// The gateway assigns correlation_id once; every hop gets its own request_id
const gatewayLog = { request_id: "req_1", correlation_id: "corr_A" };
const serviceALog = { request_id: "req_1A", correlation_id: "corr_A" };
const serviceBLog = { request_id: "req_1B", correlation_id: "corr_A" };
// Query by correlation_id to reconstruct the path end-to-end
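In practice the correlation ID travels between services in a request header. The following sketch simulates three hops in one process. The header name `x-correlation-id` and the helpers `requestContext` and `outgoingHeaders` are assumptions for illustration; any header name your services agree on would work.

```javascript
const { randomUUID } = require("node:crypto");

// Header name is an assumption; any agreed-upon header works.
const CORRELATION_HEADER = "x-correlation-id";

// Each service calls this once per incoming request.
function requestContext(incomingHeaders = {}) {
  return {
    request_id: `req_${randomUUID()}`, // unique per hop
    // Reuse the caller's ID if present, otherwise start a new trace.
    correlation_id: incomingHeaders[CORRELATION_HEADER] ?? `corr_${randomUUID()}`,
  };
}

// Headers to attach when calling the next service.
function outgoingHeaders(ctx) {
  return { [CORRELATION_HEADER]: ctx.correlation_id };
}

// Simulated path: gateway -> service A -> service B.
const gateway = requestContext();
const serviceA = requestContext(outgoingHeaders(gateway));
const serviceB = requestContext(outgoingHeaders(serviceA));
```

If the gateway faces untrusted clients, it is usually safer for it to generate the correlation ID itself than to accept one from the incoming request.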
Checklist: Good vs weak audit log
- Uses consistent JSON schema
- Includes actor, action, target/resource, outcome
- Timestamp in UTC ISO-8601
- Redacts PII/secrets
- Append-only storage with access controls
- Retention and deletion policies defined
- Correlation IDs implemented
Step-by-step: add auditing to a sensitive feature
- List security events: role and API key creation, updates, and deletions; data exports; login attempts.
- Define schema: actor, action, target, outcome, request_id, ip, reason, meta.
- Instrument: Emit structured events where decisions happen (authz checks, config writes).
- Protect: Redact secrets, cap field sizes, avoid bodies.
- Store: Append-only sink; optionally add hash chain per partition/tenant.
- Alert: Rules for unusual activity (e.g., 5 or more failed logins from one IP in 10 minutes).
- Review: Regularly sample and verify integrity; audit who reads audit logs.
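The Instrument and Protect steps above can be combined in one emit function, so that every audit event passes through the same redaction and size cap. This is a sketch under assumptions: the key-name pattern, the 256-character cap, and the names `sanitize` and `emitAudit` are illustrative.

```javascript
// Key names and the size cap are illustrative assumptions.
const SECRET_KEYS = /^(password|token|secret|api_key|authorization)$/i;
const MAX_STRING_LENGTH = 256;

// Recursively redacts secret-looking keys and truncates oversized strings.
function sanitize(value) {
  if (typeof value === "string") {
    return value.length > MAX_STRING_LENGTH
      ? value.slice(0, MAX_STRING_LENGTH) + "...[truncated]"
      : value;
  }
  if (Array.isArray(value)) return value.map(sanitize);
  if (value !== null && typeof value === "object") {
    const out = {};
    for (const [key, inner] of Object.entries(value)) {
      out[key] = SECRET_KEYS.test(key) ? "[REDACTED]" : sanitize(inner);
    }
    return out;
  }
  return value;
}

// Emits one JSON line per event; `sink` is any function that accepts a string.
function emitAudit(event, sink = console.log) {
  sink(JSON.stringify(sanitize({ ts: new Date().toISOString(), stream: "audit", ...event })));
}
```

Matching on key names only catches secrets stored under predictable keys, so it complements, rather than replaces, the habit of not passing raw content to the logger in the first place.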
Privacy and compliance notes
- Log only what you need for security and operations. Prefer IDs over raw content.
- Tag data classification (public/internal/confidential) and handle accordingly.
- Document retention (e.g., access logs 90 days, audit logs 1–3 years, as required).
- Provide export and deletion processes for personal data where legally required, in a way that stays consistent with security requirements.
Practical projects
- Build a small service that emits access logs and audit events with correlation IDs.
- Add a hash-chained append-only audit log writer and a verifier tool.
- Create alert rules: burst of 401s, privilege escalation outside business hours.
- Write a retention/redaction policy and implement field-level redaction middleware.
Common mistakes and how to self-check
- Logging secrets: API keys, tokens, or PII in logs. Self-check: search for patterns like "Authorization:" or credit card regexes.
- Unstructured logs: Hard to query. Self-check: can you filter by action=role.assign in one line?
- Missing correlation: No request_id/correlation_id. Self-check: can you trace a request across services quickly?
- Clock drift: Timestamps out of order. Self-check: monitor NTP and compare with trusted time.
- No retention policy: Logs grow uncontrolled or are purged too soon. Self-check: written policy with automated enforcement.
- Open access to logs: Too many people can read them. Self-check: RBAC for logs and audit of log reads.
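The first self-check above (searching logs for leaked secrets) can be automated with a small scanner. The patterns below are illustrative and far from exhaustive, and the function names are hypothetical. The card-number pattern is paired with a Luhn checksum to reduce false positives from long numeric IDs.

```javascript
// Patterns are illustrative, not exhaustive; tune them for your stack.
const PATTERNS = [
  { name: "authorization_header", regex: /authorization:\s*\S+/gi },
  { name: "bearer_token", regex: /bearer\s+[a-z0-9._~+\/-]+=*/gi },
  { name: "card_number", regex: /\b(?:\d[ -]?){13,19}\b/g },
];

// Luhn checksum: doubles every second digit from the right.
function passesLuhn(digits) {
  let sum = 0;
  for (let i = 0; i < digits.length; i++) {
    let d = Number(digits[digits.length - 1 - i]);
    if (i % 2 === 1) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
  }
  return sum % 10 === 0;
}

// Returns the names of all patterns that match a log line.
function findSecrets(line) {
  const hits = [];
  for (const { name, regex } of PATTERNS) {
    for (const match of line.matchAll(regex)) {
      if (name === "card_number" && !passesLuhn(match[0].replace(/\D/g, ""))) continue;
      hits.push(name);
      break; // one hit per pattern is enough to flag the line
    }
  }
  return hits;
}
```

A scanner like this fits well in CI against sample log output, or as a periodic job over recent logs, with any hit treated as a bug in the emitting service.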
Exercises
Exercise 1: Append-only audit log with integrity
Implement a function that writes audit events to a JSON-lines file with a hash chain. Include fields: ts, stream, actor, action, target, outcome, prev_hash, hash.
- Expected: two consecutive entries where the second uses the first entry's hash as prev_hash.
Hints
- Use SHA-256 over the canonicalized JSON plus prev_hash.
- Sort JSON keys when hashing to keep it deterministic.
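Solution
// Node.js sketch; "audit.log" is a placeholder path
const fs = require("node:fs");
const { createHash } = require("node:crypto");
const FILE = "audit.log";
// Sort keys recursively so the same event always serializes the same way
function canonicalJSON(v) {
  if (v === null || typeof v !== "object") return JSON.stringify(v);
  if (Array.isArray(v)) return `[${v.map(canonicalJSON).join(",")}]`;
  return `{${Object.keys(v).sort().map((k) => `${JSON.stringify(k)}:${canonicalJSON(v[k])}`).join(",")}}`;
}
function readLastLineHash() {
  const lines = fs.existsSync(FILE) ? fs.readFileSync(FILE, "utf8").trim().split("\n") : [];
  return lines.length && lines.at(-1) ? JSON.parse(lines.at(-1)).hash : "";
}
function writeAudit(event) {
  const prev = readLastLineHash(); // empty string for the first entry
  const entry = { ...event, ts: new Date().toISOString(), stream: "audit", prev_hash: prev };
  entry.hash = createHash("sha256").update(canonicalJSON(entry) + prev).digest("hex");
  fs.appendFileSync(FILE, JSON.stringify(entry) + "\n"); // append only, never rewrite
}
writeAudit({ actor:{type:"user",id:"u_1"}, action:"role.assign", target:{type:"role",id:"analyst"}, outcome:"success" })
writeAudit({ actor:{type:"user",id:"u_2"}, action:"key.create", target:{type:"api_key",id:"k_9"}, outcome:"success" })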
Exercise 2: Redaction and retention policy
Draft a short policy for your service. Define which fields must be redacted, retention durations for access vs audit logs, and who can read them.
- Expected: a clear list covering redaction, retention, RBAC, and legal hold.
Hints
- Never log secrets; prefer hashes or last-4 format for identifiers.
- Differentiate access (shorter) and audit (longer) retention.
Solution
- Redact: passwords, tokens, API keys, session IDs, full addresses; store only IDs and reason codes.
- Access logs: 90 days default retention. Audit logs: 2 years. Auto-expire after term.
- Access: SRE and Security team only; access is ticketed and audited.
- Legal hold: suspend expiration for affected records until hold is lifted.
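A written policy is easier to enforce when it is also expressed as code. The sketch below encodes the sample retention terms above (90 days for access logs, 2 years taken as 730 days for audit logs) and a legal hold. Keying holds by actor ID, and the names `isExpired` and `purge`, are assumptions for illustration; real holds may be scoped by tenant, case, or time range.

```javascript
// Durations mirror the sample policy above; treat them as placeholders.
const DAY_MS = 24 * 60 * 60 * 1000;
const RETENTION_DAYS = { access: 90, audit: 730 };

// `legalHolds` is a hypothetical set of actor IDs whose records are under hold.
function isExpired(entry, now, legalHolds = new Set()) {
  if (legalHolds.has(entry.actor?.id)) return false; // a hold suspends expiry
  const days = RETENTION_DAYS[entry.stream];
  if (days === undefined) return false; // unknown stream: keep and review manually
  return now.getTime() - new Date(entry.ts).getTime() > days * DAY_MS;
}

// Returns only the entries that should still be retained.
function purge(entries, now, legalHolds) {
  return entries.filter((entry) => !isExpired(entry, now, legalHolds));
}
```

In a real deployment the same rule would more likely be applied by the storage layer (for example, lifecycle rules per stream), with the hold list checked before any deletion job runs.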
Exercise checklist
- Hash chain implemented correctly
- No sensitive fields present
- Policy includes retention and RBAC
- Outcome and reason codes present
Mini challenge
Design alert rules for suspicious behavior using your audit and access logs. Propose at least three rules with thresholds and actions.
Example answer
- >=5 failed logins from one IP in 10 minutes: block IP for 15 minutes; notify on-call.
- Privilege change outside 08:00–18:00 local time: page security; require change ticket reference.
- Data export over 1 GB in 1 hour by a new user: require manager approval; open investigation.
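The first rule above can be sketched as a sliding-window count over audit events. This is a batch version for illustration; the function name `ipsToBlock` is hypothetical, and a production rule would more likely run in a stream processor or the alerting system's own query language.

```javascript
// Thresholds follow the first example rule above; tune them to your traffic.
const WINDOW_MS = 10 * 60 * 1000;
const MAX_FAILURES = 5;

// Returns the IPs with at least MAX_FAILURES failed logins inside any window.
function ipsToBlock(events) {
  const failuresByIp = new Map();
  const flagged = new Set();
  const sorted = [...events].sort((a, b) => Date.parse(a.ts) - Date.parse(b.ts));
  for (const e of sorted) {
    if (e.action !== "auth.login" || e.outcome !== "failure") continue;
    const t = Date.parse(e.ts);
    // Keep only failures that are still inside the window ending at this event.
    const recent = (failuresByIp.get(e.ip) ?? []).filter((seen) => t - seen < WINDOW_MS);
    recent.push(t);
    failuresByIp.set(e.ip, recent);
    if (recent.length >= MAX_FAILURES) flagged.add(e.ip);
  }
  return [...flagged];
}
```

The block and notification actions from the rule are left out on purpose: keeping detection as a pure function makes the threshold logic easy to test against recorded logs.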
Learning path
- Start: instrument structured access logs for all services.
- Add: audit events for auth, authorization decisions, and privilege changes.
- Secure: implement redaction, retention, RBAC for logs, and integrity checks.
- Scale: correlation IDs, sampling for high-volume access logs, dashboards and alerts.
- Review: periodic integrity verification and access reviews.
Next steps
- Implement a unified logging library wrapper to enforce schema and redaction.
- Create a runbook for incident investigation using logs.