Why this matters
API Gateways are the front door of your platform. They protect services, shape traffic, and present a consistent API to clients. As an API Engineer, you will design policies for routing, authentication, authorization, rate limiting, caching, and observability—often under production pressure.
- Real tasks: add a new route without breaking mobile apps; throttle abusive clients; roll out a canary safely.
- Real stakes: outages, security breaches, and latency spikes often start at the edge.
Who this is for
- API Engineers and Backend Developers moving into platform/edge responsibilities.
- DevOps/SREs needing to add policies at the edge.
- Tech Leads who must review gateway architectures.
Prerequisites
- HTTP basics: methods, headers, status codes, TLS.
- JSON and REST fundamentals; basic familiarity with gRPC is helpful.
- Cloud/load balancing basics (L4 vs L7) and environment variable configuration.
Concept explained simply
An API Gateway is a smart traffic controller at Layer 7 (application layer). It receives client requests, decides where to send them, applies rules (auth, rate limits, transformations), observes what happened, and responds.
Mental model
Think of the gateway as a set of filters applied in order:
- Edge checks: TLS, IP allow/deny, bot checks.
- AuthN/AuthZ: validate identity (e.g., JWT), check scopes.
- Traffic shaping: rate limit, quota, concurrency caps.
- Routing: choose the upstream (version, canary %).
- Resilience: timeout, retry, circuit breaker.
- Transformation: rewrite paths/headers; JSON/XML mapping if needed.
- Caching: serve or store responses.
- Observability: logs, metrics, traces; correlation IDs.
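To make the ordering concrete, here is a minimal filter-chain sketch in Python. The filter names, request shape, and response dictionaries are purely illustrative, not the API of any real gateway:

```python
from dataclasses import dataclass, field

@dataclass
class Request:
    method: str
    path: str
    headers: dict = field(default_factory=dict)

def check_auth(req):
    # Illustrative check: reject requests with no Authorization header.
    if "Authorization" not in req.headers:
        return {"status": 401, "body": "missing credentials"}
    return None  # None means "continue to the next filter"

def rate_limit(req):
    # Placeholder: a real limiter would consult a shared counter store.
    return None

FILTERS = [check_auth, rate_limit]

def handle(req):
    # Apply filters in order; the first filter that returns a response
    # short-circuits the chain, otherwise the request is routed upstream.
    for f in FILTERS:
        response = f(req)
        if response is not None:
            return response
    return {"status": 200, "body": "routed to upstream"}

print(handle(Request("GET", "/api/products", {"Authorization": "Bearer ..."})))
```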
Core responsibilities of an API Gateway
- Routing and discovery: map public paths to internal services; support weighted routing for canaries.
- Security: TLS termination, JWT/OAuth validation, mTLS to upstreams if required.
- Traffic control: rate limits (per IP/key), quotas (per period), burst handling.
- Resilience: timeouts, retries with backoff, circuit breakers, fallbacks.
- Transformation: path rewrite, header normalization, sometimes protocol translation (HTTP to gRPC).
- Caching: edge caching for GETs; cache invalidation policies.
- Observability: request logs, structured error bodies, metrics (latency, error rate), tracing headers.
- Versioning and deprecation: route /v1 vs /v2, warn clients, sunset headers.
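One way these responsibilities come together in practice is a declarative per-route policy table. The sketch below is illustrative only; the field names do not match any particular gateway product's configuration syntax:

```python
# Hypothetical per-route policy table; field names are illustrative and do not
# correspond to any specific gateway product's configuration format.
ROUTES = [
    {
        "match": {"method": "GET", "prefix": "/api/catalog"},
        "upstream": "http://catalog-svc:8080",
        "rewrite": {"from": "/api/catalog", "to": "/v1/catalog"},
        "timeout_ms": 2000,
        "cache_ttl_s": 30,
    },
    {
        "match": {"method": "POST", "prefix": "/api/orders"},
        "upstream": "http://orders-svc:8080",
        "auth": {"jwt": {"audience": "orders-api", "scope": "orders:create"}},
        "rate_limit": {"key": "tenant_id", "per_minute": 60, "burst": 20},
        "timeout_ms": 2000,
    },
]
```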
Architecture choices and patterns
- Centralized gateway vs micro-gateways: one big entry point vs per-domain gateways for autonomy.
- BFF (Backend For Frontend) vs gateway: a BFF tailors APIs to each client; the gateway remains a generic edge-policy layer.
- Gateway vs Ingress vs Service Mesh: Ingress exposes services at cluster edge; service mesh handles service-to-service; gateway focuses on client-to-edge features.
- Edge auth: prefer zero-trust (validate every request), short-lived tokens, and least privilege scopes.
Worked examples
Example 1: Route and transform a public path to an internal service
Goal: /api/products -> internal service at http://product-svc:8080 with a path rewrite to /v1/items.
- Match: method=GET, path prefix=/api/products
- Rewrite: /api/products to /v1/items
- Headers: add X-Request-ID if missing; forward Authorization
- Timeout: 2s; Retry: 1 attempt on 502/503 with jittered backoff
Expected effect: clients keep using /api/products while the service can evolve its internal path.
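A rough Python sketch of this route; the hostnames, paths, and headers come from the example above, while the function shape is invented and the actual proxy call, timeout, and retry are only described in comments:

```python
import uuid

UPSTREAM = "http://product-svc:8080"

def route_products(method, path, headers):
    # Match: GET requests under the public prefix.
    if method != "GET" or not path.startswith("/api/products"):
        return None  # not handled by this route

    # Rewrite the public prefix to the internal path.
    internal_path = path.replace("/api/products", "/v1/items", 1)

    # Headers: add X-Request-ID if missing, forward Authorization unchanged.
    out_headers = {"X-Request-ID": headers.get("X-Request-ID", str(uuid.uuid4()))}
    if "Authorization" in headers:
        out_headers["Authorization"] = headers["Authorization"]

    # A real proxy would now call UPSTREAM + internal_path with a 2s timeout
    # and retry once on 502/503 using jittered backoff.
    return {"url": UPSTREAM + internal_path, "headers": out_headers}

print(route_products("GET", "/api/products/42", {"Authorization": "Bearer ..."}))
```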
Example 2: Protect an endpoint with JWT and rate limits
Goal: Protect POST /api/orders with JWT validation and tenant-scoped rate limits.
- JWT: verify issuer, audience=orders-api, signature; require scope=orders:create
- Rate limit: key by tenant_id claim; 60 requests/min with burst of 20
- Quota exceeded response: 429 with Retry-After
- Observability: log tenant_id and request_id for audit
Expected effect: abusive tenants cannot starve the system; only authorized scopes can create orders.
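A minimal sketch of both policies in Python. It assumes the JWT signature and issuer were already verified by a proper JWT library, uses the claim names from the example (aud, scope, tenant_id), and models "60 requests/min with a burst of 20" as one plausible token-bucket configuration:

```python
import time

def authorize(claims):
    # Claims checks only; signature/issuer verification is assumed to have
    # happened earlier in a dedicated JWT library.
    if claims.get("aud") != "orders-api":
        return 401
    if "orders:create" not in claims.get("scope", "").split():
        return 403
    return None  # authorized

class TokenBucket:
    """Sustained 60 requests/min with room for a burst of 20 above that."""
    def __init__(self, per_minute=60, burst=20):
        self.capacity = per_minute + burst
        self.tokens = float(self.capacity)
        self.rate = per_minute / 60.0  # tokens refilled per second
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

buckets = {}  # tenant_id -> TokenBucket

def rate_limit(claims):
    bucket = buckets.setdefault(claims["tenant_id"], TokenBucket())
    # On rejection, respond 429 and tell clients when to retry.
    return None if bucket.allow() else (429, {"Retry-After": "1"})
```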
Example 3: Canary release with weighted routing
Goal: Send 5% of traffic to orders-v2 while 95% goes to orders-v1.
- Routing weights: v1=95, v2=5
- Stickiness: key on the user_id header so each user stays on one version rather than flapping between experiences
- Abort conditions: auto rollback if v2's 5xx rate exceeds 2x v1's over 5 minutes
Expected effect: measure real traffic safely; quick rollback if error rates spike.
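A small sketch of sticky weighted routing: hashing the user_id into a fixed bucket keeps each user on one version while the overall split stays near 95/5. The upstream names follow the example; everything else is illustrative:

```python
import hashlib

WEIGHTS = {"orders-v1": 95, "orders-v2": 5}  # must sum to 100

def pick_upstream(user_id):
    # Deterministic bucket in [0, 100): the same user always lands in the
    # same bucket, so their requests stick to one version.
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return "orders-v2" if bucket < WEIGHTS["orders-v2"] else "orders-v1"

# Rough check that about 5% of users land on v2.
sample = [pick_upstream(f"user-{i}") for i in range(10_000)]
print(sample.count("orders-v2") / len(sample))
```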
Example 4: Response caching strategy for catalog
Goal: Cache GET /api/catalog for 30s with cache key normalized.
- Cache key: path + sorted query params excluding utm_*
- TTL: 30s; Stale-While-Revalidate: 60s
- Invalidate: purge on product update event (admin action)
Expected effect: reduce latency and backend load while serving fairly fresh data.
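A Python sketch of the cache-key normalization and the TTL / stale-while-revalidate decision; the in-memory dict stands in for whatever cache store a real gateway would use:

```python
import time
from urllib.parse import parse_qsl, urlencode

TTL_S, STALE_S = 30, 60
cache = {}  # cache_key -> (stored_at, response)

def cache_key(path, query):
    # Sort parameters and drop utm_* so equivalent URLs share one entry.
    params = sorted((k, v) for k, v in parse_qsl(query) if not k.startswith("utm_"))
    return f"{path}?{urlencode(params)}"

def lookup(path, query):
    entry = cache.get(cache_key(path, query))
    if entry is None:
        return "MISS", None
    age = time.time() - entry[0]
    if age <= TTL_S:
        return "HIT", entry[1]
    if age <= TTL_S + STALE_S:
        # Serve stale immediately; a real gateway refreshes in the background.
        return "STALE", entry[1]
    return "EXPIRED", None

def store(path, query, response):
    cache[cache_key(path, query)] = (time.time(), response)

store("/api/catalog", "utm_source=ad&page=1", {"items": []})
print(lookup("/api/catalog", "page=1"))  # normalizes to the same key -> HIT
```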
Designing effective policies
- Timeouts: set lower than upstream timeouts to fail fast (e.g., gateway 2s, upstream 3s).
- Retries: only on idempotent methods (e.g., GET, HEAD); cap attempts; add jitter to backoff.
- Circuit breaker: open when error rate/latency exceeds threshold; provide friendly error body.
- Idempotency keys: for POST endpoints that create resources; deduplicate at gateway or service.
- Header hygiene: remove hop-by-hop headers; standardize correlation IDs (e.g., X-Request-ID).
- Versioning: route /v1 vs /v2; add deprecation headers; plan sunset windows.
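The retry and circuit-breaker points above can be sketched roughly as follows. The thresholds and the consecutive-failure trigger are illustrative; production breakers more often track error rate or latency over a sliding window:

```python
import random
import time

class TransientError(Exception):
    """Stand-in for a 502/503 from the upstream."""

def retry_idempotent(call, attempts=2, base_delay=0.1):
    # Retry only calls known to be idempotent; exponential backoff with jitter
    # keeps clients from retrying in lockstep after a shared failure.
    for attempt in range(attempts):
        try:
            return call()
        except TransientError:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt) * random.uniform(0.5, 1.5))

class CircuitBreaker:
    """Opens after consecutive failures; probes again after a cooldown."""
    def __init__(self, threshold=5, cooldown=30.0):
        self.threshold, self.cooldown = threshold, cooldown
        self.failures, self.opened_at = 0, None

    def allow(self):
        if self.opened_at is None:
            return True
        # Half-open: let a request through once the cooldown has elapsed.
        return time.monotonic() - self.opened_at >= self.cooldown

    def record(self, success):
        if success:
            self.failures, self.opened_at = 0, None
        else:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.monotonic()
```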
Observability from the edge
- Access logs: method, path template (not the raw path, to keep cardinality low), status, latency, upstream chosen, tenant/user, request_id.
- Metrics: p95/p99 latency by route, error rate by upstream, rate-limit hits, cache hit ratio.
- Tracing: propagate and create spans; inject traceparent; tag with route name and policy outcomes.
- Dashboards: route performance; canary comparison (v1 vs v2) side-by-side.
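A sketch of one structured access-log line per request, using the fields listed above; the exact field names are an assumption, not a standard:

```python
import json
import time

def access_log(route, status, started_at, upstream, tenant_id, request_id, cache_status):
    # One structured line per request; log the route template, never the raw
    # path, to keep cardinality low and identifiers out of the logs.
    print(json.dumps({
        "ts": time.time(),
        "route": route,                 # e.g. "/api/products/{id}"
        "status": status,
        "latency_ms": round((time.time() - started_at) * 1000, 1),
        "upstream": upstream,
        "tenant_id": tenant_id,
        "request_id": request_id,
        "cache": cache_status,
    }))

access_log("/api/products/{id}", 200, time.time() - 0.042,
           "product-svc", "tenant-123", "req-abc", "HIT")
```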
Security best practices
- Enforce TLS 1.2+; prefer TLS 1.3 where possible.
- Validate JWTs with strict issuer/audience; rotate signing keys; short token TTLs.
- Principle of least privilege: scopes/claims map to minimal permissions.
- Input size limits: body and header size caps; early reject oversized requests.
- Prevent header injection; normalize and whitelist forwarded headers.
- mTLS to upstreams for sensitive domains.
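A minimal sketch of the input-size and header-hygiene checks; the limit and the allow-list contents are illustrative, not recommendations for any specific service:

```python
MAX_BODY_BYTES = 1_000_000  # reject oversized payloads before reading them fully
FORWARDED_ALLOWLIST = {"authorization", "x-request-id", "accept", "content-type"}

def reject_oversized(content_length):
    # Early rejection based on the declared Content-Length; a real gateway
    # also caps the bytes actually read in case the header lies.
    return content_length is not None and content_length > MAX_BODY_BYTES

def sanitize_headers(headers):
    # Only pass known-safe headers upstream; strip anything a client could
    # use to spoof internal metadata (e.g. X-Internal-User).
    return {k: v for k, v in headers.items() if k.lower() in FORWARDED_ALLOWLIST}
```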
Exercises
Try these hands-on scenarios. Then compare your answers with the solutions below.
Checklist before you start
- You can describe a route, rewrite, and upstream target
- You know how to validate a JWT (issuer, audience, scope)
- You can set a rate limit and choose a key (IP, token, tenant)
- You can explain timeout vs retry vs circuit breaker
Exercise 1: Design a protective edge policy for product creation
This mirrors Exercise ex1 below. Write a route/policy for POST /api/products that:
- Validates JWT with scope products:create and audience products-api
- Limits 30 requests/min per tenant (tenant_id claim), burst 10
- Sets timeout 2s; retry 1 on 502/503; circuit breaker on 20% 5xx over 2 minutes
- Forwards to http://product-svc:8080/v1/products
- Returns JSON error bodies with request_id on failures
Exercise 2: Plan a canary rollout for orders v2
This mirrors Exercise ex2 below. Create a plan to send 10% of traffic to orders-v2 with:
- Weighted routing (v1=90, v2=10) with stickiness by user_id
- Abort rule: rollback if v2 p95 latency > v1 p95 by 50% for 10 minutes
- Dashboards/alerts you would track during rollout
Common mistakes and self-check
- Mistake: Retrying non-idempotent POSTs. Self-check: Are you using idempotency keys or restricting retries to GET?
- Mistake: Overbroad rate limits keyed by IP only. Self-check: Should you key by API key or tenant claim?
- Mistake: Missing timeouts. Self-check: Do you have explicit, route-specific timeouts?
- Mistake: Trusting client-sent user identifiers. Self-check: Are identities derived from validated tokens?
- Mistake: Caching private data. Self-check: Are cache controls respecting privacy and auth?
- Mistake: No correlation IDs. Self-check: Can you trace a single request across the stack?
Practical projects
- Build a gateway route pack: implement three routes (catalog, orders, users) with auth, rate limiting, and per-route timeouts; produce a metrics dashboard.
- Canary pipeline: simulate 1%, 5%, 10% rollouts with automatic rollback criteria and a decision checklist.
- Edge cache lab: cache catalog GETs with purge on update events; measure cache hit ratio improvements.
Salary note: API Engineers often earn strong compensation because of their impact on reliability and security. Figures vary widely by country and company; treat any published numbers as rough ranges.
Learning path
- Master HTTP and TLS basics.
- Learn gateway policies: routing, auth, rate limiting, caching.
- Add resilience: timeouts, retries, circuit breakers.
- Introduce observability: logs, metrics, tracing.
- Practice canary and versioning strategies.
- Automate: templates for routes/policies to ensure consistency.
Next steps
- Do the exercises below and compare with the provided solutions.
- Take the quick test to check understanding. The test is available to everyone; only logged-in users get saved progress.
- Ship a small canary in a sandbox environment and monitor results.
Mini challenge
You must expose /api/recommendations (GET) with strict SLO: p95 < 200ms. Sketch a gateway config that uses caching, a 150ms upstream timeout, and a single retry only if the first attempt took < 50ms. Include headers you would add for observability.
Hint
- Cache for 15s with normalized query parameters.
- Set X-Request-ID; log cache hit/miss; propagate trace headers.
- Guard with circuit breaker on latency spikes.