Why this matters
As a Backend Engineer, you will stitch together multiple services: API gateways calling the User or Order service, the Order service calling Payment and Inventory, and background workers communicating with Reporting or Search. Choosing between HTTP and gRPC and applying reliability patterns directly affects latency, cost, and user experience.
- Real work tasks: design service endpoints, set timeouts and retries, implement idempotency to prevent double charges, secure traffic with TLS/mTLS, and monitor SLAs.
- Common decision: Use JSON/HTTP for broad compatibility or gRPC for fast, strongly-typed RPCs. Get this wrong and you get flaky requests, cascading failures, or slow systems.
Concept explained simply
Inter-service communication is how one microservice asks another to do something or share data. You mainly choose between:
- HTTP/JSON: human-readable, easy to debug, widely supported.
- gRPC (HTTP/2 + Protobuf): compact, fast, schema-first, great for internal service meshes.
Both support timeouts, retries, and security. The real skill is picking the right tool and adding the right guardrails.
Mental model
Think of services like dependable coworkers on a noisy call. You:
- Speak a protocol (HTTP or gRPC).
- Agree on a contract (OpenAPI or Protobuf).
- Set expectations (timeouts, retries, idempotency).
- Authenticate each other (TLS/mTLS, tokens).
- Measure and improve (latency, error rates, budgets).
When to use HTTP vs gRPC
- Choose HTTP/JSON when: public APIs, browser or third-party integrations, simple request/response, easy debugging needed.
- Choose gRPC when: internal microservices, low latency, high throughput, strict schemas, streaming (server/client/bidirectional).
Quick decision checklist
- Consumers include browsers or unknown clients → HTTP
- Only internal services in a controlled stack → gRPC
- Need streaming (events, logs, real-time) → gRPC
- Heavy payloads that benefit from binary encoding → gRPC
- Ad-hoc debugging with curl/Postman → HTTP
Key request and reliability patterns
Synchronous vs asynchronous
- Synchronous (HTTP/gRPC): immediate response; simpler but tightly coupled to availability/latency.
- Asynchronous (queues/events): decoupled; eventual consistency; use when not user-facing or not latency-critical.
Timeouts, retries, backoff, idempotency
- Timeouts: set each client timeout so that all dependent calls, including retries, fit inside the user-facing SLA. Example: about 300 ms per dependent call if the overall budget is 1 s.
- Retries: retry only safe/idempotent operations; use exponential backoff with jitter.
- Idempotency: ensure repeating the same request does not duplicate side effects (e.g., billing). Use an Idempotency-Key and dedup on the server.
- Circuit breaker: after repeated failures, fail fast for a short time to protect the system.
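The timeout guidance above can be made concrete with a small deadline-budget helper. This is a minimal sketch, not a prescribed implementation: the class name `DeadlineBudget` and its methods are hypothetical, and the 1 s budget and 300 ms cap are simply the example numbers from the text.

```python
import time


class DeadlineBudget:
    """Tracks how much of an overall request budget is left (hypothetical helper)."""

    def __init__(self, total_seconds, clock=time.monotonic):
        self._clock = clock
        self._deadline = clock() + total_seconds

    def remaining(self):
        # Never report a negative budget.
        return max(0.0, self._deadline - self._clock())

    def timeout_for(self, per_call_cap):
        """Timeout for the next dependent call: the cap, or whatever budget is left."""
        left = self.remaining()
        if left <= 0.0:
            raise TimeoutError("overall request budget exhausted")
        return min(per_call_cap, left)


if __name__ == "__main__":
    budget = DeadlineBudget(1.0)        # 1 s overall budget
    print(budget.timeout_for(0.3))      # pass this value to the HTTP or gRPC client
```

Deriving each call's timeout from the remaining budget keeps a slow first dependency from pushing later calls past the overall SLA.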
Example values
- Client timeout: 300–500 ms
- Retry policy: up to 2 retries, backoff 100 ms then 300 ms with ±30% jitter
- Breaker: open after 5 consecutive failures; half-open after 10 s
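A circuit breaker with the example values above (open after 5 consecutive failures, half-open after 10 s) might look like the following. It is a single-threaded sketch under stated assumptions: the class and method names are hypothetical, and it does not limit how many trial calls pass while half-open.

```python
import time


class CircuitBreaker:
    """Consecutive-failure breaker: closed -> open -> half-open -> closed."""

    def __init__(self, failure_threshold=5, reset_after=10.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the breaker is closed

    def allow(self):
        """Return True if a call may be attempted right now."""
        if self._opened_at is None:
            return True
        # Half-open: once the cool-down has passed, let trial calls through.
        return self._clock() - self._opened_at >= self.reset_after

    def record_success(self):
        self._failures = 0
        self._opened_at = None

    def record_failure(self):
        self._failures += 1
        if self._failures >= self.failure_threshold:
            # Open (or re-open after a failed trial) and restart the cool-down.
            self._opened_at = self._clock()

    def call(self, fn):
        """Run fn() through the breaker, failing fast while it is open."""
        if not self.allow():
            raise RuntimeError("circuit open: failing fast")
        try:
            result = fn()
        except Exception:
            self.record_failure()
            raise
        self.record_success()
        return result
```

A production breaker would usually add locking and cap the number of half-open trial calls; many teams take this from a library or service mesh rather than writing it by hand.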
Contracts and evolution
- HTTP: define with OpenAPI; version in URL or header; add fields without breaking clients.
- gRPC: define .proto; add fields with new tags; avoid reusing tag numbers; maintain backward compatibility.
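On the HTTP/JSON side, "add fields without breaking clients" only works if clients ignore fields they do not know and tolerate missing optional ones. The sketch below shows one way to do that. The `Charge` dataclass, the `parse_tolerant` helper, and the `"USD"` default are illustrative assumptions; Protobuf parsers already handle unknown fields this way by default.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class Charge:
    order_id: str
    amount: int
    currency: str = "USD"  # later-added optional field; default keeps old senders working


def parse_tolerant(cls, raw_json):
    """Build cls from JSON, ignoring keys this client version does not know about."""
    data = json.loads(raw_json)
    known = {f.name for f in fields(cls)}
    return cls(**{k: v for k, v in data.items() if k in known})


if __name__ == "__main__":
    # A newer server added "note"; this older client still parses the payload.
    raw = '{"order_id": "o_123", "amount": 1999, "currency": "EUR", "note": "gift"}'
    print(parse_tolerant(Charge, raw))
```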
Security basics
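- Encryption: TLS for all HTTP and gRPC traffic; in zero-trust networks, use mTLS on both so each service also verifies the caller's certificate.
- AuthN/AuthZ: short-lived JWT or service tokens; validate signature, expiry, and permissions on every call.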
- Least privilege: limit what each service can call and do.
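To illustrate "validate on every call", here is a standard-library sketch of checking a short-lived HS256-signed JWT. The function names and the shared secret are assumptions made for the example. A real service would normally use a vetted JWT library and often asymmetric keys instead.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: bytes) -> bytes:
    return hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()


def sign_token(claims: dict, secret: bytes) -> str:
    """Issue an HS256 JWT (helper so the sketch can be exercised end to end)."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps(claims).encode())
    return f"{header}.{payload}.{_b64url(_sign(f'{header}.{payload}', secret))}"


def verify_token(token: str, secret: bytes, now=None) -> dict:
    """Return the claims if the signature is valid and the token has not expired."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        sig = _b64url_decode(sig_b64)
    except ValueError as exc:
        raise PermissionError("malformed token") from exc
    if not isinstance(header, dict) or not isinstance(claims, dict):
        raise PermissionError("malformed token")
    if header.get("alg") != "HS256":  # never let the token choose the algorithm
        raise PermissionError("unexpected algorithm")
    if not hmac.compare_digest(sig, _sign(f"{header_b64}.{payload_b64}", secret)):
        raise PermissionError("bad signature")
    if claims.get("exp", 0) <= (time.time() if now is None else now):
        raise PermissionError("token expired")
    return claims
```

The check is cheap enough to run per request. Authorization (what the caller may do) would then be decided from the returned claims.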
Worked examples
Example 1: HTTP call with idempotency and retries (Order → Payment)
Goal: Prevent double charges when network retries happen.
POST /v1/charges
Headers:
Content-Type: application/json
Idempotency-Key: 8f1a2c... (unique per logical charge)
Body:
{
  "order_id": "o_123",
  "amount": 1999,
  "currency": "USD"
}
Server behavior:
- If Idempotency-Key seen before with same body → return same result.
- If key seen with different body → 409 Conflict.
- Persist charge result against the key.
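The server behavior above can be sketched as an in-memory handler. This is an illustration under assumptions: `IdempotentChargeHandler`, `charge_fn`, and the 201 status are hypothetical. A real service would persist keys in a durable store with an atomic insert (for example a unique constraint) so concurrent retries cannot both charge.

```python
import hashlib
import json


class IdempotentChargeHandler:
    """In-memory sketch of Idempotency-Key handling for POST /v1/charges."""

    def __init__(self, charge_fn):
        self._charge_fn = charge_fn  # performs the real side effect, returns a dict
        self._seen = {}              # key -> (body fingerprint, stored response)

    @staticmethod
    def _fingerprint(body):
        # Canonical JSON so key order does not change the fingerprint.
        canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def handle(self, idempotency_key, body):
        """Return (http_status, response_body)."""
        fp = self._fingerprint(body)
        if idempotency_key in self._seen:
            seen_fp, response = self._seen[idempotency_key]
            if seen_fp != fp:
                return 409, {"error": "Idempotency-Key reused with a different body"}
            return response  # replay the stored result; no second charge
        response = (201, self._charge_fn(body))
        self._seen[idempotency_key] = (fp, response)  # persist result against the key
        return response
```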
Client policy:
- Timeout: 400 ms; Retry: up to 2 with backoff.
- On 5xx or network error → retry.
- On 4xx (except 429) → do not retry.
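The client policy above can be sketched independently of any HTTP library. Here `send` is a hypothetical callable that performs one attempt with a 400 ms timeout and the same Idempotency-Key every time. The backoff values (100 ms, then 300 ms, ±30% jitter) come from the example values earlier in this lesson. Honoring Retry-After on 429 is left out for brevity.

```python
import random
import time


def backoff_delays(base=0.1, factor=3.0, retries=2, jitter=0.3, rng=random.random):
    """Yield one delay per retry: nominally 0.1 s, 0.3 s, each scaled by +/-30%."""
    for attempt in range(retries):
        nominal = base * factor ** attempt
        yield nominal * (1.0 + jitter * (2.0 * rng() - 1.0))


def should_retry(status):
    """status is an HTTP status code, or None for a network error or timeout."""
    return status is None or status == 429 or 500 <= status <= 599


def call_with_retries(send, retries=2, sleep=time.sleep, rng=random.random):
    """send() -> (status, body); it should raise OSError on network failure.

    Returns the last (status, body); (None, None) if every attempt failed
    at the network level.
    """
    delays = backoff_delays(retries=retries, rng=rng)
    while True:
        try:
            status, body = send()
        except OSError:  # includes timeouts and connection errors
            status, body = None, None
        if not should_retry(status):
            return status, body
        delay = next(delays, None)
        if delay is None:  # retry budget exhausted
            return status, body
        sleep(delay)
```

Injecting `sleep` and `rng` keeps the policy easy to test without real waiting.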
Example 2: gRPC unary call (Catalog → Pricing)
// pricing.proto
syntax = "proto3";
package pricing.v1;
service PricingService {
  rpc GetPrice(GetPriceRequest) returns (GetPriceResponse) {}
}
message GetPriceRequest {
  string sku = 1;
  string currency = 2; // e.g. "USD"
}
message GetPriceResponse {
  int64 amount_minor = 1; // 1299 means $12.99
}
Client policy:
- Deadline: 200 ms.
- Retry on UNAVAILABLE, DEADLINE_EXCEEDED (idempotent read).
- Propagate correlation-id metadata.
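One way to express this client policy is a gRPC service config, which gRPC clients can apply per method. The sketch below only builds the JSON and the metadata tuple, so it runs without gRPC installed. The backoff numbers and `maxAttempts` are assumptions borrowed from the earlier example values, and how the config is attached to a channel should be checked against your gRPC version.

```python
import json

# Retry and timeout policy for GetPrice, in gRPC service-config form.
PRICING_SERVICE_CONFIG = {
    "methodConfig": [
        {
            "name": [{"service": "pricing.v1.PricingService", "method": "GetPrice"}],
            "timeout": "0.2s",  # the 200 ms deadline from the client policy
            "retryPolicy": {
                "maxAttempts": 3,  # original call plus up to 2 retries (assumed)
                "initialBackoff": "0.1s",
                "maxBackoff": "1s",
                "backoffMultiplier": 3,
                "retryableStatusCodes": ["UNAVAILABLE", "DEADLINE_EXCEEDED"],
            },
        }
    ]
}


def service_config_json():
    """Serialize the config; with grpcio this string is typically passed as the
    ("grpc.service_config", ...) channel option."""
    return json.dumps(PRICING_SERVICE_CONFIG)


def call_metadata(correlation_id):
    """Per-call metadata; gRPC metadata keys must be lowercase."""
    return (("correlation-id", correlation_id),)


if __name__ == "__main__":
    print(service_config_json())
    print(call_metadata("req-123"))
```

Retries share the call's overall deadline, so an attempt that consumes the full 200 ms leaves no time for another. In practice DEADLINE_EXCEEDED is only retried when a server returns it early.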
Example 3: gRPC server streaming (Logs Aggregator)
Goal: Stream logs efficiently from agents to the aggregator. With server streaming, the aggregator acts as the gRPC client and each agent serves Tail.
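// logs.proto (package name and message fields are illustrative assumptions)
syntax = "proto3";
package logs.v1;
service LogStream {
  rpc Tail(TailRequest) returns (stream LogEntry) {}
}
message TailRequest { string filter = 1; string resume_token = 2; }
message LogEntry { string line = 1; string resume_token = 2; }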
Client:
- Sends TailRequest with filters.
- Receives continuous LogEntry messages.
Resilience:
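- A gRPC deadline covers the whole RPC, not each message: give long-lived streams a generous deadline (or none) and detect stalls with keepalives or an application-level idle timeout between messages.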
- Reconnect with backoff and resume token if supported.
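The reconnect-and-resume behavior can be sketched without gRPC by treating the stream as an iterable. Everything here is an assumption made for illustration: `open_stream` stands in for a generated Tail() stub, it yields `(token, entry)` pairs, and it raises `ConnectionError` when the connection drops.

```python
import random
import time


def tail_with_resume(open_stream, handle, max_reconnects=5, base_delay=0.1,
                     max_delay=5.0, sleep=time.sleep, rng=random.random):
    """Consume a server stream, reconnecting from the last seen resume token.

    open_stream(resume_token) returns an iterable of (token, entry) pairs and
    may raise ConnectionError mid-iteration. Returns the last token when the
    server closes the stream cleanly.
    """
    resume_token = None
    failures = 0
    while True:
        try:
            for token, entry in open_stream(resume_token):
                handle(entry)
                resume_token = token  # record progress only after handling
                failures = 0          # a delivered message resets the backoff
            return resume_token
        except ConnectionError:
            failures += 1
            if failures > max_reconnects:
                raise
            # Exponential backoff, capped, with jitter (50-100% of the delay).
            delay = min(max_delay, base_delay * 2 ** (failures - 1))
            sleep(delay * (0.5 + rng() / 2))
```

Recording the token only after `handle` succeeds gives at-least-once delivery: an entry may be handled twice after a crash, but it is not silently skipped.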
Hands-on exercises
Complete these in your preferred language. Pseudocode and CLI examples are fine.
Exercise checklist
- Set explicit timeouts on all client calls.
- Configure safe retries with exponential backoff and jitter.
- Implement idempotency for write operations.
- Add basic request logging with correlation IDs.
- Return clear error codes/messages for clients.
Exercise 1: Resilient HTTP client for Payments
Build a client that calls a Payment service and charges an order exactly once.
- Create a POST /v1/charges request with JSON body and an Idempotency-Key header.
- Set a 400 ms timeout and up to 2 retries with exponential backoff (100 ms, 300 ms) plus ±30% jitter.
- Retry only on 5xx and network errors; do not retry on 4xx except 429.
- On 429, respect Retry-After before retrying.
- Log correlation_id for every attempt.
Expected output:
- When the server returns a 500 once, your client retries and gets a single successful charge.
- When the same Idempotency-Key is used again, the server returns the same result without double-charging.
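- Total latency stays under ~1 s in the single-500 case. The worst case (three 400 ms timeouts plus 100 ms and 300 ms of backoff) is about 1.6 s, so add an overall deadline if 1 s is a hard limit.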
Exercise 2: Define and call a gRPC Pricing service
- Write a pricing.proto with a GetPrice unary RPC as in the example.
- Generate server and client stubs.
- Implement server: return amount_minor for a SKU from an in-memory map.
- Client: call GetPrice with a deadline of 200 ms; retry on UNAVAILABLE up to 2 times.
- Include metadata: correlation-id and auth token; server validates token presence.
Expected output:
- Client receives a valid price within the deadline.
- When you simulate a transient server outage, the client retries and still succeeds.
- Without the token, the server returns UNAUTHENTICATED (PERMISSION_DENIED is for callers who are authenticated but not allowed).
Common mistakes and self-check
- No timeouts on client calls → requests hang and tie up threads. Self-check: confirm every client has a timeout or deadline budget.
- Retrying non-idempotent writes → duplicate side effects. Self-check: require Idempotency-Key for writes; verify server dedupes.
- Over-retrying → thundering herd. Self-check: add jitter; cap retries; implement circuit breaker.
- Mismatched contracts → breaking changes. Self-check: backward-compatible changes only; versioning strategy documented.
- Skipping TLS/mTLS in internal networks → exposure risk. Self-check: confirm certificates and policies in dev/stage/prod.
Mini challenge
Design communication for Checkout → Inventory and Checkout → Payment:
- Choose protocol for each and justify in one sentence.
- List timeout, retries, idempotency, and auth decisions.
- Define the two request payloads or proto messages.
Possible approach
- Checkout → Inventory: gRPC unary read; 150 ms deadline; retry on UNAVAILABLE; no idempotency needed.
- Checkout → Payment: HTTP/JSON write; 400 ms timeout; 2 retries on 5xx; Idempotency-Key required; JWT auth; TLS.
Who this is for
- Backend Engineers building or extending microservices.
- Platform/Infra engineers defining service-to-service standards.
- Full-stack engineers integrating with internal services.
Prerequisites
- Comfortable with HTTP basics and JSON.
- Familiar with one backend language (e.g., Go, Java, Python, Node.js).
- Basic understanding of TLS and authentication tokens.
Learning path
- Review HTTP vs gRPC trade-offs and when to choose each.
- Add timeouts, retries, and idempotency to an existing service call.
- Define a small .proto and make a gRPC call with deadlines.
- Secure calls with TLS/mTLS and tokens.
- Measure and tune latency and error budgets.
Practical projects
- Payment-safe Checkout: HTTP with idempotent charges, retries, and circuit breaker.
- Catalog + Pricing (gRPC): schema-first service with deadlines and retry policies.
- Log Tailer (gRPC streaming): server streaming with reconnection and backoff.
Next steps
- Refactor one service boundary to the most suitable protocol.
- Add observability: log correlation IDs and record latency histograms.
- Run chaos tests: inject failures to verify retries and breakers.