What you'll learn
You will understand core cache strategies (cache-aside, read-through, write-through, write-behind), TTLs and eviction, invalidation patterns, stampede prevention, and how to choose the right approach for a backend feature.
Why this matters
- Speed up user-facing pages (e.g., product pages, user profiles) without overloading your database.
- Stabilize systems under load by avoiding repeated expensive queries.
- Control data freshness and consistency with predictable rules.
Who this is for and prerequisites
- Who: Backend & Platform developers, especially those touching database access and performance.
- Prerequisites: Know HTTP basics, DB queries (SQL or NoSQL), and simple read/write flow in a web service.
Concept explained simply
A cache is a fast, temporary storage that keeps the results of expensive operations so you don't recompute or refetch every time. You trade memory for speed and design rules to keep data reasonably fresh.
Mental model
Think of a small “hot shelf” near a bakery register. Popular items sit on that shelf (cache) so they can be served instantly. The full bakery (database) has everything but takes longer to reach. You decide:
- What goes on the shelf (keys and values).
- How long items stay (TTL and eviction).
- How to restock or remove items (invalidation strategy).
- How to avoid many people emptying the bakery at once (stampede protection).
Core building blocks
1) Placement: where the cache lives
- In-process (memory of the app): Fastest, limited by instance memory, not shared across servers.
- Distributed store (e.g., Redis/Memcached): Shared, scalable, slightly slower than in-process.
- Edge/CDN: Great for static or cacheable HTTP responses.
2) Keys, values, serialization
- Key naming: Use predictable, namespaced keys like product:123:v2.
- Include variables that affect output: locale, currency, filters.
- Serialization: JSON is simple; ensure stable structure for versioning.
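A small helper can keep keys predictable; one possible sketch (the function name and key layout are illustrative, not a standard):

```python
import json

def cache_key(namespace: str, entity_id, version: int, **dims) -> str:
    """Build a namespaced, versioned key; dimensions are sorted so the
    same inputs always produce the same key string."""
    parts = [namespace, str(entity_id), f"v{version}"]
    for name in sorted(dims):
        parts.append(f"{name}:{dims[name]}")
    return ":".join(parts)

key = cache_key("product", 123, 2, locale="en-US", currency="USD")
# -> "product:123:v2:currency:USD:locale:en-US"

# Serialize values with sorted keys for a stable byte layout
value = json.dumps({"name": "Mug", "price": 9.99}, sort_keys=True)
```

Sorting the dimensions matters: without it, the same logical entry could be written under two different keys depending on argument order.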
3) TTL and eviction
- TTL: Time-to-live. How long an item stays valid. Add jitter (random +/-) to prevent synchronized expiry.
- Eviction policies: LRU (least recently used), LFU (least frequently used), FIFO (first in, first out). Choose based on access pattern.
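Jitter can be as simple as adding a random offset to the base TTL; a minimal sketch (the 45-minute base is just an example):

```python
import random

def jittered_ttl(base_seconds: int, jitter_seconds: int) -> int:
    """Return the base TTL plus or minus a random offset so entries
    written at the same moment don't all expire at the same instant."""
    return base_seconds + random.randint(-jitter_seconds, jitter_seconds)

ttl = jittered_ttl(45 * 60, 5 * 60)  # somewhere between 40 and 50 minutes
```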
4) Invalidation and consistency
- Cache-aside: App loads on miss; app invalidates or updates when data changes. Eventual consistency.
- Read-through: Cache fetches from DB on miss automatically via a provider.
- Write-through: Writes go to DB and cache together. Stronger consistency on write.
- Write-behind (write-back): Write to cache first; flush to DB later. Low write latency but risk of loss if cache fails.
5) Stampede (thundering herd) protection
- Request coalescing: One worker refreshes; others wait.
- Mutex/locking on keys during recompute.
- Early refresh: Refresh before TTL fully expires.
- Jitter: Randomize TTL to spread reloads.
- Negative caching: Cache known misses briefly to avoid repeated DB hits.
6) When NOT to cache
- Highly volatile, per-request data where correctness must be immediate (e.g., real-time balances in some domains).
- Large values that exceed network/memory budgets.
- Non-idempotent POST responses that should not be reused.
Worked examples
Example 1: Product details page (read-heavy)
- Strategy: Cache-aside with versioned keys and 30–60 min TTL with jitter.
- Key: product:{id}:v2:locale:{loc}:currency:{cur}
- On update: Invalidate affected keys immediately.
// Pseudocode
key = fmt("product:%s:v2:locale:%s:currency:%s", id, loc, cur)
val = cache.get(key)
if (!val) {
  lock(key)
  val = cache.get(key) // double-check after acquiring the lock
  if (!val) {
    val = db.queryProduct(id, loc, cur)
    cache.set(key, val, ttl=45m±5m) // jittered TTL
  }
  unlock(key)
}
return val
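The same flow as runnable Python, with a dict standing in for the cache and a stub for the database query (both stand-ins are assumptions; a real service would use a client such as redis-py):

```python
import random
import time

cache: dict = {}  # key -> (value, expires_at)

def db_query_product(pid, loc, cur):
    """Stub for the real database call."""
    return {"id": pid, "locale": loc, "currency": cur}

def get_product(pid, loc, cur):
    key = f"product:{pid}:v2:locale:{loc}:currency:{cur}"
    hit = cache.get(key)
    if hit and hit[1] > time.time():
        return hit[0]                          # cache hit, still fresh
    val = db_query_product(pid, loc, cur)      # miss: load from DB
    ttl = 45 * 60 + random.randint(-300, 300)  # 45 min ± 5 min jitter
    cache[key] = (val, time.time() + ttl)
    return val
```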
Example 2: Leaderboard (frequent reads, frequent small writes)
- Strategy: Write-through or write-behind depending on risk tolerance.
- Safer: Write-through (update cache and DB together) so reads are consistent.
- Faster writes: Write-behind with periodic flush; add a durable queue to reduce data-loss risk.
// Write-through approach
updateScore(userId, delta) {
  newScore = db.increment(userId, delta)
  cache.zadd("leaderboard:v1", newScore, userId)
}
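The write-behind variant might buffer increments in a queue and flush them in batches; a minimal sketch with dicts standing in for the cache and database (a production version would use a durable queue and a background worker):

```python
import queue

pending: queue.Queue = queue.Queue()
cache_scores: dict = {}  # stand-in for the leaderboard cache
db_scores: dict = {}     # stand-in for the database

def update_score(user_id, delta):
    """Fast path: update the cache immediately, enqueue the DB write."""
    cache_scores[user_id] = cache_scores.get(user_id, 0) + delta
    pending.put((user_id, delta))

def flush():
    """Periodic worker: drain the queue and apply writes to the DB."""
    while not pending.empty():
        user_id, delta = pending.get()
        db_scores[user_id] = db_scores.get(user_id, 0) + delta
```

The trade-off from the bullets above is visible here: between update_score and flush, the DB lags the cache, and anything still in the queue is lost if the process dies.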
Example 3: Auth token introspection (latency sensitive)
- Strategy: Read-through or cache-aside with short TTL (e.g., 1–5 minutes).
- Negative-cache invalid tokens briefly (e.g., 30–60 seconds).
// Cache-aside with negative caching
key = "token:v1:" + tokenHash
meta = cache.get(key)
if (!meta) {
  meta = authServer.introspect(token)
  if (meta.valid) cache.set(key, meta, ttl=3m±30s)
  else cache.set(key, {valid:false}, ttl=45s) // brief negative cache
}
return meta
Practice exercises
These mirror the graded exercises below. Do them here first, then submit in the Exercises section.
Exercise 1 — Design cache-aside for a product page
Create a brief plan including key format, TTL, invalidation triggers, and stampede protection.
- Key includes product id, locale, currency, and version
- TTL has jitter
- On product update, invalidate related keys
- Use locking or request coalescing on miss
Exercise 2 — Comment counts and trending posts
Pick a strategy and sketch keys and TTLs for:
- Comment count per post (updates often, reads very often)
- Trending posts list (recomputed every few minutes)
- Choose write-through or periodic recompute
- Add short TTL for counts; medium TTL for trending
- Protect against stampede on trending recompute
Common mistakes and self-check
- Mistake: Missing variables in keys (e.g., ignoring locale). Self-check: Do keys change when output would change?
- Mistake: Same TTL for all items. Self-check: Are hot data and cold data treated differently?
- Mistake: No stampede protection. Self-check: What happens when a top key expires under load?
- Mistake: Relying only on TTL without invalidation. Self-check: How fast do updates reflect?
- Mistake: Oversized values. Self-check: Are value sizes logged and bounded?
Practical projects
- Instrument a read-heavy endpoint with cache-aside and measure P50/P95 latency before/after.
- Add per-key mutex and jitter to eliminate a stampede on a popular key.
- Implement write-through for a small leaderboard and compare complexity vs. write-behind.
- Add cache key versioning; perform a zero-downtime schema change by bumping version.
Mini challenge
Your feature: a city weather widget updates every 10 minutes; reads are heavy at the top of the hour. Design:
- Key format, TTL, and jitter
- Stampede protection
- When to pre-warm values
Hint
Use 12–14 minute TTL with +/- 2 min jitter; pre-warm 1–2 minutes before expiry; lock on misses.
Next steps
- Implement one strategy in a demo service and add metrics for hit rate, latency, and error rate.
- Run a load test; tune TTL and eviction policy based on observed patterns.
Learning path
- Before: Query optimization and indexing basics
- This subskill: Caching Strategies Basics
- After: Advanced invalidation, partial caching, tiered caches, CDN strategies