Why this matters
Mental model
Think of a deployment as changing the tires on a moving car, with traffic lanes standing in for environments:
- Rolling: You change one wheel at a time while the car keeps moving.
- Blue-Green: You prepare a second car in a parallel lane and switch lanes instantly.
- Canary: You send a few cars down the new lane before routing everyone there.
- Feature flags: You deploy the code but keep new features behind switches.
Deep dive: Deployment vs Release
Deploy = put code in production. Release = expose functionality to users. Feature flags separate these so you can deploy anytime and release when ready.
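A minimal sketch of how a flag can separate deploy from release, assuming a hypothetical in-memory flag store (FLAGS) and a hypothetical flag name (new_checkout). Real systems usually read flags from a config service; the hashing scheme here is one common choice, not a prescribed one.

```python
import hashlib

# Hypothetical in-memory flag store: flag name -> rollout percentage (0-100).
FLAGS = {"new_checkout": 0}


def bucket(flag: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag: str, user_id: str) -> bool:
    """A user sees the feature when their bucket is below the rollout percentage."""
    return bucket(flag, user_id) < FLAGS.get(flag, 0)


def checkout(user_id: str) -> str:
    # Both code paths are deployed; the flag decides which one is released.
    return "new flow" if is_enabled("new_checkout", user_id) else "old flow"
```

With the percentage at 0 the new code is deployed but dark. Raising it toward 100 releases the feature gradually, and setting it back to 0 turns the feature off without a redeploy.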
Deep dive: Rollback vs Roll forward
- Rollback: revert to the previous known-good version.
- Roll forward: fix quickly and deploy a new version. Use when the fix is trivial and safer than reverting.
Common deployment strategies
- Recreate: Stop the old version, then start the new one. Simple, but causes downtime.
- Rolling: Update a few instances at a time. Minimal disruption.
- Blue-Green: Two identical environments (blue and green). Switch traffic; instant rollback by switching back.
- Canary: Send a small percentage of traffic to the new version. If healthy, increase; otherwise roll back.
- Feature flags: Deploy dark, toggle on gradually.
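To make the rolling strategy concrete, here is a small sketch of the batching logic, assuming hypothetical update and is_healthy callbacks supplied by the caller. Orchestrators such as Kubernetes handle this for you; the sketch only illustrates the idea.

```python
from typing import Callable, Iterator, List


def rolling_batches(instances: List[str], max_unavailable: int) -> Iterator[List[str]]:
    """Yield instances in batches of at most max_unavailable."""
    if max_unavailable < 1:
        raise ValueError("max_unavailable must be at least 1")
    for start in range(0, len(instances), max_unavailable):
        yield instances[start:start + max_unavailable]


def rolling_update(
    instances: List[str],
    max_unavailable: int,
    update: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> bool:
    """Update batch by batch and stop at the first unhealthy batch."""
    for batch in rolling_batches(instances, max_unavailable):
        for instance in batch:
            update(instance)
        if not all(is_healthy(instance) for instance in batch):
            return False  # the caller should roll back the updated instances
    return True
```

Because the loop stops at the first unhealthy batch, a bad version reaches at most the instances updated so far, never the whole fleet.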
Safety checks before and after deploy
- Health probes: readinessProbe prevents traffic to unready pods; livenessProbe restarts stuck pods.
- Automated smoke tests after deploy.
- Metrics gates: error rate, latency, CPU/memory, saturation.
- Database strategy: backward-compatible migrations (expand-then-contract) and ability to roll back or roll forward safely.
- Artifact immutability: each build is versioned and reproducible.
Safe deployment checklist
- Versioned and immutable build artifact created by CI.
- DB migrations are backward-compatible for one release window.
- Health probes set; logs and dashboards ready.
- Canary or blue-green path defined.
- Rollback plan documented and tested.
Worked examples
Example 1: Blue-Green with a simple switch
- Provision two identical stacks: Blue (current) and Green (new).
- Deploy version 2.0.0 to Green.
- Run smoke tests against Green (health endpoints, key API calls).
- Switch traffic from Blue to Green in the load balancer.
- Monitor error rate and latency for 10–15 minutes.
- If issues arise, switch back to Blue instantly.
What this looks like in config
# Pseudocode for a traffic switch step
switch_traffic(target_env: "green")
verify(duration: "15m", metrics: ["5xx_rate", "p95_latency"])
if unhealthy: switch_traffic(target_env: "blue")
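The same flow as a runnable sketch. Router is a hypothetical stand-in for a load balancer, read_metrics is a hypothetical callback that returns current metrics, and the thresholds (1% 5xx rate, 300 ms p95) are illustrative assumptions.

```python
from typing import Callable, Dict


class Router:
    """Stand-in for a load balancer that sends all traffic to one environment."""

    def __init__(self, active: str = "blue") -> None:
        self.active = active

    def switch_traffic(self, target_env: str) -> None:
        self.active = target_env


def is_healthy(metrics: Dict[str, float],
               max_5xx_rate: float = 0.01,
               max_p95_ms: float = 300.0) -> bool:
    """Compare current metrics with the assumed thresholds."""
    return metrics["5xx_rate"] <= max_5xx_rate and metrics["p95_latency"] <= max_p95_ms


def blue_green_cutover(router: Router,
                       read_metrics: Callable[[], Dict[str, float]],
                       checks: int = 3) -> str:
    """Switch to green, verify several times, and switch back on the first bad reading."""
    previous = router.active
    router.switch_traffic("green")
    for _ in range(checks):
        # A real verifier would sleep between checks to cover the monitoring window.
        if not is_healthy(read_metrics()):
            router.switch_traffic(previous)
            break
    return router.active
```

The function returns the environment that ends up serving traffic, so a caller can alert when the result is not "green".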
Example 2: Kubernetes rolling update and rollback
- Apply Deployment with readiness and liveness probes.
- Update container image to a new version.
- Watch rollout and metrics; undo if needed.
kubectl apply -f deployment.yaml
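kubectl set image deploy/myapp api=myrepo/myapp:2.0.0
# --record is deprecated; set the change cause shown in rollout history explicitly
kubectl annotate deploy/myapp kubernetes.io/change-cause="image 2.0.0" --overwrite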
kubectl rollout status deploy/myapp --timeout=5m
# if alerts or errors spike
kubectl rollout undo deploy/myapp
kubectl rollout history deploy/myapp
Sample snippet with probes
containers:
  - name: api
    image: myrepo/myapp:2.0.0
    readinessProbe:
      httpGet: { path: /healthz, port: 8080 }
      initialDelaySeconds: 5
      periodSeconds: 5
    livenessProbe:
      httpGet: { path: /livez, port: 8080 }
      initialDelaySeconds: 15
      periodSeconds: 10
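On the application side, the two probe paths need to answer differently. The sketch below uses Python's standard http.server and a hypothetical STATE dictionary; a real service would check its dependencies rather than flip a boolean.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical application state; real services would inspect dependencies here.
STATE = {"ready": False, "alive": True}


def probe_status(path: str, state: dict) -> int:
    """Return the HTTP status code for a probe path."""
    if path == "/healthz":  # readiness: can this pod take traffic yet?
        return 200 if state["ready"] else 503
    if path == "/livez":  # liveness: is the process stuck and in need of a restart?
        return 200 if state["alive"] else 500
    return 404


class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        self.send_response(probe_status(self.path, STATE))
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Start the probe server; flip STATE["ready"] once warm-up has finished."""
    HTTPServer(("0.0.0.0", port), ProbeHandler).serve_forever()
```

Keeping readiness separate from liveness matters: a pod that is still warming up should be held out of rotation (503 on /healthz) without being restarted, so /livez keeps returning 200.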
Example 3: CI pipeline with canary gate
- Build artifact and push image.
- Deploy to staging; run tests.
- Deploy to prod-canary (10% traffic) and run checks.
- If healthy, ramp to 50% then 100%.
- If unhealthy at any stage, roll back and alert.
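The ramp-and-rollback logic in these steps can be sketched as a small controller. Both set_traffic_percent and gate_passes are hypothetical callbacks; in practice they would call your router and your metrics system.

```python
from typing import Callable, List, Sequence


def run_canary(
    stages: Sequence[int],
    set_traffic_percent: Callable[[int], None],
    gate_passes: Callable[[int], bool],
) -> List[str]:
    """Walk through the traffic stages; roll back to 0% at the first failed gate."""
    log: List[str] = []
    for percent in stages:
        set_traffic_percent(percent)
        log.append(f"canary at {percent}%")
        if not gate_passes(percent):
            set_traffic_percent(0)  # send all traffic back to the old version
            log.append("rolled back")
            return log
    log.append("promoted")
    return log
```

Calling run_canary((10, 50, 100), ...) mirrors the 10% then 50% then 100% ramp described above, and the returned log gives an alerting step something to report.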
Minimal CI fragment
jobs:
  deploy:
    steps:
      - name: Build and push
        run: |
          docker build -t myrepo/myapp:${GIT_SHA} .
          docker push myrepo/myapp:${GIT_SHA}
      - name: Deploy canary (10%)
        run: ./scripts/deploy_canary.sh myrepo/myapp:${GIT_SHA} 10
      - name: Smoke + metrics gate
        run: ./scripts/gate.sh --error-rate-threshold 1 --latency-p95-ms 300
      - name: Ramp to 50% then 100%
        run: ./scripts/ramp.sh 50 && ./scripts/ramp.sh 100
      - name: Rollback if gate fails
        if: failure()
        run: ./scripts/rollback.sh
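The contents of gate.sh are not shown here, so the following is only one plausible reading of what a metrics gate computes. It assumes the error-rate threshold is a percentage and that latency is judged by a nearest-rank p95. The function names are hypothetical.

```python
from typing import Sequence


def error_rate_percent(status_codes: Sequence[int]) -> float:
    """Share of responses with a 5xx status, as a percentage."""
    if not status_codes:
        return 0.0
    errors = sum(1 for code in status_codes if code >= 500)
    return 100.0 * errors / len(status_codes)


def p95(latencies_ms: Sequence[float]) -> float:
    """Nearest-rank 95th percentile."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    rank = (95 * len(ordered) + 99) // 100  # integer ceiling of 0.95 * n
    return ordered[rank - 1]


def gate(status_codes: Sequence[int],
         latencies_ms: Sequence[float],
         error_rate_threshold: float = 1.0,
         latency_p95_ms: float = 300.0) -> bool:
    """Pass only when both the error rate and the p95 latency are within limits."""
    return (error_rate_percent(status_codes) <= error_rate_threshold
            and p95(latencies_ms) <= latency_p95_ms)
```

A CI step would exit non-zero when gate(...) returns False, which is what lets a failure-triggered rollback step run.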
Rollbacks that actually work
- Keep the previous version warm or quickly deployable.
- Use a versioned DB schema with backward-compatible changes. Avoid destructive changes until the old version is fully retired.
- Automate the undo path: one command or playbook with steps and owners.
- Verify post-rollback: health endpoints, dashboards, and user flows.
Expand-Contract DB example
- Expand: add the new column as nullable, so the old app version keeps working against the new schema.
- Deploy app that writes both columns; read from old.
- Migrate data in background.
- Deploy app that reads new column.
- Contract: drop old column in a later release.
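The sequence above can be walked through end to end with an in-memory SQLite database. The table and column names (users, fullname, display_name) are hypothetical, and each numbered comment stands for a separate release in practice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, fullname TEXT)")
conn.execute("INSERT INTO users (fullname) VALUES ('Ada Lovelace')")

# 1. Expand: add the new column as nullable so the old app version still works.
conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")


# 2. Deploy an app version that writes both columns (it still reads the old one).
def create_user(name: str) -> None:
    conn.execute(
        "INSERT INTO users (fullname, display_name) VALUES (?, ?)", (name, name)
    )


create_user("Grace Hopper")

# 3. Backfill rows that were written before the dual-write version shipped.
conn.execute("UPDATE users SET display_name = fullname WHERE display_name IS NULL")


# 4. Deploy an app version that reads the new column.
def list_names() -> list:
    rows = conn.execute("SELECT display_name FROM users ORDER BY id")
    return [row[0] for row in rows]


names = list_names()

# 5. Contract: drop the old column in a later release.
#    DROP COLUMN needs SQLite 3.35 or newer, so this step is guarded.
if sqlite3.sqlite_version_info >= (3, 35, 0):
    conn.execute("ALTER TABLE users DROP COLUMN fullname")
```

Until step 5 runs, the previous app version can still read and write fullname, which is what keeps rollback safe during the transition.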
Exercises
Do these hands-on tasks.
Checklist: Did you cover this?
- Artifact versions and where to fetch previous version.
- Traffic switch or rollout undo command.
- DB backward compatibility or mitigation.
- Monitoring signals and thresholds to trigger rollback.
- Who to page and how to record the incident.
Common mistakes and self-check
- No readiness probe: new pods get traffic before they are ready. Self-check: Does traffic only hit pods after passing readiness?
- Destructive DB migrations with no fallback. Self-check: Can the previous app version run with the new schema?
- Unversioned artifacts: you can’t reproduce the last good build. Self-check: Can you redeploy the exact prior version by ID?
- Skipping canary/metrics: blind deploys. Self-check: What metric would force you to stop? Is it automated?
- Manual, unclear rollback steps. Self-check: Can a new on-call engineer execute the rollback in under 5 minutes?
Practical projects
- Convert a recreate deployment to rolling with probes and verify zero-downtime.
- Implement blue-green for a small service; script the traffic switch and rollback.
- Add feature flags to separate deploy from release for one endpoint.
- Create a one-page rollback playbook and run a tabletop drill.
Who this is for
- Backend Engineers shipping services to staging and production.
- DevOps-minded developers improving reliability and release speed.
Prerequisites
- Basic Git and CI familiarity.
- Containerization basics (Docker images, tags).
- Optional: Kubernetes basics (Deployments, Services, probes).
Learning path
- Start with rolling deployments and health probes.
- Add canary or blue-green with a simple traffic switch.
- Introduce feature flags to decouple deploy and release.
- Automate gates based on metrics and smoke tests.
- Practice rollback and document the playbook.
Next steps
- Automate your current deploy with canary and a rollback script.
- Schedule a monthly rollback drill.
- Add DB expand-contract patterns to your contribution guide.
Mini challenge
You deployed 2.1.0 and error rate rose from 0.3% to 3% within 2 minutes. Outline, in 5 short steps, what you do in the next 10 minutes. Include whether you roll back or roll forward and why.