Why this matters
In real ML engineering, the model only creates value once it’s serving reliably. Deployment automation lets you ship models and services quickly, safely, and repeatably—without manual steps that break under pressure.
- Push a change to code or model and have it built, tested, and deployed automatically.
- Roll out gradually (canary or blue/green) to reduce risk and roll back fast if metrics degrade.
- Keep environments consistent with containers and infrastructure as code.
- Ship both online inference APIs and batch jobs on predictable schedules.
Concept explained simply
Deployment automation is a conveyor belt from commit to production. Every commit rides the belt through gates: build, test, package, deploy, verify. No hand-holding, just reliable steps.
Mental model
- Source of truth: code + model artifacts (registry) + configs.
- Factory steps: build image, run tests, publish artifact.
- Traffic control: safely route users to new versions.
- Observability: watch health and metrics; auto-stop if needed.
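The conveyor-belt idea can be sketched as a tiny gate runner. This is an illustration only: the gate names and the `run_pipeline` helper are hypothetical, and a real pipeline would shell out to docker, pytest, and kubectl instead of using stub lambdas.

```python
# Minimal sketch of a commit-to-production conveyor belt.
# Each gate is a function returning True (pass) or False (stop the belt).

def run_pipeline(commit_sha, gates):
    """Run gates in order; stop at the first failure and report how far we got."""
    completed = []
    for name, gate in gates:
        if not gate(commit_sha):
            return {"sha": commit_sha, "completed": completed, "failed_at": name}
        completed.append(name)
    return {"sha": commit_sha, "completed": completed, "failed_at": None}

# Illustrative stub gates; real ones would invoke docker/pytest/kubectl.
gates = [
    ("build",   lambda sha: True),
    ("test",    lambda sha: True),
    ("package", lambda sha: True),
    ("deploy",  lambda sha: True),
    ("verify",  lambda sha: len(sha) == 40),  # e.g. sanity-check the full SHA before sign-off
]

result = run_pipeline("a" * 40, gates)
```

The point of the model: a failed gate stops the belt, and you always know exactly which step blocked promotion.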
Tip: Map it to your setup
Write down: where your code lives, where Docker images go, how you deploy (e.g., Kubernetes), and what tests block promotion. That’s your conveyor belt.
Key building blocks
- Versioned artifacts: package models with explicit versions. Store them in a registry or as immutable image tags.
- Containers: the same image runs in dev/staging/prod for environment parity.
- Infrastructure as Code: declarative manifests for services, jobs, secrets, and networks.
- CI runner: executes the pipeline (build, test, deploy).
- Deployment targets: cluster or serverless runtime for APIs and batch jobs.
- Rollout strategies: blue/green, canary, or shadow to reduce risk.
- Secrets management: inject keys and configs securely at deploy time.
- Observability gates: smoke tests, health checks, and simple SLO guards to block bad releases.
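One concrete habit from the list above is deriving image tags from the commit SHA so artifacts stay immutable. A small sketch of that idea; `image_tag` is a hypothetical helper, not part of any real tool:

```python
import re

def image_tag(registry, name, git_sha):
    """Build an immutable image reference like registry.example.com/ml-api:3f2c1ab.
    Rejects anything that is not a git SHA, so mutable tags like 'latest' cannot slip in."""
    if not re.fullmatch(r"[0-9a-f]{7,40}", git_sha):
        raise ValueError(f"expected a git SHA, got {git_sha!r}")
    return f"{registry}/{name}:{git_sha}"

tag = image_tag("registry.example.com", "ml-api", "3f2c1ab")
```

Because the tag is a pure function of the commit, redeploying the same commit always resolves to the same artifact.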
Worked examples
Example 1 — Auto-deploy an online inference API
Goal: On push to main, build a Docker image, deploy to staging, smoke test, then allow manual approval to production.
Minimal pipeline (generic CI syntax)
# .github/workflows/deploy.yml (example syntax, adapt to your CI)
name: deploy-api
on:
  push:
    branches: ["main"]
jobs:
  build-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build image
        run: |
          docker build -t registry.example.com/ml-api:${GITHUB_SHA} .
      - name: Unit tests
        run: |
          pip install -r requirements.txt
          pytest -q
      - name: Push image
        run: |
          echo "$REGISTRY_TOKEN" | docker login registry.example.com -u token --password-stdin
          docker push registry.example.com/ml-api:${GITHUB_SHA}
  deploy-staging:
    needs: build-test
    runs-on: ubuntu-latest
    steps:
      - name: Kube auth
        run: |
          mkdir -p $HOME/.kube
          echo "$KUBE_CONFIG_STAGING" > $HOME/.kube/config
      - name: Update image and apply
        run: |
          kubectl set image deploy/ml-api ml-api=registry.example.com/ml-api:${GITHUB_SHA} -n staging
          kubectl rollout status deploy/ml-api -n staging --timeout=120s
      - name: Smoke test
        run: |
          curl -fsS http://staging.example.local/healthz
  deploy-prod:
    needs: deploy-staging
    runs-on: ubuntu-latest
    environment: production # manual approval gate: require reviewers on this environment in your CI settings
    steps:
      - name: Kube auth
        run: |
          mkdir -p $HOME/.kube
          echo "$KUBE_CONFIG_PROD" > $HOME/.kube/config
      - name: Deploy prod
        run: |
          kubectl set image deploy/ml-api ml-api=registry.example.com/ml-api:${GITHUB_SHA} -n prod
          kubectl rollout status deploy/ml-api -n prod --timeout=180s
      - name: Post-deploy check
        run: |
          curl -fsS http://api.example.com/healthz
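The one-line curl smoke test can fail spuriously while pods are still warming up right after a rollout, so a short retry loop is worth having. A sketch with the HTTP call injected as a parameter so the logic is testable without a live service; `smoke_test` and its signature are illustrative, not a real library API:

```python
import time

def smoke_test(fetch, url, retries=3, delay=0.1):
    """Call fetch(url) up to `retries` times; treat HTTP 200 as healthy.
    `fetch` is injected so the check can be exercised without a cluster
    (in production, pass a urllib- or requests-based function)."""
    for _ in range(retries):
        try:
            if fetch(url) == 200:
                return True
        except OSError:
            pass  # e.g. connection refused while pods are still starting
        time.sleep(delay)
    return False
```

In the pipeline above, a False return would fail the job and block promotion, just like the failed curl does.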
Kubernetes deployment (sketch)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ml-api
  namespace: staging
spec:
  replicas: 2
  selector:
    matchLabels: { app: ml-api }
  template:
    metadata:
      labels: { app: ml-api }
    spec:
      containers:
        - name: ml-api
          image: registry.example.com/ml-api:TAG
          ports: [{ containerPort: 8080 }]
          readinessProbe:
            httpGet: { path: /healthz, port: 8080 }
            initialDelaySeconds: 5
            periodSeconds: 5
          resources:
            requests: { cpu: "200m", memory: "256Mi" }
            limits: { cpu: "1", memory: "512Mi" }
Example 2 — Automate a batch scoring job
Goal: Package a batch job (predict file-to-file) into a container and run nightly.
Cron-style runtime
apiVersion: batch/v1
kind: CronJob
metadata:
  name: batch-scoring
  namespace: prod
spec:
  schedule: "0 2 * * *" # nightly at 02:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: scorer
              image: registry.example.com/batch-scorer:TAG
              args: ["--input", "/data/input.parquet", "--output", "/data/pred.parquet"]
Pipeline step to update CronJob image
kubectl set image cronjob/batch-scoring scorer=registry.example.com/batch-scorer:${GIT_SHA} -n prod
# CronJobs have no rollout status; verify the image was updated instead
kubectl get cronjob batch-scoring -n prod -o jsonpath='{.spec.jobTemplate.spec.template.spec.containers[0].image}'
Example 3 — Blue/Green promotion for low-risk releases
Goal: Deploy v2 alongside v1, switch traffic instantly, and keep v1 ready for rollback.
Two deployments, one service
# v1 deployment has label app: ml-api, version: v1
# v2 deployment has label app: ml-api, version: v2
# Service selects by version label; we switch the selector to promote.
apiVersion: v1
kind: Service
metadata:
  name: ml-api
  namespace: prod
spec:
  selector:
    app: ml-api
    version: v1 # switch to v2 to promote
  ports:
    - port: 80
      targetPort: 8080
Promotion and rollback
# Promote
kubectl patch svc ml-api -n prod -p '{"spec":{"selector":{"app":"ml-api","version":"v2"}}}'
# Rollback (instant)
kubectl patch svc ml-api -n prod -p '{"spec":{"selector":{"app":"ml-api","version":"v1"}}}'
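The two patch commands are easy to wrap in a small script so promotion and rollback each stay one call. A sketch that only builds the kubectl argument list; `promote` and `rollback` are hypothetical helpers, and in real use you would pass the result to `subprocess.run(..., check=True)` against a live cluster:

```python
import json

def switch_selector_cmd(service, namespace, version):
    """Build the kubectl patch command that points the Service at one color."""
    patch = {"spec": {"selector": {"app": service, "version": version}}}
    return ["kubectl", "patch", "svc", service, "-n", namespace, "-p", json.dumps(patch)]

def promote(service="ml-api", namespace="prod"):
    return switch_selector_cmd(service, namespace, "v2")

def rollback(service="ml-api", namespace="prod"):
    return switch_selector_cmd(service, namespace, "v1")

# Real use (assumes kubectl and cluster access): subprocess.run(promote(), check=True)
```

Separating command construction from execution keeps the promotion logic unit-testable without a cluster.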
Step-by-step: build your first automated deployment
- Prepare repo: app/ with API or batch code, tests/ with unit tests, a Dockerfile, k8s/ manifests, and a CI pipeline file.
- Containerize:
  FROM python:3.11-slim
  WORKDIR /app
  COPY requirements.txt .
  RUN pip install -r requirements.txt
  COPY . .
  CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8080"]
- Add tests:
  def test_predict_shape():
      from app.model import predict
      assert predict([[1, 2, 3]]).shape == (1,)
- Write CI pipeline: build, test, push image; fail fast on test errors.
- Deploy to staging: apply manifests; wait for rollout; smoke test /healthz.
- Manual approval gate: require a human click to deploy to prod.
- Post-deploy checks: hit /healthz, verify logs and readiness status. Consider automated rollback if checks fail.
Exercises
- ex1: Write a CI pipeline that builds an image, deploys to staging, and runs a smoke test. See details below.
- ex2: Implement blue/green in Kubernetes using two deployments and switch traffic by updating the Service selector.
Exercise checklist
- Pipeline builds deterministically with versioned image tags.
- Deploy waits for readiness and runs a smoke test.
- No hard-coded secrets in code or manifests.
- Blue and green are independently scalable.
- Rollback is a single command.
Common mistakes and self-check
- Skipping health checks: Add readiness/liveness probes; verify rollouts wait for readiness.
- Mutable tags: Avoid latest; use commit SHA or a version string.
- Manual config drift: Keep manifests in version control; do not edit live resources by hand.
- Secrets in repo: Inject via CI secrets or your platform’s secret store.
- No smoke tests: a simple GET /healthz catches many basic issues before users do.
- One-step prod deploys: Use staging + approval, or a controlled rollout.
Self-check mini audit
- Can you redeploy the same commit and get the same result?
- Can you promote/rollback in under 1 minute?
- Is the current production image tag findable in your CI logs?
Practical projects
- API: Containerize a FastAPI model server, deploy with blue/green, add a one-minute rollback script.
- Batch: Nightly feature generation + scoring CronJob with success/failure alerts.
- Shadow traffic: Send a copy of requests to a new model version and compare metrics offline before promotion.
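For the shadow-traffic project, the offline comparison can start as simple as an agreement rate between the live model's logged predictions and the shadow model's. An illustrative sketch; `agreement_rate` and its tolerance default are assumptions, not a standard metric API:

```python
def agreement_rate(live_preds, shadow_preds, tol=1e-6):
    """Fraction of requests where live and shadow predictions match within tol.
    A low rate flags the candidate model for review before promotion."""
    if len(live_preds) != len(shadow_preds):
        raise ValueError("prediction logs must be aligned per request")
    if not live_preds:
        return 1.0  # nothing disagreed
    matches = sum(abs(a - b) <= tol for a, b in zip(live_preds, shadow_preds))
    return matches / len(live_preds)
```

For classifiers you would compare labels instead of numeric outputs, but the gating idea is the same: promote only above an agreed threshold.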
Who this is for
- Machine Learning Engineers owning model serving and reliability.
- Data Scientists promoting models to production with minimal ops.
- MLOps/Platform Engineers building paved roads for teams.
Prerequisites
- Basic containerization (Docker) and command-line familiarity.
- Comfort with writing unit tests and simple HTTP endpoints.
- Understanding of Kubernetes or your chosen deployment runtime.
Learning path
- Start: containerize your app and add health endpoints.
- Add CI: build, test, push images on every commit.
- Add CD: deploy to staging with smoke tests; add manual prod approval.
- Introduce blue/green or canary; practice rollbacks.
- Automate post-deploy checks and metric guards.
Next steps
- Integrate automated rollback triggers on failed smoke tests.
- Add basic SLOs for latency and error rate.
- Automate migration tasks (e.g., feature store backfills) as pre-deploy hooks.
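The "basic SLOs" item can start as a pure function over a window of recent requests. A sketch under stated assumptions: the thresholds below are illustrative placeholders, and the p95 here is a simple nearest-rank estimate over the window, not a streaming quantile:

```python
def slo_guard(latencies_ms, errors, total, p95_target_ms=300.0, max_error_rate=0.01):
    """Return (healthy, reasons); intended to gate promotion or trigger rollback."""
    reasons = []
    if total > 0 and errors / total > max_error_rate:
        reasons.append(f"error rate {errors / total:.2%} above {max_error_rate:.2%}")
    if latencies_ms:
        ordered = sorted(latencies_ms)
        p95 = ordered[min(len(ordered) - 1, int(0.95 * len(ordered)))]  # nearest-rank p95
        if p95 > p95_target_ms:
            reasons.append(f"p95 latency {p95:.0f}ms above {p95_target_ms:.0f}ms")
    return (not reasons, reasons)
```

Wired into the pipeline, a False result after deploy would trigger the same rollback path as a failed smoke test.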
Mini challenge
Pick an existing model service. Implement blue/green and demonstrate: deploy v2, run smoke tests, switch traffic, confirm health, and roll back. Capture the commands you used as a runbook.