Why this matters
API engineers ship changes often. A solid CI/CD pipeline gives you fast feedback, safe rollouts, and quick rollbacks. On a real team you will run tests and linters on every commit, package and tag API services (often as Docker images), run integration tests against databases or queues, scan for vulnerabilities, and deploy to staging and production using a clear release strategy.
- Reduce outages: automated checks catch issues before deploy.
- Ship faster: small, safe releases rather than risky big-bang deploys.
- Improve quality: consistent steps ensure the same rules for all changes.
Who this is for
- API Engineers and Backend Developers who ship services regularly.
- DevOps/Platform Engineers who support API teams.
- Students building portfolio APIs with automated pipelines.
Prerequisites
- Comfort with Git (branches, commits, PRs).
- Basic Docker knowledge (build/tag/push images).
- Ability to run API tests locally (unit + integration).
- Ability to read basic YAML. Optional: Kubernetes basics for advanced deploys.
Concept explained simply
CI/CD is a conveyor belt for your API code: every change hops on, gets checked, cleaned, packaged, inspected for problems, and then delivered to the right environment.
Mental model
- Gate 1: Fast checks (lint, unit tests). Fail fast.
- Gate 2: Build artifact (e.g., Docker image) with immutable tags.
- Gate 3: Deeper checks (integration tests with real services, security scans).
- Gate 4: Deploy to staging, run smoke tests.
- Gate 5: Promote to production with a safe rollout (blue/green or canary) and metrics-based rollback.
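The gate ordering above can be sketched in a few lines of Python. This is only an illustration of the fail-fast idea, not part of any CI product: `run_gates` and the gate names are hypothetical, and each check stands in for a real command such as a linter or test runner.

```python
from typing import Callable, List, Tuple

# A gate is a name plus a check that returns True on success.
Gate = Tuple[str, Callable[[], bool]]


def run_gates(gates: List[Gate]) -> List[str]:
    """Run gates in order and stop at the first failure.

    Returns the names of the gates that passed, so the caller can see
    how far the change got.
    """
    passed: List[str] = []
    for name, check in gates:
        if not check():
            print(f"gate failed: {name}")
            break  # later (slower, riskier) gates never run
        passed.append(name)
    return passed
```

Putting the cheapest checks first means a broken change is rejected before any image is built or any environment is touched.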
Worked examples
Example 1: GitHub Actions for a Node.js API
name: api-ci-cd
on:
  push:
    branches: [ main ]
  pull_request:
jobs:
  test-build:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:15
        env:
          POSTGRES_PASSWORD: postgres
        ports: ["5432:5432"]
        options: >-
          --health-cmd="pg_isready -U postgres" --health-interval=10s --health-timeout=5s --health-retries=5
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: 18 }
      - name: Install
        run: npm ci
      - name: Lint
        run: npm run lint
      - name: Unit tests
        run: npm test -- --ci
      - name: Build Docker image
        run: |
          docker build -t ghcr.io/org/api:${{ github.sha }} .
      - name: Trivy scan (container)
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: ghcr.io/org/api:${{ github.sha }}
      - name: Login to GHCR
        run: echo $CR_PAT | docker login ghcr.io -u $GITHUB_ACTOR --password-stdin
        env:
          CR_PAT: ${{ secrets.GHCR_TOKEN }}
      - name: Push image
        run: |
          docker tag ghcr.io/org/api:${{ github.sha }} ghcr.io/org/api:build-${{ github.run_number }}
          docker push ghcr.io/org/api --all-tags
  deploy-staging:
    needs: test-build
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    steps:
      - name: Deploy (Kubernetes)
        run: |
          kubectl set image deployment/api api=ghcr.io/org/api:${{ github.sha }} --namespace=staging
          kubectl rollout status deployment/api -n staging --timeout=120s
      - name: Smoke test
        run: curl -fsS https://staging.api.example/healthz

Highlights: services for Postgres, immutable image tags, basic security scan, staging deploy with smoke test.
Example 2: GitLab CI for a Python FastAPI service
stages: [test, build, scan, deploy]
variables:
  DOCKER_DRIVER: overlay2
image: docker:24
services:
  - docker:24-dind
pytest:
  stage: test
  image: python:3.11
  script:
    - pip install -r requirements.txt
    - flake8 .
    - pytest -q
build:
  stage: build
  script:
    - docker build -t registry.example.com/team/fastapi:$CI_COMMIT_SHA .
    - docker push registry.example.com/team/fastapi:$CI_COMMIT_SHA
  only: ["main", "merge_requests"]
scan:
  stage: scan
  image:
    name: aquasec/trivy:latest
    # The image's entrypoint is the trivy binary; clear it so GitLab can run the script in a shell.
    entrypoint: [""]
  script:
    - trivy image --exit-code 1 registry.example.com/team/fastapi:$CI_COMMIT_SHA
deploy_staging:
  stage: deploy
  script:
    - kubectl set image deploy/fastapi fastapi=registry.example.com/team/fastapi:$CI_COMMIT_SHA -n staging
    - kubectl rollout status deploy/fastapi -n staging --timeout=120s
  when: manual
  only: ["main"]

Highlights: separate stages, Docker-in-Docker build, image scanning, manual staging gate.
Example 3: Jenkinsfile for a Java Spring Boot API
pipeline {
  agent any
  environment {
    REGISTRY = 'registry.example.com/team/springapi'
    IMAGE_TAG = "${env.GIT_COMMIT}"
  }
  stages {
    stage('Checkout') { steps { checkout scm } }
    stage('Build & Test') { steps { sh 'mvn -B -DskipTests=false clean verify' } }
    stage('Build Image') { steps { sh 'docker build -t $REGISTRY:$IMAGE_TAG .' } }
    stage('Security Scan') { steps { sh 'trivy image --exit-code 1 $REGISTRY:$IMAGE_TAG' } }
    stage('Push') { steps { sh 'docker push $REGISTRY:$IMAGE_TAG' } }
    stage('Deploy Staging') {
      steps {
        sh 'kubectl set image deploy/springapi springapi=$REGISTRY:$IMAGE_TAG -n staging'
        sh 'kubectl rollout status deploy/springapi -n staging --timeout=120s'
      }
    }
  }
  post {
    always { junit 'target/surefire-reports/*.xml' }
  }
}

Highlights: Maven tests, JUnit reports, container build, scan, deploy, rollout wait.
Hands-on: Build a baseline CI/CD
- Step 1: Fast checks
  - Run the linter and unit tests.
  - Keep this stage to roughly 3–5 minutes so failures surface quickly.
- Step 2: Build & tag
  - Build a Docker image.
  - Tag it with the commit SHA and a human-friendly tag (such as the build number).
- Step 3: Security & quality
  - Dependency audit (e.g., npm audit, pip-audit, OWASP Dependency-Check).
  - Container scan (e.g., Trivy).
- Step 4: Integration tests
  - Run tests against real services (DB, cache, message broker) using service containers or ephemeral test envs.
- Step 5: Deploy to staging
  - Apply manifests or a Helm chart; wait for rollout success.
  - Run smoke tests against /health and a basic API endpoint.
- Step 6: Promote to production
  - Use a manual approval gate or an automated canary with metrics guardrails.
  - Enable instant rollback by recording the previously deployed version.
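As a rough sketch of Step 2, the helper below derives both tags for a single build. The function name `image_refs` and the `build-<number>` format are assumptions modeled on Example 1; adapt them to your own registry conventions.

```python
import re
from typing import List


def image_refs(registry: str, service: str, commit_sha: str, build_number: int) -> List[str]:
    """Return the image references to push for one build.

    The SHA tag is the immutable one that deploys should use; the
    build-number tag is only a human-friendly alias.
    """
    # Reject anything that is not a (possibly abbreviated) hex commit SHA.
    if not re.fullmatch(r"[0-9a-f]{7,40}", commit_sha):
        raise ValueError(f"not a commit SHA: {commit_sha!r}")
    repo = f"{registry}/{service}"
    return [f"{repo}:{commit_sha}", f"{repo}:build-{build_number}"]
```

For example, `image_refs("ghcr.io/org", "api", "a1b2c3d", 42)` yields `ghcr.io/org/api:a1b2c3d` and `ghcr.io/org/api:build-42`. Deploy jobs should reference only the first one.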
Sample environment variables to standardize
- SERVICE_NAME, SERVICE_VERSION
- REGISTRY_URL, IMAGE_TAG
- ENV (dev/staging/prod)
- DB_URL, REDIS_URL (never hardcode secrets)
- HEALTHCHECK_URL
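One way to enforce this list is to validate it at the start of a deploy script. The sketch below is an assumption about how you might do that: `load_config` is a hypothetical helper, and it treats DB_URL and REDIS_URL as optional because not every service needs them.

```python
import os
from typing import Dict, Mapping

REQUIRED = (
    "SERVICE_NAME",
    "SERVICE_VERSION",
    "REGISTRY_URL",
    "IMAGE_TAG",
    "ENV",
    "HEALTHCHECK_URL",
)
ALLOWED_ENVS = {"dev", "staging", "prod"}


def load_config(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Fail early, with one clear message, if a variable is missing or invalid."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing required variables: " + ", ".join(missing))
    if env["ENV"] not in ALLOWED_ENVS:
        raise RuntimeError(f"ENV must be one of {sorted(ALLOWED_ENVS)}, got {env['ENV']!r}")
    # Return only the validated keys; secrets such as DB_URL stay in the environment.
    return {name: env[name] for name in REQUIRED}
```

Failing before any deploy command runs is cheaper than discovering an unset variable halfway through a rollout.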
Security, quality, and database migrations
- Shift-left security: run SAST and dependency scanning in CI.
- Container scan: fail the pipeline on high-severity vulnerabilities when possible.
- Secrets: store in CI secret storage; inject via environment variables or secret mounts.
- DB migrations:
  - Apply backward-compatible changes first (expand), deploy new code, then remove old fields later (contract).
  - Run migrations as a separate job before traffic switching; log and back up.
Safe migration checklist
- Fields added as nullable or with defaults.
- Long-running migrations batched.
- No destructive changes in the same deploy as code that depends on them.
- Rollback plan: how to revert schema safely.
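The expand step and a batched backfill might look like the sketch below. It uses SQLite from the Python standard library purely so it runs anywhere; the `users` table and the `name` and `display_name` columns are hypothetical, and a production migration would normally go through your migration tool against the real database.

```python
import sqlite3


def expand(conn: sqlite3.Connection) -> None:
    """Expand: add the new column as nullable so the old code keeps working."""
    conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")
    conn.commit()


def backfill(conn: sqlite3.Connection, batch_size: int = 1000) -> int:
    """Copy name into display_name in small batches; return rows updated.

    Assumes users.name is NOT NULL, so every row eventually gets a value
    and the loop terminates.
    """
    total = 0
    while True:
        cur = conn.execute(
            "UPDATE users SET display_name = name WHERE id IN ("
            "SELECT id FROM users WHERE display_name IS NULL LIMIT ?)",
            (batch_size,),
        )
        conn.commit()  # commit per batch so no single transaction runs long
        if cur.rowcount == 0:
            return total
        total += cur.rowcount

# Contract (dropping the old column) belongs in a later release, after no
# deployed code reads users.name any more.
```

Both functions are safe to ship ahead of the code that reads `display_name`, which is what makes a rollback of that code harmless.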
Release strategies and rollback
- Blue/Green: two identical environments. Switch traffic when green is healthy. Rollback by switching back.
- Canary: release to a small percentage first, monitor errors and latency, then ramp up.
- Feature flags: decouple deploy from release. Toggle features on gradually.
- Rollback: record the last known-good version and keep a one-command rollback ready (kubectl rollout undo, or redeploy the previous tag).
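A one-command rollback can be wrapped in a small script so nobody has to remember flags during an incident. This is a sketch under assumptions: `rollback_command` and `rollback` are hypothetical helpers, the deployment and namespace names are examples, and the script defaults to a dry run that only prints the command.

```python
import subprocess
from typing import List, Optional


def rollback_command(deployment: str, namespace: str, to_revision: Optional[int] = None) -> List[str]:
    """Build the kubectl command that reverts a Deployment.

    Without to_revision, kubectl rolls back to the previous revision.
    """
    cmd = ["kubectl", "rollout", "undo", f"deployment/{deployment}", "-n", namespace]
    if to_revision is not None:
        cmd.append(f"--to-revision={to_revision}")
    return cmd


def rollback(deployment: str, namespace: str, dry_run: bool = True) -> int:
    """Print the command (dry run) or execute it; return the exit code."""
    cmd = rollback_command(deployment, namespace)
    if dry_run:
        print(" ".join(cmd))
        return 0
    return subprocess.run(cmd, check=False).returncode
```

After a real rollback, follow up with `kubectl rollout status` and a smoke test to confirm recovery.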
Simple canary plan
- Deploy new version as canary with 5% traffic.
- Monitor error rate, p95 latency, and CPU for 10–15 minutes.
- If stable, increase to 50%, then 100%.
- If metrics degrade, rollback immediately.
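The decision at each step of this plan can be expressed as a small function. The sketch below is illustrative only: the thresholds (one percentage point of extra errors, 20% extra p95 latency) are assumptions you should tune, CPU is left out for brevity, and `Metrics` and `next_step` are hypothetical names.

```python
from dataclasses import dataclass

TRAFFIC_STEPS = [5, 50, 100]  # percentages from the plan above


@dataclass
class Metrics:
    error_rate: float      # fraction of failed requests, 0.0 to 1.0
    p95_latency_ms: float


def next_step(
    current: int,
    baseline: Metrics,
    canary: Metrics,
    max_error_delta: float = 0.01,
    max_latency_ratio: float = 1.2,
) -> int:
    """Return the next traffic percentage, or 0 to signal a rollback."""
    degraded = (
        canary.error_rate - baseline.error_rate > max_error_delta
        or canary.p95_latency_ms > baseline.p95_latency_ms * max_latency_ratio
    )
    if degraded:
        return 0
    later = [step for step in TRAFFIC_STEPS if step > current]
    return later[0] if later else current  # stay at 100 once fully rolled out
```

Comparing the canary against the current baseline, rather than against fixed numbers, keeps the gate meaningful when overall traffic is unusually slow or error-prone.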
Exercises
Do these in order. A matching solution is included below each exercise.
Exercise 1 — Minimal CI for an API
Goal: create a minimal pipeline that runs lint, tests, builds a Docker image, and pushes it to a registry with an immutable tag.
- Use a commit SHA tag.
- Fail the pipeline if tests fail.
- Keep total CI time under ~8 minutes if possible.
See the Exercises section below for a full GitHub Actions example (ex1).
Exercise 2 — Add integration tests with a database
Goal: spin up a database service in CI and run integration tests against it.
- Use a health check to wait for DB readiness.
- Seed minimal test data.
- Run tests in parallel to keep CI fast.
See the Exercises section below for a GitHub Actions example with a Postgres service (ex2).
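For the readiness requirement in Exercise 2, a service-container health check (as in Example 1) is usually enough. If your test runner has to wait on its own, a polling helper like the one below is one option. It is a sketch: `wait_for_port` is a hypothetical name, and an open TCP port only shows that the server is accepting connections, which is a weaker signal than a database-level check such as pg_isready.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_s: float = 30.0, interval_s: float = 1.0) -> bool:
    """Poll until host:port accepts a TCP connection or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            # The connection is closed immediately; we only care that it opened.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            time.sleep(interval_s)
    return False
```

Call it before seeding test data, and fail the job with a clear message if it returns False.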
Exercise 3 — Safe staging deploy with smoke test
Goal: deploy to staging, wait for rollout, then hit /healthz and a sample API endpoint.
- Record the deployed image tag.
- Fail if smoke test fails or if rollout times out.
See the Exercises section below for a staging deployment example (ex3).
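The smoke-test part of Exercise 3 could be scripted roughly as follows. This is a sketch using only the standard library: `smoke_test` is a hypothetical helper, and the retry count and delay are assumptions meant to absorb the short window in which new pods are still warming up.

```python
import time
import urllib.error
import urllib.request


def smoke_test(url: str, attempts: int = 5, delay_s: float = 2.0, timeout_s: float = 5.0) -> bool:
    """Return True once the URL answers with a 2xx status, False if it never does."""
    for attempt in range(1, attempts + 1):
        try:
            with urllib.request.urlopen(url, timeout=timeout_s) as resp:
                if 200 <= resp.status < 300:
                    return True
        except (urllib.error.URLError, OSError):
            # Covers HTTP 4xx/5xx (HTTPError), refused connections, and timeouts.
            pass
        if attempt < attempts:
            time.sleep(delay_s)
    return False
```

In a pipeline step, exit with a non-zero status when it returns False so that the job fails and the deploy is not promoted.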
Self-check checklist
- Lint and unit tests run on every push and PR.
- Images are tagged immutably (commit SHA) and pushed before deploy.
- Integration tests use real services (DB/cache/queue) and wait for readiness.
- Security scans run and fail on high-severity issues.
- Staging deploy waits for rollout and runs smoke tests.
- Production deploy is gated (manual approval or canary) with a rollback plan.
Common mistakes
- Skipping fast feedback: not running lint/tests on PRs. Fix: add a PR workflow.
- Relying only on mutable tags such as "latest". Fix: always tag with the commit SHA as well.
- No rollout wait. Fix: use rollout status/health checks and timeouts.
- Unscanned images. Fix: container scan in CI with a failure threshold.
- Mixing destructive DB changes with deploy. Fix: expand/contract migration strategy.
- Secrets in code. Fix: use CI secret storage and environment variables.
Practical projects
- Project A: Pipeline for a Node/Express API with Postgres integration tests and Trivy scan.
- Project B: GitLab CI for a FastAPI service with staged deploy and manual prod promotion.
- Project C: Jenkins pipeline for a Spring Boot API with blue/green deploy and automated smoke tests.
Learning path
- Start: Minimal CI (lint + unit tests) and Docker build.
- Next: Add integration tests with service containers and caching strategies.
- Then: Security scanning (dependencies + container), artifact signing.
- Then: Staging deploy + smoke tests + DB migrations.
- Advanced: Blue/green or canary, metrics-based rollbacks, multi-service orchestration.
Mini challenge
Design a canary rollout plan for your API. Define:
- Traffic steps (percentages and durations).
- Metrics and thresholds to continue or rollback.
- Exact rollback command and how you verify recovery.
Write it as a runbook your teammate could follow.
Next steps
- Automate image signing and verification in CI.
- Add performance smoke tests (quick p95 latency check) during staging.
- Introduce feature flags for risky changes, controlled by config.