Why this matters
Running your API tests automatically on every change catches regressions before they reach main, keeps feedback fast, and gives reviewers a trustworthy signal about whether a change is safe to merge.
Mental model
Think of CI as a factory conveyor belt:
- Trigger: A change enters the belt.
- Build: Prepare tools and dependencies.
- Test stages: Quick checks first (smoke), deeper checks next (integration/contract), optional stress (performance).
- Artifacts: Box up results (reports, logs) for inspection.
- Gate: If any critical station fails, the belt stops.
Core building blocks
- Triggers: push, pull_request/merge_request, tags/releases, and schedules.
- Isolation: ephemeral runners/containers, clean workspace.
- Secrets: injected as environment variables, never printed.
- Caching: speeds up installs (e.g., npm/pip caches).
- Parallelization: shard tests across executors to reduce time.
- Artifacts: JUnit/HTML reports, coverage, logs.
- Selective runs: only run API tests when relevant files change (see the trigger sketch after this list).
- Quality gates: fail on test failures, low coverage, or failed contracts.
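The sketch below shows how several of these blocks combine in a single GitHub Actions trigger: path filters for selective runs, a push trigger for main, and a schedule for heavier suites. The paths and cron value are illustrative assumptions.
# Sketch only: adjust paths and schedule to your repository layout.
on:
  pull_request:
    paths:
      - "api/**"                     # selective runs: skip docs-only changes
  push:
    branches: [ main ]
  schedule:
    - cron: "0 3 * * *"              # nightly slot for slower suites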
Worked examples
Example 1 — GitHub Actions: pytest API tests with parallel matrix
Workflow file: .github/workflows/api-tests.yml
name: API Tests
on:
  pull_request:
    paths:
      - "api/**"
      - ".github/workflows/api-tests.yml"
  push:
    branches: [ main ]
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        python-version: ["3.10", "3.11"]
        shard: [0, 1]                # shard ids for pytest-shard (0-indexed)
    env:
      PYTHONDONTWRITEBYTECODE: 1
      API_BASE_URL: http://localhost:8080
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
      - name: Cache pip
        uses: actions/cache@v4
        with:
          path: ~/.cache/pip
          key: pip-${{ runner.os }}-${{ matrix.python-version }}-${{ hashFiles('**/requirements*.txt') }}
      - name: Install deps
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
      - name: Start API (compose)
        run: |
          docker compose -f docker-compose.ci.yml up -d --build api
      - name: Wait for API
        run: |
          for i in {1..30}; do
            if curl -fsS "$API_BASE_URL/health"; then exit 0; fi
            sleep 2
          done
          echo "API did not become healthy"; exit 1
      - name: Run tests (sharded)
        run: |
          mkdir -p reports
          pytest -m "not slow" \
            --junitxml=reports/junit-${{ matrix.shard }}.xml \
            --cov=src --cov-report=xml:reports/coverage-${{ matrix.shard }}.xml \
            --maxfail=1 \
            --num-shards 2 --shard-id ${{ matrix.shard }}
      - name: Upload reports
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: api-test-reports-${{ matrix.python-version }}-${{ matrix.shard }}
          path: reports/**
Notes:
- The matrix splits runs by Python version and test shard; sharding assumes the pytest-shard plugin (--num-shards/--shard-id) and coverage assumes pytest-cov, both listed in requirements.txt.
- Compose starts a local API; the health-check loop ensures readiness before tests run.
- JUnit and coverage reports are saved as artifacts for inspection.
Example 2 — GitLab CI: Jest + supertest with service DB
Pipeline file: .gitlab-ci.yml
image: node:20
stages: [install, test]
variables:
  NODE_ENV: test
  API_BASE_URL: http://localhost:8080
  # The postgres image requires credentials; adjust these placeholders to match your app config.
  POSTGRES_DB: test
  POSTGRES_USER: test
  POSTGRES_PASSWORD: test
services:
  - name: postgres:15
    alias: db
    # Your app will connect to 'db' as host
cache:
  key: npm-${CI_COMMIT_REF_SLUG}
  paths:
    - node_modules/
install:
  stage: install
  script:
    - npm ci
  artifacts:
    paths:
      - node_modules/
api_tests:
  stage: test
  needs: [install]
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
      changes:
        - api/**
        - package*.json
        - .gitlab-ci.yml
    - if: '$CI_COMMIT_BRANCH == "main"'
  before_script:
    - npm run db:migrate
    - npm run start:ci &
    - npx wait-on "$API_BASE_URL/health"
  script:
    - npx jest --ci --runInBand --reporters=default --reporters=jest-junit
  variables:
    JEST_JUNIT_OUTPUT_DIR: reports
    JEST_JUNIT_OUTPUT_NAME: junit.xml
  artifacts:
    when: always
    paths:
      - reports/
      - logs/
Notes:
- Runs on merge requests only when API-related files change; always on main.
- Postgres runs as a service container (the credentials above are placeholders); the app starts in the background.
- Jest outputs JUnit for CI to display.
Example 3 — Jenkins: Parallel smoke + contract using Newman and Node
Jenkinsfile
pipeline {
    agent any
    options { timestamps() }
    stages {
        stage('Checkout') {
            steps { checkout scm }
        }
        stage('Install') {
            steps {
                sh 'npm ci'
                sh 'npm install -g newman'
            }
        }
        stage('Start API') {
            steps {
                sh 'docker compose -f docker-compose.ci.yml up -d --build api'
                sh 'for i in $(seq 1 30); do curl -fsS http://localhost:8080/health && exit 0 || sleep 2; done; exit 1'
            }
        }
        stage('Tests') {
            parallel {
                stage('Smoke') {
                    steps {
                        sh 'newman run postman/collection.json -e postman/env.ci.json --reporters junit --reporter-junit-export reports/newman-smoke.xml'
                    }
                }
                stage('Contract') {
                    steps {
                        sh 'npm run test:contract -- --reporter=junit --reporter-options output=reports/contract.xml'
                    }
                }
            }
        }
    }
    post {
        always {
            junit 'reports/**/*.xml'
            archiveArtifacts artifacts: 'reports/**, logs/**', fingerprint: true
            sh 'docker compose -f docker-compose.ci.yml down -v || true'
        }
    }
}
Notes:
- Parallel stages shorten feedback time.
- All reports are published even if a stage fails.
- Compose is shut down in the post block to clean up resources.
Designing a robust workflow
- Stage order: lint → unit → build → deploy to an ephemeral environment → smoke → integration → contract → (optional) performance.
- Fast fail: run quick smoke tests early on pull/merge requests.
- Selective runs: use path filters to avoid running API tests on docs-only changes.
- Parallelization: shard test suites by tags or file patterns.
- Quarantine flaky tests: run @flaky in a non-blocking job while you fix them (see the sketch after this list).
- Quality gates: minimum coverage (e.g., 80%), contract verification must pass, error budgets for performance (p95 latency).
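As a sketch of the quarantine and quality-gate points, the GitHub Actions fragment below keeps the coverage gate blocking while flaky tests run in a non-blocking job. The flaky marker, the 80% threshold, and a requirements.txt that installs pytest and pytest-cov are assumptions.
jobs:
  critical-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt
      # Blocking gate: fails the job (and the merge) if coverage drops below 80%.
      - run: pytest -m "not flaky" --cov=src --cov-fail-under=80
  flaky-quarantine:
    runs-on: ubuntu-latest
    continue-on-error: true          # results stay visible but never block the merge
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt
      - run: mkdir -p reports && pytest -m "flaky" --junitxml=reports/flaky.xml
      - uses: actions/upload-artifact@v4
        if: always()
        with:
          name: flaky-quarantine-report
          path: reports/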
Environments, data, and secrets
- Secrets: store tokens/keys in CI secret storage; reference them via environment variables; avoid echoing values in logs (see the step sketch after this list).
- Ephemeral test data: use factories/fixtures; clean up after tests or use disposable databases.
- Configuration: pass base URLs and credentials via env; never hardcode.
- Health checks: gate test start on service readiness.
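A single-step sketch of these rules in GitHub Actions; the secret name API_TOKEN, the staging URL, and the /health endpoint are placeholders, not values from this lesson.
      - name: Run API tests against a ready service
        env:
          API_BASE_URL: https://staging.example.com    # configuration via env, not hardcoded
          API_TOKEN: ${{ secrets.API_TOKEN }}          # injected from secret storage; masked if echoed
        run: |
          # Gate test start on service readiness.
          for i in {1..30}; do
            curl -fsS "$API_BASE_URL/health" > /dev/null && break
            if [ "$i" -eq 30 ]; then echo "API not ready"; exit 1; fi
            sleep 2
          done
          pytest api/tests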
Observability and artifacts
- JUnit XML: lets CI show per-test results inline.
- Coverage reports: publish XML/HTML; enforce thresholds.
- Logs and traces: archive app logs for failed runs and keep them for a reasonable retention period (see the upload sketch after this list).
- Flake tracking: tag flaky tests; record failure rates to prioritize fixes.
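In GitHub Actions, the publishing side of this usually reduces to one upload step, sketched below; the paths and the 14-day retention are placeholder choices.
      - name: Upload test evidence
        if: always()                  # publish reports and logs even when tests fail
        uses: actions/upload-artifact@v4
        with:
          name: api-test-evidence
          path: |
            reports/
            logs/
          retention-days: 14          # long enough to investigate flaky failures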
Exercises
Complete the tasks below, then compare with the solutions and finish with the Quick Test at the end.
- Exercise 1: Add a GitHub Actions workflow that runs smoke tests on pull requests only when files under api/ change. Save JUnit as artifact and fail if coverage < 80%.
- Exercise 2: Introduce a flaky test quarantine. Critical tests must block merges; flaky tests run in a separate, non-blocking job but still publish results.
Checklist before you move on
- Pipeline triggers correctly on PRs and main branch.
- API waits for readiness before tests.
- Reports (JUnit, coverage) appear as artifacts.
- Selective runs work (no run on non-API changes).
- Coverage gate and contract tests block merges when failing.
- Flaky tests are isolated from critical signal.
Common mistakes and self-check
- Running tests before the API is ready. Self-check: add a health endpoint wait loop.
- Printing secrets in logs. Self-check: search logs for keys/tokens; mask variables.
- Huge runtimes due to no caching. Self-check: verify cache hit rates between runs.
- No artifacts on failure. Self-check: ensure report uploads run unconditionally, e.g., if: always() in GitHub Actions, when: always in GitLab artifacts, or a post { always } block in Jenkins.
- Flaky tests blocking merges. Self-check: tag and quarantine; monitor separately.
- Unreliable environment. Self-check: use clean containers and deterministic seed data.
Practical projects
- Project 1: Convert local API tests to run on PRs with GitHub Actions. Acceptance: JUnit and coverage artifacts for each PR; run time under 10 minutes.
- Project 2: Add contract testing stage. Acceptance: contract failures block merges; reports are archived.
- Project 3: Introduce test sharding and caching. Acceptance: pipeline duration reduced by at least 40% vs. baseline.
Learning path
- Start: add smoke tests to PRs.
- Then: add integration tests with service containers.
- Next: publish artifacts and enforce coverage gates.
- Then: parallelize and add selective runs.
- Advanced: contract and performance stages with budgets.
Next steps
- Implement the exercises in a sandbox repo.
- Add path-based filters to cut unnecessary runs.
- Run the Quick Test below to confirm your understanding.
Mini challenge
Your API tests take 25 minutes on PRs. Reduce that to under 8 minutes without losing coverage. Hint: shard tests, run smoke first, cache dependencies, and skip heavy suites on PRs while keeping them on main/nightly schedules.
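One possible shape for that split, sketched as a GitHub Actions workflow; the slow marker, the pytest-shard flags, and the cron schedule are assumptions to adapt to your suite.
on:
  pull_request:                       # fast path: quick suites only, sharded
  schedule:
    - cron: "0 2 * * *"               # nightly: full suite, including heavy tests

jobs:
  quick:
    if: github.event_name == 'pull_request'
    runs-on: ubuntu-latest
    strategy:
      matrix:
        shard: [0, 1, 2, 3]           # four shards cut wall-clock time roughly 4x
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - uses: actions/cache@v4
        with:
          path: ~/.cache/pip
          key: pip-${{ runner.os }}-${{ hashFiles('**/requirements*.txt') }}
      - run: pip install -r requirements.txt
      - run: pytest -m "not slow" --num-shards 4 --shard-id ${{ matrix.shard }}
  full:
    if: github.event_name == 'schedule'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt
      - run: pytest                   # everything, including the slow suites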