Who this is for
- Platform Engineers who build base images and CI/CD pipelines.
- Backend Engineers shipping services to Kubernetes or ECS.
- SRE/SecOps reducing image attack surface and supply-chain risk.
Prerequisites
- Basic Docker knowledge: images, containers, Dockerfile, docker build/run.
- Command-line comfort (bash/sh).
- Optional: familiarity with your app stack (Go, Node.js, or Python).
Why this matters
In real platforms, containers are the delivery unit. Poorly built images cause slow deployments, runtime breakages, security incidents, and high costs. As a Platform Engineer, you will:
- Create secure base images and golden Dockerfile patterns.
- Enforce non-root containers and minimal images in CI.
- Scan and fix vulnerabilities before images hit production.
- Ship reproducible builds with SBOMs and signatures.
Concept explained simply
Container build and hardening is about producing small, predictable, and safe images. You keep only what you need to run, drop everything else, and make it difficult for attackers (or mistakes) to cause harm.
Mental model
Think of your image as a sealed lunch box for your app. You want:
- Only the food you eat (runtime files), not the whole kitchen (build tools).
- A clearly labeled box (tags, digests, SBOM) so you know what's inside.
- A child-safe lid (non-root, least privilege) so accidents don't spread.
Core principles
Use minimal, trusted base images
- Prefer small base images (alpine, distroless, scratch if static).
- Pin by digest to avoid surprises: FROM alpine:3.19@sha256:...
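One way to apply digest pinning mechanically is a small script that rewrites FROM lines. The sketch below is illustrative only: pin_from and resolve_digest are hypothetical helper names, the digest in the demo is a placeholder rather than a real value, and resolve_digest assumes Docker is installed and can reach the registry.

```shell
#!/bin/sh
# pin_from FILE REF DIGEST
# Rewrites every "FROM REF ..." line in FILE to "FROM REF@DIGEST ...".
pin_from() {
    file=$1
    ref=$2
    digest=$3
    awk -v ref="$ref" -v digest="$digest" '
        toupper($1) == "FROM" && $2 == ref { $2 = ref "@" digest }
        { print }
    ' "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# resolve_digest REF
# Prints the registry digest (sha256:...) of a pulled image. Needs Docker and
# network access, so it is defined here but not called in the demo below.
resolve_digest() {
    docker pull -q "$1" >/dev/null &&
        docker image inspect --format '{{index .RepoDigests 0}}' "$1" | sed 's/^.*@//'
}

# Demo on a throwaway Dockerfile with a placeholder digest (not a real one).
demo=$(mktemp)
printf 'FROM alpine:3.19 AS build\nRUN echo hi\n' > "$demo"
pin_from "$demo" "alpine:3.19" "sha256:<digest-from-registry>"
cat "$demo"
rm -f "$demo"
```

In a real workflow you would pass the output of resolve_digest as the third argument and review the resulting diff before committing it.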
Multi-stage builds
- Build in one stage with tools, copy only final artifacts into a clean runtime stage.
- Keeps images small and reduces vulnerabilities.
Run as non-root
- Declare USER in the Dockerfile (e.g., USER 10001).
- Ensure folders your app writes to are owned by that user.
Pin and verify dependencies
- Use lockfiles (package-lock.json, go.sum, poetry.lock).
- Verify checksums for downloads; avoid curl | bash patterns.
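As an alternative to piping a download straight into a shell, you can save the file, compare its SHA-256 digest with a value published by the vendor, and only then use it. A minimal sketch, assuming sha256sum is available (it is on most Linux base images); verify_sha256 is a hypothetical helper name and the demo uses a local stand-in file instead of a real download.

```shell
#!/bin/sh
# verify_sha256 FILE EXPECTED_HEX
# Succeeds only when FILE's SHA-256 digest equals EXPECTED_HEX.
verify_sha256() {
    file=$1
    expected=$2
    # sha256sum -c reads "<digest>  <path>" lines (two spaces) from stdin.
    printf '%s  %s\n' "$expected" "$file" | sha256sum -c - >/dev/null 2>&1
}

# Demo: a local file stands in for a downloaded installer.
download=$(mktemp)
printf 'hello\n' > "$download"
published=$(sha256sum "$download" | cut -d' ' -f1)   # in practice: copied from the vendor's release page
tampered=$(printf '%064d' 0)                         # 64 zeros, guaranteed not to match

if verify_sha256 "$download" "$published"; then
    echo "checksum ok: safe to use"
fi
if ! verify_sha256 "$download" "$tampered"; then
    echo "checksum mismatch: refusing to run"
fi
rm -f "$download"
```

The same function body works inside a Dockerfile RUN step, with the expected digest hard-coded next to the download URL.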
Avoid secrets in images
- Never bake API keys or tokens into layers. ARG and ENV are not secret stores: their values stay readable in the image config or build history.
- Use runtime secret stores or orchestrator features.
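To see why ARG and ENV are unsuitable for secrets, you can inspect an image's build history and look for credential-style assignments. The sketch below is a crude heuristic, not a replacement for a real secret scanner; find_secret_hints and scan_image_history are hypothetical helper names, and the sample values are made up.

```shell
#!/bin/sh
# find_secret_hints
# Reads text on stdin and prints lines that look like NAME=value assignments
# with a credential-style name. A rough heuristic only.
find_secret_hints() {
    grep -Ei '(api[_-]?key|token|secret|password)[a-z0-9_]*=[^ ]+' || true
}

# scan_image_history IMAGE
# Feeds the layer-creating commands of IMAGE through the heuristic above.
# Needs Docker, so it is defined here but not called in the demo.
scan_image_history() {
    docker history --no-trunc --format '{{.CreatedBy}}' "$1" | find_secret_hints
}

# Demo on sample history text (made-up values).
printf '%s\n' \
    'ENV API_TOKEN=abc123' \
    'RUN |1 NPM_TOKEN=xyz789 /bin/sh -c npm ci' \
    'COPY . /app' |
    find_secret_hints
```

The demo prints the two lines that carry a token and skips the COPY line, which is what anyone with pull access to the image could recover as well.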
.dockerignore and layer hygiene
- Exclude .git, node_modules, tests, and local files to speed builds and avoid leaks.
- Order Dockerfile steps to maximize cache hits.
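A CI job can check that a .dockerignore exists and lists the entries your team considers mandatory. A minimal sketch under simple assumptions: missing_ignores is a hypothetical helper, it matches exact lines only (so patterns such as **/node_modules would not count), and the required entries are passed in by the caller.

```shell
#!/bin/sh
# missing_ignores FILE ENTRY...
# Prints each ENTRY that does not appear as an exact line in FILE.
# A missing FILE is treated as if every entry were absent.
missing_ignores() {
    file=$1
    shift
    for entry in "$@"; do
        grep -qxF -- "$entry" "$file" 2>/dev/null || echo "$entry"
    done
}

# Demo: a throwaway file stands in for a project's .dockerignore.
demo=$(mktemp)
printf 'node_modules\nnpm-debug.log\n' > "$demo"
missing=$(missing_ignores "$demo" .git node_modules)
if [ -n "$missing" ]; then
    echo "add to .dockerignore: $missing"
fi
rm -f "$demo"
```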
Health and least privilege
- Add a HEALTHCHECK when feasible.
- At runtime, use --read-only and --cap-drop=ALL, adding only the few caps you need.
Scan, SBOM, and sign
- Scan images for vulnerabilities before pushing.
- Generate an SBOM and store it with the artifact.
- Sign images for provenance and integrity.
Worked examples
Example 1 — Go service with multi-stage and distroless
Build a static Go binary and ship it in a tiny distroless image.
# Dockerfile
FROM golang:1.22-alpine AS build
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -ldflags="-s -w" -o app ./cmd/app
# Final: minimal runtime (non-root)
FROM gcr.io/distroless/static:nonroot
WORKDIR /app
COPY --from=build /src/app /app/app
USER nonroot:nonroot
ENTRYPOINT ["/app/app"]
# Distroless has no shell or curl, so expose a /health HTTP endpoint in the app and let the orchestrator probe it.
- Security gains: small attack surface, no shell, non-root.
- Performance: very small image, fast pull/start.
Example 2 — Node.js API with .dockerignore and non-root
# .dockerignore
node_modules
npm-debug.log
.git
Dockerfile
.dockerignore
# Dockerfile
FROM node:20-alpine AS deps
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
FROM node:20-alpine AS build
WORKDIR /app
# The build step usually needs devDependencies (compilers, bundlers), so install the full set here.
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
# Final runtime: distroless Node.js image (no shell or package manager)
FROM gcr.io/distroless/nodejs20-debian12
WORKDIR /app
ENV NODE_ENV=production
COPY --from=build /app/dist ./dist
COPY package.json ./
COPY --from=deps /app/node_modules ./node_modules
USER 10001
EXPOSE 3000
CMD ["dist/server.js"]
- Security gains: no dev deps in final image, non-root user.
- Reliability: deterministic install with lockfile and npm ci.
Example 3 — NGINX static site, non-root, pinned base
# Dockerfile
FROM nginx:1.25-alpine@sha256:... as base
# Add curl only in a separate stage if you need health checks during build
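# Prepare non-root runtime. The official nginx image already defines an unprivileged
# nginx user (UID/GID 101), so reuse it; adduser -u 101 would fail because the UID is taken.
RUN mkdir -p /var/cache/nginx /var/run /usr/share/nginx/html && \
chown -R 101:101 /var/cache/nginx /var/run /usr/share/nginx/html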
COPY nginx.conf /etc/nginx/nginx.conf
COPY ./public /usr/share/nginx/html
USER 101
EXPOSE 8080
HEALTHCHECK --interval=30s --timeout=3s --start-period=10s \
CMD wget -qO- http://127.0.0.1:8080/ >/dev/null || exit 1
CMD ["nginx","-g","daemon off;"]
nginx.conf should listen on 8080 and not require privileged ports.
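Runtime tip: run with a read-only root filesystem and least capabilities. NGINX still needs to write its cache and PID files, so mount tmpfs over those paths (adjust them to match your nginx.conf):
docker run --read-only --tmpfs /var/cache/nginx --tmpfs /var/run --cap-drop=ALL -p 8080:8080 mysite:latest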
Reproducibility, scanning, and SBOM
- Reproducible builds: pin base images by digest, use lockfiles, avoid time-dependent build steps.
- Scan every build: use a trusted image scanner in CI and fail builds above a severity threshold.
- Generate SBOM: produce an SPDX or CycloneDX document and store it with the artifact.
- Sign images: sign the image digest so deployers trust source and integrity.
Typical CI checks to add
- docker build of the runtime stage (for example, --target final if that stage is named final) with layer caching enabled.
- Image scan step: fail on High/Critical vulns unless explicitly allowed.
- SBOM generation and upload as a build artifact.
- Image signature; push only signed images to production registry.
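The checks above could be wired together roughly as follows. This is a sketch, not a reference pipeline: the text does not name specific tools, so Trivy (scanner), Syft (SBOM), and Cosign (signing) are assumptions, as are the image reference, the key file name, and a Dockerfile stage called final. The script defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch of a CI image gate. Assumed tools: docker, trivy, syft, cosign.
set -eu

IMAGE="${IMAGE:-registry.example.com/team/app:1.0.0}"   # hypothetical image reference
DRY_RUN="${DRY_RUN:-1}"                                  # 1 = print commands only, 0 = execute

# run CMD...: print the command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

ci_image_gate() {
    # 1. Build the runtime stage (assumes a stage named "final").
    run docker build --target final -t "$IMAGE" .
    # 2. Fail the job when High or Critical vulnerabilities are found.
    run trivy image --exit-code 1 --severity HIGH,CRITICAL "$IMAGE"
    # 3. Write an SPDX SBOM to upload as a build artifact.
    run syft "$IMAGE" -o spdx-json=sbom.spdx.json
    # 4. Cosign signs images that already live in a registry, so push first.
    run docker push "$IMAGE"
    # 5. Sign; real pipelines should pass the pushed digest (repo@sha256:...), not a tag.
    run cosign sign --key cosign.key "$IMAGE"
}

ci_image_gate
```

Because set -e stops at the first failing step, a scan failure prevents the push and the signature. Promotion to the production registry can then be gated on verifying that signature.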
Exercises
Do these hands-on tasks. Timebox each to 25–35 minutes.
Exercise 1 — Multi-stage Go image under 20 MB
- Create a tiny Go HTTP server exposing /health.
- Write a multi-stage Dockerfile that compiles a static binary.
- Use a distroless or scratch final stage and run as non-root.
- Build and run. Ensure it starts and serves /health.
- Success criteria: image <= 20 MB, USER not root, responds 200 on /health.
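The success criteria can be checked with a short script once your image is built. A minimal sketch, assuming Docker and curl are installed; the helper names are hypothetical, and 20 MB is interpreted as 20,000,000 bytes (decimal), which is how docker images reports sizes. Only the pure helpers run in the demo; check_image and check_health need a built image and a running container.

```shell
#!/bin/sh
MAX_BYTES=20000000   # 20 MB, decimal

# is_non_root USER_FIELD
# USER_FIELD is the image's Config.User value; empty means root by default.
is_non_root() {
    case "$1" in
        ""|0|root|0:*|root:*) return 1 ;;
        *) return 0 ;;
    esac
}

# size_ok BYTES: succeeds when BYTES is within the limit.
size_ok() {
    [ "$1" -le "$MAX_BYTES" ]
}

# check_image IMAGE: inspects user and size of a locally built image.
check_image() {
    user=$(docker image inspect --format '{{.Config.User}}' "$1") || return 1
    size=$(docker image inspect --format '{{.Size}}' "$1") || return 1
    is_non_root "$user" || { echo "FAIL: image runs as root"; return 1; }
    size_ok "$size" || { echo "FAIL: image is $size bytes"; return 1; }
    echo "OK: user=$user size=$size"
}

# check_health URL: succeeds when the endpoint answers with HTTP 200.
check_health() {
    [ "$(curl -s -o /dev/null -w '%{http_code}' "$1")" = "200" ]
}

# Demo of the pure helpers only.
is_non_root "nonroot:nonroot" && echo "nonroot:nonroot passes the user check"
is_non_root "" || echo "an empty USER means root, which fails the check"
```

A typical use would be check_image myapp:dev followed by check_health http://127.0.0.1:8080/health, where the image name and port are whatever you chose in the exercise.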
Exercise 2 — Harden a Node.js Dockerfile
- Start from an insecure Dockerfile (root user, copies everything, dev deps included).
- Add .dockerignore to exclude junk and secrets.
- Use multi-stage, npm ci, and NODE_ENV=production.
- Switch to a non-root user and expose only needed port.
- Add a HEALTHCHECK if feasible.
- Success criteria: smaller image, non-root, no dev deps, passes a simple health probe.
Pre-push checklist
- Base image pinned by digest (or at least a narrow version tag).
- Multi-stage used; no build tools in final image.
- Non-root USER declared; writable dirs owned correctly.
- .dockerignore present and effective.
- No secrets in Dockerfile, ENV, or copied files.
- Healthcheck defined or app exposes a health endpoint.
- Image scanned; no unapproved Critical/High findings.
- SBOM generated and stored; image signed if your process supports it.
Common mistakes and self-check
- Using latest tags. Fix: pin versions/digests and update intentionally.
- Shipping build tools in the final image. Fix: multi-stage and copy only artifacts.
- Running as root. Fix: create a user and set USER; adjust permissions.
- Copying the entire context. Fix: add .dockerignore and explicit COPY lines.
- Busting the build cache on every change. Fix: order Dockerfile steps so rarely changing inputs come first (COPY lockfiles before source).
- Unverified downloads. Fix: use checksums and verified sources.
- Secrets in layers. Fix: use runtime secrets; never bake them into images.
- No scanning. Fix: add scanner step in CI and fail on high severity.
Self-check prompts
- Can I explain every file present in the final image?
- Can this container start with read-only root and still work?
- What happens if the network is restricted? Did I remove unnecessary tools?
- If the base image updates, do I get notified and rebuild safely?
Mini challenge — 15-minute hardening review
Pick one of your services. In 15 minutes, make two improvements.
- Replace the base image with a pinned minimal variant.
- Add a non-root user and fix permissions.
Stretch goals
- Add a HEALTHCHECK.
- Generate an SBOM and attach it to the artifact output.
Practical projects
- Golden Dockerfile library: Create three templates (Go, Node, Python) with multi-stage, non-root, and healthchecks. Share with your team.
- CI guardrails: Add image scanning and SBOM generation to a pipeline. Fail on critical findings and store SBOM artifacts.
- Base image refresh bot: Script that proposes PRs to bump base image digests weekly and triggers rebuilds.
- Runtime policy test: Prove your service runs with --read-only and --cap-drop=ALL, documenting any required exceptions.
Learning path
- Before this: Container basics, Dockerfiles, image layers, registries.
- This subskill: Build patterns, hardening practices, SBOM, scanning.
- Next: Image distribution and registry policies; Runtime security in Kubernetes (Pod Security, PSP replacements, seccomp/AppArmor); Supply-chain security (signing, attestations).
Next steps
- Refactor one production service to follow these patterns.
- Add scanner + SBOM to CI for at least one repo.
- Document a team-wide Dockerfile checklist based on this page.