Why this matters
As a Machine Learning Engineer, you ship models as services, batch jobs, or CLIs. Docker images are your deployment units. Clear, consistent tags let your team roll forward or roll back safely, reproduce results, and keep registries clean. This reduces downtime, prevents model mix-ups, and speeds up CI/CD.
- Release inference servers with semantic versions (e.g., 1.4.2)
- Pin training jobs to known-good images via digests
- Clean old images to save storage and avoid confusion
Concept explained simply
What is an image tag?
A tag is a human-readable label pointing to a specific image version. Multiple tags can point to the same image. Example: myorg/infer:1.2.0 and myorg/infer:latest might be the same image today, but they can diverge later when one of the tags is moved.
Digest vs tag
Digest (sha256:...) is content-addressed and immutable. Tags are pointers; they can be moved. Use tags for humans, digests for guarantees.
Mental model
- Tag = sticky note you can move
- Digest = fingerprint you cannot change
- Repository = folder of images (with many tags)
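To make the tag/digest distinction concrete, here is a minimal sketch that classifies an image reference by how it is pinned. The function name ref_kind is hypothetical, and the sha256 value is left elided as a placeholder; the function only inspects the string and never calls docker.

```shell
# Hypothetical helper: report how an image reference is pinned.
ref_kind() {
  case "$1" in
    *@sha256:*) echo digest ;;      # content-addressed, cannot move
    *)
      # A tag, if present, follows a colon in the last path component
      # (a colon earlier in the reference would be a registry port).
      last="${1##*/}"
      case "$last" in
        *:*) echo tag ;;            # movable pointer
        *)   echo untagged ;;       # docker falls back to :latest
      esac
      ;;
  esac
}

ref_kind myorg/infer:1.2.0          # prints: tag
ref_kind "myorg/infer@sha256:..."   # prints: digest
ref_kind myorg/infer                # prints: untagged
```

A check like this could gate production manifests so that only references reported as digest are deployed.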
Pro tip: naming and versioning
- Prefer semantic versioning: MAJOR.MINOR.PATCH (e.g., 0.3.1)
- Add build metadata: 0.3.1-gitsha or 0.3.1-cpu
- Avoid relying on latest in production
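The naming scheme above can be sketched as a small helper that composes the references for one release. The function name release_tags and the sample values are assumptions for illustration; the function only prints names and does not run docker.

```shell
# Hypothetical helper: print the image references for one release.
# $1 = repository, $2 = MAJOR.MINOR.PATCH version, $3 = metadata suffix
# (for example a short git commit SHA, or a variant such as "cpu").
release_tags() {
  printf '%s:%s\n' "$1" "$2"
  printf '%s:%s-%s\n' "$1" "$2" "$3"
}

release_tags myorg/infer 0.3.1 b3a1c9e
# prints:
# myorg/infer:0.3.1
# myorg/infer:0.3.1-b3a1c9e
```

Inside a git checkout, the suffix could come from git rev-parse --short HEAD. Note that the helper deliberately never emits latest.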
Core commands you will use
# List images
docker images
# Build with a tag
docker build -t myorg/infer:0.1.0 .
# Add/retag an existing image
docker tag myorg/infer:0.1.0 myorg/infer:latest
# Push/Pull
docker push myorg/infer:0.1.0
docker pull myorg/infer:0.1.0
# Inspect tags and digests
docker inspect --format='{{.RepoTags}} {{.RepoDigests}}' myorg/infer:0.1.0
# Remove a tag (the image itself is deleted once its last tag is removed)
docker rmi myorg/infer:0.1.0
# Remove dangling (untagged) images
docker image prune -f
Worked examples
Example 1: Build and tag your inference image
- Build the image:
docker build -t mlteam/infer:0.4.0 .
- Create an additional tag for the same image:
docker tag mlteam/infer:0.4.0 mlteam/infer:0.4.0-b3a1c9e
- Optionally, move latest:
docker tag mlteam/infer:0.4.0 mlteam/infer:latest
- Check:
docker images | grep mlteam/infer
Example 2: Pin by digest for training reproducibility
- Get digest of a tag:
docker pull mlteam/infer:0.4.0
docker inspect --format='{{index .RepoDigests 0}}' mlteam/infer:0.4.0
- Use that digest in a job definition:
mlteam/infer@sha256:... # immutable
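When a tool reports only the bare digest (docker push, for instance, prints a digest: sha256:... line), the pinned reference has to be assembled by hand. The sketch below shows one way to do that; the function name pin_ref and the registry host with port 5000 are assumptions for illustration, and the digest stays elided as in the example above.

```shell
# Hypothetical helper: turn "repo:tag" plus a bare digest into "repo@digest".
pin_ref() {
  ref="$1"
  digest="$2"
  last="${ref##*/}"            # last path component, e.g. infer:0.4.0
  case "$last" in
    *:*) repo="${ref%:*}" ;;   # drop the tag, keep any registry port
    *)   repo="$ref" ;;        # no tag present
  esac
  printf '%s@%s\n' "$repo" "$digest"
}

pin_ref mlteam/infer:0.4.0 "sha256:..."
# prints: mlteam/infer@sha256:...
pin_ref registry.example.com:5000/mlteam/infer:0.4.0 "sha256:..."
# prints: registry.example.com:5000/mlteam/infer@sha256:...
```

The tag is stripped only from the last path component, so a colon that belongs to a registry port is left alone.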
Example 3: Retag and push to a registry namespace
- Retag local image for your registry:
docker tag mlteam/infer:0.4.0 registry.example.com/mlteam/infer:0.4.0
- Push:
docker push registry.example.com/mlteam/infer:0.4.0
Example 4: Cleanup unused images
- Remove images not used by any container and created more than 168 hours (7 days) ago:
docker image prune -a --filter "until=168h"
- Remove specific tag or ID:
docker rmi mlteam/infer:0.2.1
Safety notes for cleanup
- Ensure no running containers depend on the image
- In CI, pruning is safe on ephemeral runners
- On shared hosts, coordinate with teammates
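One cautious way to approach cleanup is to compute the removal list first and only print the commands, so a teammate can review them before anything is deleted. The sketch below assumes plain MAJOR.MINOR.PATCH tags and a sort that supports -V (version sort), as GNU coreutils does; the function name tags_to_drop and the sample tag list are hypothetical.

```shell
# Hypothetical helper: read version tags on stdin and print every tag
# except the newest $1 of them. Nothing is deleted here.
tags_to_drop() {
  sort -V | awk -v keep="$1" '
    { t[NR] = $0 }
    END { for (i = 1; i <= NR - keep; i++) print t[i] }'
}

# Keep the three newest releases; list the rest.
printf '%s\n' 0.2.1 0.10.0 0.4.0 0.3.0 0.9.2 | tags_to_drop 3
# prints:
# 0.2.1
# 0.3.0

# Dry run: print the docker commands instead of executing them.
printf '%s\n' 0.2.1 0.10.0 0.4.0 0.3.0 0.9.2 | tags_to_drop 3 |
  sed 's|^|docker rmi mlteam/infer:|'
# prints:
# docker rmi mlteam/infer:0.2.1
# docker rmi mlteam/infer:0.3.0
```

Version sort matters here: a plain lexical sort would place 0.10.0 before 0.2.1 and drop the wrong releases.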
Who this is for
- ML engineers shipping models as containers
- Data scientists moving notebooks to production jobs
- DevOps/Platform engineers supporting ML workloads
Prerequisites
- Basic Docker CLI (build, run)
- Familiarity with Dockerfiles and .dockerignore
- Semantic versioning basics
Learning path
- Tagging and listing images (this lesson)
- Multi-stage builds and slimming images
- GPU/CUDA base images and compatibility
- Private registries, auth, and CI/CD tagging strategies
- Security scanning and SBOMs
Common mistakes and self-check
- Using only latest in prod. Fix: deploy immutable tags/digests and move latest only after success.
- Overwriting a version tag. Fix: treat version tags as immutable; create a new patch version instead.
- Not pinning base images. Fix: use python:3.10-slim@sha256:... or a specific version tag.
- Huge images due to missing .dockerignore. Fix: ignore data, venvs, caches, and notebooks not needed at runtime.
- Confusing repository names. Fix: use lowercase and team/repo naming consistently (e.g., mlteam/infer).
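Two of these mistakes (deploying latest, inconsistent names) can be caught mechanically before a push. The following is a minimal sketch under stated assumptions: the function name valid_release_ref is hypothetical, and the pattern accepts only the simple lowercase team/repo form with a MAJOR.MINOR.PATCH tag and an optional suffix, so references with a registry host would need a looser pattern.

```shell
# Hypothetical check: succeed only for lowercase team/repo names with a
# MAJOR.MINOR.PATCH tag, optionally followed by a -suffix (SHA, "cpu", ...).
valid_release_ref() {
  # LC_ALL=C keeps [a-z] from matching uppercase letters in some locales.
  printf '%s\n' "$1" | LC_ALL=C grep -Eq \
    '^[a-z0-9._-]+/[a-z0-9._-]+:[0-9]+\.[0-9]+\.[0-9]+(-[A-Za-z0-9.]+)?$'
}

for ref in mlteam/infer:0.4.0 mlteam/infer:0.4.0-b3a1c9e \
           mlteam/infer:latest MLTeam/infer:0.4.0; do
  if valid_release_ref "$ref"; then
    echo "ok      $ref"
  else
    echo "reject  $ref"
  fi
done
# prints:
# ok      mlteam/infer:0.4.0
# ok      mlteam/infer:0.4.0-b3a1c9e
# reject  mlteam/infer:latest
# reject  MLTeam/infer:0.4.0
```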
Self-check:
- You can show three distinct tags pointing to one image ID
- You can pull by digest and verify the same image ID
- Your cleanup removes only images you intend to drop
Practical projects
- Inference microservice image: Build, tag with semver and git SHA, push to a registry, and document rollback by digest.
- Training job image: Pin the base image and requirements, tag with a dataset version suffix, and run a reproducible training job.
- Cleanup and retention: Write a script to prune images older than 2 weeks except the last 3 releases.
Exercises
Exercise 1 (ex1): Tag, retag, and inspect
- Build an image:
docker build -t ds/infer:0.1.0 .
- Create two more tags:
docker tag ds/infer:0.1.0 ds/infer:0.1.0-abc123
docker tag ds/infer:0.1.0 ds/infer:latest
- Inspect tags and digest:
docker inspect --format='{{.RepoTags}} {{.RepoDigests}}' ds/infer:0.1.0
Expected outcome
RepoTags: [ds/infer:0.1.0 ds/infer:0.1.0-abc123 ds/infer:latest]
RepoDigests: [ds/infer@sha256:...] (may be empty, [], until the image has been pushed to or pulled from a registry)
All tags point to the same image ID.
Exercise 2 (ex2): Push, pull by digest, and clean
- Retag for your registry namespace (replace with yours):
docker tag ds/infer:0.1.0 registry.local/ds/infer:0.1.0
- Push and capture digest:
docker push registry.local/ds/infer:0.1.0
docker inspect --format='{{index .RepoDigests 0}}' registry.local/ds/infer:0.1.0
- Pull by digest:
docker pull registry.local/ds/infer@sha256:...
- Clean dangling (untagged) local images:
docker image prune -f
- Checklist:
- Digest recorded and used
- Pull by digest succeeds
- No unintended images removed
Mini challenge
You have two tags: mlorg/feat-extract:1.2.0 and mlorg/feat-extract:latest. After a failed deploy, roll back production to the previous digest while keeping latest for continued testing. Outline the exact commands you would run.
Hint
Find the previous digest from history (or a release note) and pull by digest. Then either retag it with a stable tag (e.g., stable) or point your deployment at the digest directly.
Next steps
- Adopt multi-stage builds and caching for faster CI
- Use immutable digests in production manifests
- Automate tagging in CI with semver and commit SHAs