Why this matters
As an MLOps Engineer, you ship training and inference code inside containers. Unpinned or loosely pinned dependencies can silently change and break builds, produce different model behavior, or expose security issues. Lockfiles give you deterministic, auditable builds. That means faster rollbacks, reproducible experiments, and stable production images.
Real tasks:
- Reproduce a previous training run exactly
- Build minimal, secure inference images that install the same packages every time
- Roll back to a known-good dependency set when a library releases a breaking patch
Who this is for
- Engineers packaging ML apps in Docker (training, batch, or serving)
- Data scientists handing off projects to production
- Teams standardizing environments across CI, dev, and prod
Prerequisites
- Basic Docker knowledge (Dockerfile, layers, caching)
- Familiarity with Python packaging (pip, virtualenv) or conda
- Comfort running commands in a terminal
Concept explained simply
Pinning means choosing exact versions for all packages so they do not change unexpectedly. A lockfile is a snapshot of every package and version (including transitive dependencies) that your project uses. Build images from a lockfile and you get the same environment every time.
A helpful mental model
Think of your project dependencies as a recipe. A requirements.txt without exact versions is like a recipe that says “use some flour.” A lockfile says “use 500g of Brand X flour, lot 1234.” The result is repeatable and testable.
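To make that concrete, here is an illustrative before/after; the version number and hash below are placeholders, not values to copy:
Loose spec (requirements.in, human-edited):
scikit-learn>=1.3
Locked entry (requirements.txt, generated by a tool such as pip-compile):
scikit-learn==1.3.2 \
    --hash=sha256:<exact-hash-of-the-downloaded-file>
The locked form names one specific release and the exact artifact pip is allowed to install.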
Core practices you will use
- Generate a lockfile from a minimal spec: keep a human-edited file (for example, requirements.in or pyproject.toml) and compile it into a machine lock (requirements.txt with hashes or poetry.lock).
- Install from the lockfile in Docker: copy only the lockfile before running install to maximize layer cache.
- Use hashes when possible: pip can verify file hashes to prevent supply-chain surprises.
- Separate dev and prod: keep dev-only dependencies out of production images.
- Update on purpose: refresh the lock on a schedule or when needed, never silently during a production build.
- Pin OS packages too: apt, apk, or conda packages should also be versioned when used in images.
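For that last point, here is a minimal sketch of OS-level pinning on a Debian-based image. The version strings are placeholders; look up the real ones for your base image with apt-cache madison <package>:
FROM python:3.11-slim
# Pin exact apt versions so system libraries change only when you update them deliberately
RUN apt-get update && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
        libgomp1=12.2.0-14 \
        libopenblas0=0.3.21+ds-4 && \
    rm -rf /var/lib/apt/lists/*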
Tip: fast and deterministic Docker builds
- Order your Dockerfile so that the lockfile is copied before source code.
- Install dependencies from the lockfile, then copy app code. This preserves cached layers when code changes.
- Use multi-stage builds to avoid shipping build tools (pip-tools, compilers) in the final image.
Worked examples
Example 1: pip-tools and hash-locked requirements for a Python service
Goal: produce a deterministic, hash-verified install inside a Docker image.
Files and commands
requirements.in (human-maintained):
fastapi
uvicorn[standard]
scikit-learn==1.3.*
numpy>=1.25,<2.0
Compile the lock in a controlled step (locally or in CI) and commit the output next to requirements.in, so the image build itself never resolves versions:
pip install pip-tools
pip-compile --generate-hashes --output-file requirements.txt requirements.in
Dockerfile:
FROM python:3.11-slim
WORKDIR /app
ENV PIP_NO_CACHE_DIR=1
COPY requirements.txt ./requirements.txt
# Verify integrity with hashes
RUN pip install --require-hashes -r requirements.txt
COPY app/ ./app
CMD ["python", "-m", "app.main"]
Result: requirements.txt contains exact versions and hashes, lives in version control next to requirements.in, and pip installs exactly those artifacts every time. The image build never resolves versions itself, so rebuilds change only when you recompile the lock on purpose.
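When you do want newer versions, refresh the lock in a controlled step, review the diff, and rebuild; with pip-tools that can look like this:
# Re-resolve everything to the newest versions allowed by requirements.in
pip-compile --upgrade --generate-hashes --output-file requirements.txt requirements.in
# Or bump a single package and leave the rest untouched
pip-compile --upgrade-package scikit-learn --generate-hashes --output-file requirements.txt requirements.in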
Example 2: Poetry project exported to a runtime lock
Goal: use Poetry for dependency resolution but ship a container that only needs pip.
Files and commands
pyproject.toml and poetry.lock are maintained locally. In a multi-stage Docker build, export a runtime lockfile and install with pip:
FROM python:3.11-slim AS builder
WORKDIR /app
RUN pip install --upgrade pip setuptools wheel poetry
COPY pyproject.toml poetry.lock ./
# Export exact versions to a pip-compatible file
RUN poetry export --format=requirements.txt --without-hashes --output=requirements.txt
FROM python:3.11-slim
WORKDIR /app
COPY --from=builder /app/requirements.txt ./requirements.txt
RUN pip install -r requirements.txt
COPY . .
CMD ["python", "-m", "app.main"]
Note: Poetry export pins the exact versions recorded in poetry.lock, so you get deterministic versions across builds. In newer Poetry releases the export command comes from the separate poetry-plugin-export plugin, so pin the Poetry version you install in the builder stage and add the plugin if your version needs it.
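If you also want hash verification with Poetry, one option is to keep the hashes in the export (they are included unless you pass --without-hashes) and enforce them at install time; path and VCS dependencies do not export hashes, so this does not fit every project:
# Builder stage: export with hashes
RUN poetry export --format=requirements.txt --output=requirements.txt
# Runtime stage: refuse anything whose hash does not match the lock
RUN pip install --require-hashes -r requirements.txt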
Example 3: Conda explicit lock for platform-reproducible builds
Goal: lock exact conda package builds for Linux containers.
Files and commands
environment.yml (human-maintained):
name: mlservice
dependencies:
  - python=3.10
  - scikit-learn=1.3
  - numpy=1.26
  - pip
Create a platform-specific explicit lock (run this on linux-64, for example inside a builder container or CI job, and commit the resulting file):
# Solve the environment once from the human-maintained spec
conda env create -n lockenv -f environment.yml
# Record exact package builds and URLs as an explicit linux-64 lock
conda list -n lockenv --explicit > conda-linux-64.lock
Dockerfile snippet (runtime):
FROM mambaorg/micromamba:1.5.8
COPY --chown=$MAMBA_USER:$MAMBA_USER conda-linux-64.lock /tmp/conda.lock
RUN micromamba create -y -n appenv --file /tmp/conda.lock && \
    micromamba clean --all --yes
# Activate appenv via the image entrypoint instead of the default base env
ENV ENV_NAME=appenv
SHELL ["/usr/local/bin/_entrypoint.sh", "bash", "-lc"]
RUN python -V
Result: exact conda builds are installed for the target platform, ensuring reproducibility.
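A quick way to confirm the image really contains the locked builds (the tag mlservice:lock is just an example name):
docker build -t mlservice:lock .
docker run --rm mlservice:lock micromamba list -n appenv | head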
Exercises
Complete the tasks below, then open the Quick Test. The test is available to everyone; only logged-in users will have their progress saved.
Exercise 1 — Deterministic pip install with a lockfile
Mirror of the exercise card titled: Pin and build a Python service with pip-tools.
Exercise 2 — Recover from a breaking upstream release
Mirror of the exercise card titled: Fix a failing build by introducing a lockfile and verifying reproducibility.
Exercise completion checklist
- You produced a lockfile from a minimal spec
- Docker installs from the lockfile, not from unpinned specs
- Rebuild yields the same dependency versions (one way to verify this is sketched after this checklist)
- You verified that adding a lock fixed a flaky build
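One way to verify the rebuild item, assuming your image is tagged myservice (any tag works):
docker build --no-cache -t myservice:one .
docker run --rm myservice:one pip freeze > freeze-one.txt
docker build --no-cache -t myservice:two .
docker run --rm myservice:two pip freeze > freeze-two.txt
diff freeze-one.txt freeze-two.txt   # no output means identical dependency sets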
Common mistakes and how to self-check
- Installing from pyproject.toml or requirements.in in production. Self-check: Your Dockerfile should install from a compiled lock (requirements.txt, poetry export, or explicit conda lock).
- Copying source code before installing dependencies. Self-check: In Dockerfile, COPY the lockfile first, then run install, then copy source.
- Forgetting OS-level pins. Self-check: Ensure apt/apk packages are pinned and non-interactive installs are used.
- Letting CI update dependencies silently. Self-check: Builds should never mutate the lockfile; updates happen in a controlled step (see the CI guard sketched after this list).
- Skipping hashes when pip could verify them. Self-check: When using pip-compile, enable --generate-hashes and install with --require-hashes.
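To catch silent updates, a common CI guard is to recompile the lock and fail the job if the committed file no longer matches the spec; a sketch using pip-tools and git:
pip install pip-tools
pip-compile --generate-hashes --output-file requirements.txt requirements.in
git diff --exit-code requirements.txt   # non-zero exit (job failure) if the committed lock is stale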
Practical projects
- Training vs inference images: create separate lockfiles and Dockerfiles. Verify that both rebuild deterministically.
- Wheelhouse build: create a builder stage that downloads wheels based on a lockfile, then install from the local wheelhouse in the final image (sketched after this list).
- OS-level pinning: add pinned apt packages (like libgomp, libopenblas) and verify the image digest only changes when you update versions.
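A minimal sketch of the wheelhouse project, assuming a hash-locked requirements.txt is already committed; hashes are verified when the wheels are fetched in the builder stage:
FROM python:3.11-slim AS wheels
WORKDIR /app
COPY requirements.txt .
# Download or build every wheel named in the lock into a local directory
RUN pip wheel --wheel-dir /wheels --require-hashes -r requirements.txt

FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
COPY --from=wheels /wheels /wheels
# Install offline, only from the local wheelhouse
RUN pip install --no-index --find-links=/wheels -r requirements.txt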
Learning path
- Start: pin Python dependencies with pip-tools or Poetry export
- Next: add OS package pinning to the Dockerfile
- Then: lock per environment (dev, CI, prod) and separate optional extras
- Advanced: build multi-arch images and keep per-arch locks as needed
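For the advanced step, a sketch of a multi-arch build that selects a per-architecture lock; the requirements-amd64.txt / requirements-arm64.txt naming is just an assumed convention, and each file must be compiled on (or for) its own architecture:
# Dockerfile excerpt: TARGETARCH is set automatically by buildx (amd64, arm64, ...)
ARG TARGETARCH
COPY requirements-${TARGETARCH}.txt ./requirements.txt
RUN pip install --require-hashes -r requirements.txt
Build both architectures from the same Dockerfile:
docker buildx build --platform linux/amd64,linux/arm64 -t mlservice:multiarch .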
Mini challenge
Take a small ML inference service and reduce build time while keeping deterministic installs. Hint: export or compile a lockfile, copy it before install, and use a multi-stage build so tooling stays out of the final image.
Next steps
- Automate a scheduled lockfile refresh (for example, monthly) and run tests against it before merging
- Track image digests and lockfile diffs to audit changes
- Proceed to the Quick Test to confirm mastery