Who this is for
- Junior to mid-level Machine Learning Engineers who need consistent results across laptops, servers, and CI.
- Data Scientists preparing models to hand off to engineering.
- MLOps/Platform engineers standardizing environments for teams.
Prerequisites
- Basic Python and command line usage.
- Familiarity with pip or conda.
- Optional: basic Docker knowledge.
Why this matters
Environment reproducibility is your guarantee that code behaves the same on every machine. In the ML lifecycle, it prevents "works on my machine" issues and makes debugging, collaboration, and deployment predictable.
- Real task: Train a model on a GPU server today, retrain next month with the same results.
- Real task: Share a project with pinned dependencies so teammates can run it without surprises.
- Real task: Rebuild a serving container identically to match the model you validated.
Concept explained simply
Think of your ML project like baking. The recipe is your code. The ingredients are your libraries and system packages. The oven is your OS/CPU/GPU. Reproducibility means you precisely list ingredients, their brands and versions, and control the oven settings so the cake always turns out the same.
Mental model
- Pin: exact versions for everything you can (packages, base images, CUDA versions, Python version).
- Isolate: avoid polluting global systems (use venv/conda/containers).
- Record: save lockfiles, hashes, and the metadata needed to rerun (see the sketch after this list).
- Verify: recreate in a clean environment and compare outcomes.
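A minimal sketch of the "Record" step, using only the Python standard library: it captures the interpreter version, platform, and installed package versions in a JSON file you can commit next to a run. The run_metadata.json filename is just an example.

import importlib.metadata
import json
import platform
import sys

# Hypothetical output path; adjust to your project layout.
RECORD_PATH = "run_metadata.json"

metadata = {
    "python": sys.version,
    "platform": platform.platform(),
    "packages": {
        dist.metadata["Name"]: dist.version
        for dist in importlib.metadata.distributions()
    },
}

with open(RECORD_PATH, "w") as f:
    json.dump(metadata, f, indent=2, sort_keys=True)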
Core components of a reproducible ML environment
- Version pinning: Use exact versions (e.g., pandas==2.2.0). Prefer lockfiles (poetry.lock, requirements.txt with exact pins, conda lock files).
- Environment isolation: Python venv, conda environments, or containers (Docker).
- System-level dependencies: Capture OS libs (e.g., libgomp, gcc) via containers or documented setup steps.
- Base images: Pin Docker base images by version tag or digest for stability.
- Randomness control: Set seeds across libraries (random, numpy, torch) and use deterministic backends where possible.
- Data versioning: Reference immutable data snapshots (e.g., by checksum or versioned path) so training inputs don’t change unexpectedly.
- Config management: Centralize configuration in files (YAML/TOML) and avoid hidden environment differences (see the sketch after this list).
- Verification: Rebuild from scratch on a clean machine/CI and run sanity checks.
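As a sketch of the config-management idea, assuming PyYAML is installed and a hypothetical config.yaml committed to the repo, every tunable value is read from the file rather than from ad hoc environment variables:

import yaml  # PyYAML, assumed pinned in your requirements

# Hypothetical config.yaml committed to the repo, e.g.:
#   seed: 42
#   data_path: data/train.csv
with open("config.yaml") as f:
    config = yaml.safe_load(f)

SEED = config["seed"]
DATA_PATH = config["data_path"]
print(f"Using seed={SEED}, data={DATA_PATH}")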
Sample seed setup (deterministic where possible)
import os
import random

import numpy as np

SEED = 42
random.seed(SEED)
# Note: PYTHONHASHSEED is read at interpreter startup, so also set it in the
# shell or container environment if you rely on stable hash ordering.
os.environ["PYTHONHASHSEED"] = str(SEED)
np.random.seed(SEED)

try:
    import torch
    torch.manual_seed(SEED)
    torch.cuda.manual_seed_all(SEED)
    # Trade speed for determinism in cuDNN-backed operations.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
except ImportError:
    pass  # PyTorch not installed; stdlib and NumPy seeding still apply
Worked examples
Example 1: Reproducible Python venv with pinned requirements
- Create a clean environment:
  python -m venv .venv
  source .venv/bin/activate  # Windows: .venv\Scripts\activate
  python -m pip install --upgrade pip
- Pin versions in requirements.txt:
  echo "numpy==1.26.4" > requirements.txt
  echo "pandas==2.2.0" >> requirements.txt
  echo "scikit-learn==1.4.0" >> requirements.txt
- Install and freeze a lockfile (optional but helpful):
  pip install -r requirements.txt
  pip freeze > requirements.lock
- Recreate on another machine using the lockfile for exact transitive deps:
  pip install --no-deps --require-virtualenv -r requirements.lock
Tip: requirements.txt pins your direct deps; requirements.lock pins every package including transitive deps.
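One way to confirm that a rebuilt environment matches the lockfile is a short standard-library check like the sketch below; it assumes requirements.lock contains simple name==version lines as produced by pip freeze.

import importlib.metadata

mismatches = []
with open("requirements.lock") as f:
    for line in f:
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue  # skip comments, blank lines, and non-pinned entries
        name, expected = line.split("==", 1)
        try:
            installed = importlib.metadata.version(name)
        except importlib.metadata.PackageNotFoundError:
            mismatches.append(f"{name}: not installed")
            continue
        if installed != expected:
            mismatches.append(f"{name}: expected {expected}, found {installed}")

print("Environment matches lockfile." if not mismatches else "\n".join(mismatches))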
Example 2: Reproducible conda environment
- Create environment.yml with exact versions:
  name: mlproj
  channels:
    - conda-forge
  dependencies:
    - python=3.10.13
    - numpy=1.26.4
    - pandas=2.2.0
    - scikit-learn=1.4.0
- Create the environment:
  conda env create -f environment.yml
  conda activate mlproj
- Export an exact spec for rebuilds:
  conda list --explicit > conda-spec.txt
  conda create --name mlproj2 --file conda-spec.txt  # later, to rebuild
Note: The explicit spec file locks exact build strings and channels, improving reproducibility.
Example 3: Reproducible Docker image for training
- Create a Dockerfile with a pinned base image and explicit versions:
  FROM python:3.10-slim@sha256:REPLACE_WITH_DIGEST
  ENV PYTHONDONTWRITEBYTECODE=1 \
      PYTHONUNBUFFERED=1
  WORKDIR /app
  COPY requirements.txt /app/requirements.txt
  RUN pip install --no-cache-dir --upgrade pip \
      && pip install --no-cache-dir -r requirements.txt
  COPY train.py /app/train.py
  CMD ["python", "train.py"]
- requirements.txt (pinned):
  numpy==1.26.4
  pandas==2.2.0
  scikit-learn==1.4.0
- Build and run:
  docker build -t ml-train:1 .
  docker run --rm ml-train:1 python -c "import pandas, sklearn; print('OK')"
Pinning the base image by digest prevents upstream tag drift.
Reproducibility checklist
- [ ] Python version pinned (e.g., 3.10.13)
- [ ] Dependencies pinned exactly; lockfile saved
- [ ] Environment isolated (venv/conda/container)
- [ ] Random seeds set; deterministic flags used where possible
- [ ] Base image/version pinned (if using Docker)
- [ ] Data snapshot referenced immutably (path/checksum/version)
- [ ] Config file committed (YAML/TOML) instead of hidden env-only settings
- [ ] Rebuild verified on a clean machine/CI
How to verify quickly
- Create a fresh venv or new container.
- Install from lockfile or build from Dockerfile.
- Run a short script to print library versions and a hash of a small dataset/model artifact (a sketch follows this list).
- Compare against the expected versions and hash.
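A sketch of such a check, standard library only; ARTIFACT_PATH and EXPECTED_SHA256 are placeholders for your own artifact and its previously recorded digest.

import hashlib
import sys

ARTIFACT_PATH = "data/train.csv"            # hypothetical artifact to verify
EXPECTED_SHA256 = "REPLACE_WITH_EXPECTED"   # digest recorded from the validated run

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    # Hash the file in chunks so large artifacts don't need to fit in memory.
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

actual = sha256_of(ARTIFACT_PATH)
if actual != EXPECTED_SHA256:
    sys.exit(f"Hash mismatch: expected {EXPECTED_SHA256}, got {actual}")
print("Artifact hash matches the recorded value.")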
Common mistakes and self-checks
- Mistake: Using floating dependency ranges (e.g., pandas>=2.0). Self-check: Is every package pinned with == in the file that others will use?
- Mistake: Relying only on requirements.txt without a lockfile. Self-check: Do you have a frozen list (pip freeze or explicit conda spec)?
- Mistake: Forgetting system libs. Self-check: Can you rebuild in a minimal container successfully?
- Mistake: Not setting seeds. Self-check: Do repeated runs produce equivalent metrics within expected noise? (See the sketch after this list.)
- Mistake: Mixing global and project environments. Self-check: Does deactivating your venv break your run? It should; otherwise you're leaking globals.
- Mistake: Tag-only Docker base images (e.g., latest). Self-check: Is your base image pinned by version or digest?
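A minimal repeatability self-check, assuming only NumPy: run the same seeded computation twice and confirm the results agree. Bitwise equality is expected for repeated runs in the same environment; across different machines or library versions, compare within a tolerance instead.

import numpy as np

def seeded_run(seed: int = 42) -> float:
    # Stand-in for a training run: any computation fully driven by the seed.
    rng = np.random.default_rng(seed)
    sample = rng.normal(size=1000)
    return float(sample.mean())

first = seeded_run()
second = seeded_run()
assert first == second, f"Runs diverged: {first} vs {second}"
print(f"Repeated runs agree: {first:.6f}")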
Exercises
Exercise 1: Pin and recreate a Python environment
- Create a venv and upgrade pip.
- Create requirements.txt with exact versions for numpy, pandas, scikit-learn (choose versions that are compatible).
- Install, then freeze to requirements.lock.
- Delete the venv, recreate it, and install from requirements.lock.
- Run a script printing imported package versions and confirm they match.
What to keep for your own records
- requirements.txt and requirements.lock
- Console output showing version prints matching the lockfile
Exercise 2: Minimal reproducible Docker image
- Write a Dockerfile using a pinned Python slim image (specify a version tag; if you know the digest, pin it).
- Copy a pinned requirements.txt and install.
- Create app.py that prints library versions and a seeded random number.
- Build and run the container twice; verify identical outputs for versions and the random number.
What to keep for your own records
- Dockerfile and requirements.txt
- Two identical runs of the container output
Practical projects
- Project A: Reproducible training baseline. Create a small training script (e.g., Iris classification) with seeds, pinned env, and a script that prints a checksum of the trained model file. Verify the same checksum across two clean rebuilds.
- Project B: Data snapshot runner. Add a tiny dataset (or generate synthetic) and compute its SHA256 before training; fail the run if the checksum differs from a stored value.
- Project C: CI environment test. Write a script that rebuilds the env from lockfiles in a clean environment and runs a smoke test to confirm everything loads.
Learning path
- Environment isolation and pinning (this lesson)
- Data versioning and artifact tracking
- Experiment tracking and configuration management
- CI checks for reproducibility and drift detection
- Deployment with pinned containers and staged rollouts
Next steps
- Adopt a lockfile workflow in all new ML repos.
- Add a reproducibility check script that verifies versions, seeds, and data hashes.
- Integrate a CI job that rebuilds from scratch and runs a smoke test.
Mini challenge
Given an existing ML repo that only has requirements.txt with version ranges, make it reproducible. Deliver: pinned requirements, a lockfile, a seed setup, and a short README section describing how to rebuild and verify determinism.