Why this matters
As a Machine Learning Engineer, you regularly switch between projects that need different Python and library versions. Without isolation and careful version pinning, you get conflicts, broken notebooks, and unreproducible results. Virtual environments and dependency management solve this by making your experiments and deployments consistent and repeatable.
Real tasks:
- Reproduce a teammate’s training run exactly
- Update pandas safely without breaking feature engineering
- Prepare a clean inference environment for deployment
What you’ll be able to do after this lesson
- Create and activate virtual environments quickly
- Pin exact package versions and freeze them to files
- Rebuild the same environment on any machine
- Connect your venv to Jupyter for reliable notebooks
Concept explained simply
A virtual environment is a private folder containing its own Python interpreter and packages. It isolates your project from the system and other projects. Dependency management is the practice of choosing, pinning, and recording package versions so others (and future you) can recreate the same environment.
Mental model
- Think of each project as a "sealed lab" (the virtual environment). Everything you install lives in that lab.
- Your lab’s "shopping list" (requirements.txt) records exact package versions you used.
- Anyone can rebuild the lab by creating a new venv and installing from your shopping list.
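To see the isolation concretely, compare interpreter locations before and after activation (a sketch assuming a venv named .venv already exists in the project):
python -c "import sys; print(sys.prefix)"  # system interpreter location
source .venv/bin/activate
python -c "import sys; print(sys.prefix)"  # now points inside .venv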
Key tools you’ll use
- python -m venv: create isolated environments
- Activate scripts: source .venv/bin/activate (macOS/Linux), .venv\Scripts\activate (Windows)
- pip install / pip uninstall: add or remove packages
- pip freeze > requirements.txt: capture exact versions
- pip install -r requirements.txt: reproduce the environment
- Optional advanced: constraints.txt for controlled upgrades; ipykernel for Jupyter integration
Worked examples
Example 1 — Create a clean venv and install packages
- Make a project folder and venv:
mkdir ml-env-demo
cd ml-env-demo
python -m venv .venv
- Activate it:
- macOS/Linux:
source .venv/bin/activate
- Windows (PowerShell):
.venv\Scripts\Activate.ps1
- Install exact versions (stable example):
pip install numpy==1.26.4 scikit-learn==1.3.2
- Verify:
python -c "import numpy, sklearn; print(numpy.__version__, sklearn.__version__)"
Example 2 — Freeze and reproduce
- Freeze the environment:
pip freeze > requirements.txt
- Create a brand-new venv to simulate a teammate’s machine:
deactivate  # if active
python -m venv .venv2
source .venv2/bin/activate  # or .venv2\Scripts\Activate.ps1 on Windows
pip install -r requirements.txt
- Confirm versions match:
python -c "import numpy, sklearn; print(numpy.__version__, sklearn.__version__)"
Example 3 — Controlled upgrades with constraints
constraints.txt lets you pin transitive dependencies while you upgrade a top-level package.
- Create constraints from your current lock:
pip freeze > constraints.txt
- Edit constraints.txt to remove (or bump) its pandas line first; otherwise the old pandas pin in the constraints file conflicts with the upgrade.
- Upgrade a single package while holding the others steady:
pip install --upgrade pandas==2.1.4 -c constraints.txt
- Test your code, then regenerate requirements.txt if everything passes:
pip freeze > requirements.txt
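For reference, a freeze-generated constraints file is just a list of pinned lines. An illustrative excerpt (versions are examples, not prescriptions):
numpy==1.26.4
pandas==2.0.3
scikit-learn==1.3.2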
Example 4 — Use your venv in Jupyter
- Install ipykernel inside the active venv:
pip install ipykernel
- Register the kernel with a friendly name:
python -m ipykernel install --user --name ml-env-demo --display-name "Python (ml-env-demo)"
- Open Jupyter and select the "Python (ml-env-demo)" kernel to ensure the notebook uses this environment.
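If the kernel does not appear in Jupyter, you can list the registered kernels (assumes Jupyter is installed):
jupyter kernelspec list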
Who this is for
- Machine Learning Engineers who run multiple projects and experiments
- Data Scientists transitioning from notebooks to production workflows
- Anyone who needs repeatable training and inference environments
Prerequisites
- Basic command line usage
- Python installed (3.9+ recommended)
- pip available (bundled with recent Python)
Learning path
- Create and activate virtual environments
- Install specific package versions and verify
- Freeze to requirements.txt and rebuild elsewhere
- Use constraints.txt for controlled upgrades
- Connect venvs to Jupyter kernels
- Apply to a small ML project and share with a teammate
Exercises
Complete these hands-on tasks.
Exercise 1 — Clean venv, pin versions, print versions
- Create a folder named project-a, then a venv named .venv.
- Activate it and install:
pip install numpy==1.26.4 scikit-learn==1.3.2
- Create print_versions.py:
import numpy, sklearn
print("NumPy:", numpy.__version__)
print("scikit-learn:", sklearn.__version__)
- Run the script and confirm versions print.
Exercise 2 — Freeze and reproduce in a fresh venv
- In project-a, run:
pip freeze > requirements.txt
- Deactivate, create .venv2, activate it.
- Run:
pip install -r requirements.txt
- Re-run print_versions.py and confirm same versions.
Checklist — I did this
- I created and activated a venv without errors
- I installed exact package versions
- I froze dependencies to requirements.txt
- I rebuilt the environment successfully
- My reproduced versions matched exactly
Common mistakes and self-check
Using system Python for everything
Risk: breaking system tools or mixing dependencies. Fix: always create a venv per project and activate before installing.
Not pinning versions
Risk: silent upgrades change results. Fix: pin with == and commit requirements.txt; rebuild with -r.
Mixing environments
Symptom: python uses a different interpreter than pip. Self-check: run which python and which pip (or where on Windows). They should both point inside your venv.
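A quick self-check, assuming your venv lives at .venv inside the project:
which python  # expect .../project/.venv/bin/python
which pip     # expect .../project/.venv/bin/pip
python -m pip --version  # also reports which interpreter pip belongs to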
Forgetting Jupyter kernel binding
Symptom: notebook imports differ from terminal. Fix: install ipykernel in the venv and select the correct kernel.
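To verify from inside a notebook cell, print the interpreter the kernel is running; it should point into your venv:
import sys
print(sys.executable)  # expect a path ending in .venv/bin/python (or .venv\Scripts\python.exe)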
Upgrading everything at once
Risk: large surface area for breakage. Fix: use constraints.txt and upgrade one package at a time with tests.
Practical projects
- Reproducible notebook: Train a small classifier on Iris, freeze requirements, and rebuild environment on a second venv to verify identical accuracy.
- Controlled upgrade: Start with a working pandas-based feature pipeline, then upgrade pandas using constraints.txt and ensure unit tests still pass.
- Team handoff: Package a minimal inference script and requirements.txt that a teammate can run in a fresh venv to replicate your predictions.
Next steps
- Automate environment setup with a simple setup script (create venv, install -r); see the sketch after this list
- Add pre-commit checks to prevent accidental unpinned installs
- Learn packaging basics (pyproject.toml) when turning code into reusable modules
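A minimal setup-script sketch for the first item above (the filename setup.sh and the .venv location are assumptions; macOS/Linux):
#!/usr/bin/env bash
# setup.sh: create a venv and install pinned dependencies
set -euo pipefail
python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
pip install -r requirements.txt
python -c "import sys; print('Environment ready:', sys.executable)"
Run it with: bash setup.sh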
Mini challenge
Create a tiny ML project that trains LogisticRegression on Iris. Pin versions, save requirements.txt, and write a short README with 3 commands: create venv, install -r, run script. Ask a peer to reproduce the same accuracy (±1e-6). If results differ, investigate differences in Python version, BLAS, or package pins.
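A possible starting point for the training script (train.py is a hypothetical name; fixing random_state is what makes the accuracy reproducible across machines):
# train.py: train LogisticRegression on Iris and print test accuracy
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)
model = LogisticRegression(max_iter=200, random_state=42)
model.fit(X_train, y_train)
print("Accuracy:", accuracy_score(y_test, model.predict(X_test)))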