
Dockerfiles For Training And Serving

Learn Dockerfiles For Training And Serving for free with explanations, exercises, and a quick test (for MLOps Engineers).

Published: January 4, 2026 | Updated: January 4, 2026

Why this matters

MLOps Engineers frequently containerize two critical workloads: model training (often GPU-accelerated, data-heavy) and model serving (fast, secure, and lightweight). Well-structured Dockerfiles make builds faster, images smaller, deployments repeatable, and incidents easier to debug. In real projects you will:

  • Package training jobs with pinned dependencies and reproducible environments.
  • Ship serving images that start fast, are secure (non-root), and expose the right ports.
  • Use caching and multi-stage builds to keep images small and CI builds fast.
  • Handle GPUs, model artifacts, and configuration cleanly across environments.

Who this is for

Engineers and practitioners who need reliable containers for ML training and inference—especially those integrating with CI/CD, orchestration, and registries.

Prerequisites

  • Basic Docker commands (build, run, push, tag).
  • Comfortable with Python project structure.
  • Familiarity with training scripts and simple web servers (FastAPI/Uvicorn or Flask/Gunicorn).

Concept explained simply

A Dockerfile is a recipe for your runtime environment. For training, it should reproduce your experiment reliably and efficiently. For serving, it should boot a web service with your model as quickly and securely as possible. The main difference: training images emphasize toolchains and data access; serving images emphasize minimal size, startup speed, and security.

Mental model

Think of layers like a stack of cached steps. The lower layers change rarely (base image, system packages), while the top layers change often (your code). Order instructions so that the least changing layers come first and the most changing ones come last. This maximizes cache hits and speeds up builds.
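
For example, here is the same pair of steps in cache-busting and cache-friendly order (a minimal sketch; file names are illustrative):

# Cache-busting: any code change invalidates the pip install layer
COPY . .
RUN pip install --no-cache-dir -r requirements.txt

# Cache-friendly: the pip install layer is reused until requirements.txt changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .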

Key components you will use

  • FROM: choose a slim base; for GPU use an NVIDIA CUDA base.
  • WORKDIR: set a stable working directory.
  • COPY/ADD: copy only what you need; use .dockerignore.
  • RUN: install system deps and Python packages; clean caches.
  • ENV/ARG: pass configuration and build-time values.
  • USER: run as non-root for security.
  • EXPOSE: document service port (serving).
  • CMD vs ENTRYPOINT: use CMD for defaults users can override; ENTRYPOINT for required commands (see the sketch after this list).
  • Multi-stage builds: build in a heavy stage, copy into a slim final stage.
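
How ENTRYPOINT and CMD combine, as a minimal sketch (the script name is illustrative):

# ENTRYPOINT always runs; CMD supplies default arguments users can override
ENTRYPOINT ["python", "train.py"]
CMD ["--epochs", "1"]

With this pair, docker run image runs python train.py --epochs 1, while docker run image --epochs 5 overrides only the arguments.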

Worked examples

Example 1 — Training image (CPU) with caching
# Dockerfile.train
FROM python:3.11-slim

ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1

# System deps first (rarely change)
RUN apt-get update \
    && apt-get install -y --no-install-recommends build-essential \
    && rm -rf /var/lib/apt/lists/*

# Requirements next for better caching
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Then copy source (changes frequently)
COPY . .

# Create non-root user
RUN useradd -m appuser && chown -R appuser:appuser /app
USER appuser

# Default command: one-epoch demo training
CMD ["python", "train.py", "--epochs", "1", "--output", "/outputs/model.bin"]

Notes: order improves cache; non-root user boosts security; outputs are written to a mounted volume like /outputs.

Example 2 — Serving image (FastAPI + Uvicorn/Gunicorn)
# Dockerfile.serve
FROM python:3.11-slim

ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    PORT=8080 \
    MODEL_DIR=/models

RUN apt-get update \
    && apt-get install -y --no-install-recommends build-essential \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app
COPY requirements-serve.txt requirements.txt
RUN pip install --no-cache-dir -r requirements.txt

# Copy only serving code, not training extras
COPY api/ ./api/

# Add user
RUN useradd -m appuser && chown -R appuser:appuser /app
USER appuser

EXPOSE 8080

# Use gunicorn with uvicorn workers for production
CMD ["gunicorn", "api.main:app", "-k", "uvicorn.workers.UvicornWorker", "-b", "0.0.0.0:8080", "--workers", "2", "--timeout", "60"]

Notes: model files are provided at runtime via a volume mounted at MODEL_DIR, and the Gunicorn-with-Uvicorn-workers command is production-ready. build-essential is only needed when dependencies compile native extensions; Example 3 shows how to keep it out of the final image.

Example 3 — Multi-stage build to keep serving image small
# Stage 1: build wheels for native deps
FROM python:3.11 as build
WORKDIR /build
COPY requirements-serve.txt .
RUN pip wheel --wheel-dir=/wheels -r requirements-serve.txt

# Stage 2: minimal runtime
FROM python:3.11-slim
ENV PYTHONUNBUFFERED=1 PORT=8080 MODEL_DIR=/models
WORKDIR /app
# Install from prebuilt wheels
COPY --from=build /wheels /wheels
RUN pip install --no-cache-dir /wheels/* && rm -rf /wheels
COPY api/ ./api/
RUN useradd -m appuser && chown -R appuser:appuser /app
USER appuser
EXPOSE 8080
CMD ["gunicorn", "api.main:app", "-k", "uvicorn.workers.UvicornWorker", "-b", "0.0.0.0:8080"]

Notes: Heavy builds happen in the first stage; the final image is slim and fast to pull.
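
To verify the savings, build both variants and compare sizes (the tags and the Dockerfile.multistage name are illustrative):

docker build -f Dockerfile.serve -t ds-serve:single .
docker build -f Dockerfile.multistage -t ds-serve:multi .
docker images ds-serve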

GPU notes

For GPU training or serving, base your image on an NVIDIA CUDA runtime (for example, nvidia/cuda:12.1.0-runtime-ubuntu22.04) that matches your CUDA/cuDNN requirements. At runtime, enable the GPU device using your container runtime's GPU support. Keep CUDA/cuDNN versions aligned with your ML framework to avoid runtime errors.
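
A minimal GPU training sketch, assuming the CUDA 12.1 runtime base (the requirements file is expected to pin a framework build compatible with CUDA 12.1; verify against your framework's install matrix):

# Dockerfile.train-gpu
FROM nvidia/cuda:12.1.0-runtime-ubuntu22.04

ENV PYTHONUNBUFFERED=1

# CUDA base images do not ship Python; install it first
RUN apt-get update \
    && apt-get install -y --no-install-recommends python3 python3-pip \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app
COPY requirements.txt .
RUN pip3 install --no-cache-dir -r requirements.txt

COPY . .
CMD ["python3", "train.py"]

Run it with GPU access enabled, for example docker run --gpus all ds-train-gpu:latest (requires the NVIDIA Container Toolkit on the host).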

Security and size optimizations

  • Use slim bases and remove build tools if not needed at runtime.
  • Combine apt-get commands into a single RUN to reduce layers, and clean apt lists to reduce size.
  • Pin package versions for reproducibility.
  • Run as non-root (USER) and limit filesystem permissions; avoid writing to the app directory.
  • Use .dockerignore to exclude .git, data, models, and local caches.
  • Keep model weights out of the image; mount at runtime or pull on start.
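
A starting-point .dockerignore for the layout used in this guide (adjust to your repo):

# .dockerignore
.git/
__pycache__/
*.pyc
.venv/
data/
models/
outputs/
.env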

Exercises

Hands-on tasks that mirror real MLOps work. Build locally and run the container to see the expected output.

Exercise 1 — Training Dockerfile (CPU)

Goal: Build a training image that runs a simple script and writes a model file to /outputs.

Instructions
  1. Create files: requirements.txt (can be empty), train.py (simple script below), and Dockerfile.train.
  2. Train script (save as train.py):
import os, time, argparse
parser = argparse.ArgumentParser()
parser.add_argument('--epochs', type=int, default=1)
parser.add_argument('--output', type=str, default='/outputs/model.bin')
args = parser.parse_args()
print(f"Starting training for {args.epochs} epoch(s)...")
for e in range(args.epochs):
    time.sleep(1)
    print(f"Epoch {e+1} done")
os.makedirs(os.path.dirname(args.output), exist_ok=True)
with open(args.output, 'wb') as f:
    f.write(b'FAKE_MODEL')
print('Training complete; saved model to', args.output)
  3. Write Dockerfile.train using python:3.11-slim, non-root user, and default CMD to run train.py.
  4. Build: docker build -f Dockerfile.train -t ds-train:latest .
  5. Run: docker run --rm -v $(pwd)/outputs:/outputs ds-train:latest

Expected output: Training complete; saved model to /outputs/model.bin

Exercise 2 — Serving Dockerfile (FastAPI)

Goal: Build a serving image that exposes a /health endpoint and reads model path from MODEL_DIR.

Instructions
  1. Create structure: api/main.py, requirements-serve.txt, Dockerfile.serve.
  2. requirements-serve.txt contents:
fastapi==0.110.0
uvicorn==0.25.0
gunicorn==21.2.0
  3. api/main.py contents:
import os
from fastapi import FastAPI
app = FastAPI()
MODEL_DIR = os.getenv('MODEL_DIR', '/models')
@app.get('/health')
def health():
    present = os.path.isdir(MODEL_DIR)
    return {'status':'ok','model_dir':MODEL_DIR,'present':present}
  4. Write Dockerfile.serve (similar to Example 2) with EXPOSE 8080 and non-root user.
  5. Build: docker build -f Dockerfile.serve -t ds-serve:latest .
  6. Run: docker run --rm -p 8080:8080 -e MODEL_DIR=/models -v $(pwd)/models:/models ds-serve:latest
  7. Test (from host): curl http://localhost:8080/health

Expected output: {"status":"ok","model_dir":"/models","present":true}

Checklist before you build

  • [ ] .dockerignore excludes .git, data/, outputs/, models/, __pycache__/
  • [ ] Use python:3.11-slim (or similar) and clean package caches
  • [ ] Non-root USER is set
  • [ ] Requirements are copied and installed before app code for caching
  • [ ] Training writes artifacts to a mounted volume, not the image
  • [ ] Serving reads MODEL_DIR from env and exposes correct port

Common mistakes and self-check

Mistake: baking large datasets or model weights into the image

Impact: Huge images and long pulls. Fix: Mount datasets/models at runtime or download on startup.
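
A sketch of the download-on-startup pattern (entrypoint.sh, MODEL_URL, and the model file name are illustrative; assumes curl is installed in the image):

#!/bin/sh
# entrypoint.sh: fetch the model once if the (usually empty) MODEL_DIR lacks it
set -e
MODEL_DIR="${MODEL_DIR:-/models}"
if [ ! -f "$MODEL_DIR/model.bin" ]; then
    echo "Model missing; downloading..."
    mkdir -p "$MODEL_DIR"
    curl -fsSL "$MODEL_URL" -o "$MODEL_DIR/model.bin"
fi
exec "$@"

Wire it in with ENTRYPOINT ["./entrypoint.sh"] and keep the server start command in CMD.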

Mistake: placing COPY . . before installing requirements

Impact: Cache invalidation on every code change. Fix: COPY requirements first, install, then copy the rest.

Mistake: running as root

Impact: Security risk. Fix: Create a user and switch with USER before CMD.

Mistake: missing .dockerignore

Impact: Slow builds and leaked secrets. Fix: Add .dockerignore with common patterns.

Mistake: using CMD for mandatory startup behavior

Impact: Easy to override accidentally. Fix: Use ENTRYPOINT for required commands; CMD for defaults.

Self-check: Can you rebuild quickly after a code-only change? Are images under a few hundred MB for serving? Does your container start as non-root and still work?

Practical projects

  • Create a training image that logs metrics to stdout and writes a model to /outputs; wire it into a minimal CI build.
  • Build a serving image that loads the latest model from a mounted volume and provides /predict and /health endpoints.
  • Refactor both into multi-stage builds and measure image size reduction and build time improvements.

Learning path

  • Docker basics: images, containers, volumes, networks
  • Writing efficient Dockerfiles: caching, .dockerignore, non-root
  • Training images: reproducibility, artifact outputs, GPU variants
  • Serving images: lightweight bases, ports, start commands
  • Multi-stage builds and dependency wheels
  • Compose/Kubernetes runtime configs (env vars, secrets, volumes)
  • Image registries and CI/CD integration

Next steps

  • Add health and readiness endpoints to your serving app (see the sketch after this list).
  • Introduce GPU support for your training image if needed.
  • Pin all dependency versions and record them at build time for reproducibility.
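
Extending the Exercise 2 app with a readiness probe is one way to start (a minimal sketch; the model file name is an assumption):

# api/main.py
import os
from fastapi import FastAPI, Response

app = FastAPI()
MODEL_DIR = os.getenv('MODEL_DIR', '/models')

@app.get('/health')
def health():
    # Liveness: the process is up and answering requests
    return {'status': 'ok'}

@app.get('/ready')
def ready(response: Response):
    # Readiness: refuse traffic until the model artifact exists
    present = os.path.isfile(os.path.join(MODEL_DIR, 'model.bin'))
    if not present:
        response.status_code = 503
    return {'ready': present}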

Mini challenge

Create a repo with two Dockerfiles (train and serve) and a docker-compose.yml that:

  • Runs training to write a model into a shared volume.
  • Starts the serving service mounting the same volume.
  • Lets you curl /health and see present=true once the model exists.
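
One possible docker-compose.yml sketch (service and volume names are illustrative):

# docker-compose.yml
services:
  train:
    build:
      context: .
      dockerfile: Dockerfile.train
    volumes:
      - model-store:/outputs
  serve:
    build:
      context: .
      dockerfile: Dockerfile.serve
    environment:
      MODEL_DIR: /models
    ports:
      - "8080:8080"
    volumes:
      - model-store:/models
    depends_on:
      - train
volumes:
  model-store:

Run docker compose up --build; once the train service writes model.bin into the shared volume, curl http://localhost:8080/health should report present=true.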

Ready to check yourself?

Take the Quick Test below. It’s available to everyone; only logged-in users get saved progress.

Dockerfiles For Training And Serving — Quick Test

Test your knowledge with 10 questions. Pass with 70% or higher.

