
Packaging And Publishing Artifacts

Learn Packaging And Publishing Artifacts for free with explanations, exercises, and a quick test (for Machine Learning Engineers).

Published: January 1, 2026 | Updated: January 1, 2026

Why this matters

As a Machine Learning Engineer, your models, code, and data transformations must move reliably from development to production. Packaging and publishing artifacts lets teams:

  • Reproduce exact training and inference environments
  • Promote tested builds across dev → staging → production
  • Roll back safely when something breaks
  • Share components (models, features, pipelines) across teams

Typical on-the-job tasks:

  • Build a Python wheel for a feature library and publish it to an internal package index
  • Bundle a trained model with metadata and checksums, then push to a model registry or artifact repository
  • Create a Docker image for inference and tag it with version and git SHA
  • Automate promotion of signed, scanned artifacts through environments

Note on progress

You can take the quick test without logging in. Only logged-in users will have their progress saved.

Who this is for

  • Machine Learning Engineers and Data Scientists moving from notebooks to production
  • DevOps/Platform Engineers supporting ML services
  • Anyone building reproducible ML pipelines

Prerequisites

  • Basic Python packaging (setup.cfg/pyproject.toml) and virtual environments
  • Familiarity with Docker images and tags
  • Comfort with Git and semantic versioning (e.g., 1.4.2)
  • Knowing what an artifact repository or container registry is

Concept explained simply

An artifact is a packaged, versioned output you can store and reuse: a Python wheel, a Docker image, a model file (.pt/.pkl/.onnx), or a dataset snapshot. In CI/CD, you build artifacts once, test them, sign/scan them, then publish them to a repository. Deployments pull exactly those versions so what you tested is what you run.

Mental model

Think of artifacts as sealed boxes with labels:

  • The box: a wheel, a container image, or a model bundle
  • Labels: version, commit SHA, build time, metadata (framework, metrics)
  • Seals: checksum/signature to ensure integrity
  • Warehouse: artifact repository or model registry

CI builds the box, applies labels and seals, and puts it in the warehouse. Deployments only take boxes from the warehouse, not from someone’s laptop.

Core principles

  • Immutability: once published, an artifact with a tag/version never changes
  • Determinism: the same source and config should produce the same artifact
  • Traceability: every artifact links to its Git commit, build logs, and tests
  • Promotion: move the exact artifact across environments
  • Security: scan, sign, and verify before releasing
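
One way to make traceability and immutability concrete is to embed the commit, version, and build time in both the image tag and its labels. A minimal sketch using standard OCI label keys (the registry path, image name, and version are placeholders):

# Tag with version + short commit, and record provenance as OCI labels
GIT_SHA=$(git rev-parse HEAD)
IMAGE="registry.example.com/ml/example-app:1.5.0-${GIT_SHA:0:7}"
docker build \
  --label org.opencontainers.image.revision="$GIT_SHA" \
  --label org.opencontainers.image.version="1.5.0" \
  --label org.opencontainers.image.created="$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
  -t "$IMAGE" .
# Later, trace any copy of the image back to its commit by reading the label
docker inspect --format '{{ index .Config.Labels "org.opencontainers.image.revision" }}' "$IMAGE"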

Worked examples

Example 1: Package a Python feature library as a wheel

  1. Create pyproject.toml and a src layout
  2. Build wheel and publish to an internal index
# pyproject.toml (minimal sample)
[build-system]
requires = ["setuptools>=68", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "featurekit"
version = "0.1.3"
description = "Reusable feature transforms"
authors = [{name = "Your Team"}]
readme = "README.md"
requires-python = ">=3.9"
dependencies = ["pandas>=2.0", "numpy>=1.24"]
# Build
python -m build --wheel
# Publish (example command; configure your repository URL and token in env)
twine upload --repository-url $PYPI_URL -u $USER -p $TOKEN dist/*

Result: an immutable wheel like featurekit-0.1.3-py3-none-any.whl is available for pipelines to install with pip.
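
Downstream pipelines then install exactly that version from your internal index. A minimal sketch, assuming the index exposes a pip-compatible simple API at $PYPI_INDEX_URL (a placeholder, not necessarily the same URL used for uploading):

# Install the exact published wheel in a training or inference pipeline
pip install --index-url "$PYPI_INDEX_URL" "featurekit==0.1.3"
# Pinning ==0.1.3 guarantees every run resolves to the same immutable artifact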

Example 2: Bundle a trained model with metadata and checksum

  1. Save model and attach metadata in a manifest
  2. Compute a checksum for integrity
  3. Push the bundle to an artifact repository/model registry
# directory structure
model_bundle/
  model.onnx
  manifest.json
  metrics.json
  sha256.txt
# manifest.json (sample)
{
  "name": "churn-model",
  "version": "1.5.0",
  "git_sha": "<commit>",
  "framework": "onnx-1.15",
  "python": "3.10",
  "train_time": "2025-09-14T10:20:00Z",
  "features": ["tenure", "monthly_charges", "contract_type"],
  "intended_use": "batch_inference",
  "notes": "calibrated with temperature scaling"
}
# checksum
(cd model_bundle && sha256sum model.onnx > sha256.txt)

Store model_bundle as a single archive (e.g., churn-model-1.5.0.tgz). Your CI job uploads it to an artifact store. Downstream jobs verify the checksum before deploying.
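
A short script can handle the archiving, upload, and downstream verification. This is a sketch only: the curl upload and $ARTIFACT_STORE_URL stand in for whatever API or CLI your artifact repository provides.

# Archive the bundle as a single versioned file
tar -czf churn-model-1.5.0.tgz model_bundle/
# Upload (placeholder; substitute your artifact store's CLI or API)
curl -fsS -u "$USER:$TOKEN" -T churn-model-1.5.0.tgz \
  "$ARTIFACT_STORE_URL/models/churn-model/1.5.0/churn-model-1.5.0.tgz"
# Downstream: extract and verify the checksum before loading the model
tar -xzf churn-model-1.5.0.tgz
(cd model_bundle && sha256sum -c sha256.txt)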

Example 3: Build and tag an inference Docker image

# Dockerfile (minimal)
FROM python:3.10-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
ENV PORT=8080
CMD ["python", "serve.py"]
# Build and tag with version and commit
VERSION=1.5.0
GIT_SHA=$(git rev-parse --short HEAD)
IMAGE="registry.example.com/ml/churn-infer:${VERSION}-${GIT_SHA}"
docker build -t "$IMAGE" .
# Optional: sign/scan steps go here
# Push
docker push "$IMAGE"

Downstream environments deploy the exact tag ${VERSION}-${GIT_SHA} to guarantee traceability.
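
Before promoting, a quick local smoke test catches obvious packaging mistakes. The sketch below assumes serve.py answers HTTP requests on port 8080; adjust the path and port to your service.

# Run the freshly built image and hit it once
docker run -d --name churn-smoke -p 8080:8080 "$IMAGE"
sleep 3
curl -fsS http://localhost:8080/ && echo "smoke test passed"
docker rm -f churn-smoke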

Example 4: Promote the exact artifact

Promotion should not rebuild. Instead, retag the already-pushed image (or mark the model version) after staging tests pass:

# Retag for production without rebuilding
docker pull "$IMAGE"
docker tag "$IMAGE" "registry.example.com/ml/churn-infer:${VERSION}-prod"
docker push "registry.example.com/ml/churn-infer:${VERSION}-prod"
Example 5: SBOM and provenance (optional)

Generate a software bill of materials (SBOM) and attach it as an artifact. Store build provenance (who built it, when, and from which commit). Both help with compliance audits and with debugging.
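
For example, open-source tools such as Syft (SBOM generation) and Cosign (image signing) can be wired into the publish step; both tools are assumptions here, not requirements of this workflow.

# Generate an SBOM for the image and store it alongside the other build artifacts
syft "$IMAGE" -o spdx-json > sbom.spdx.json
# Sign the image so downstream environments can verify it before deploying
cosign sign --key cosign.key "$IMAGE"
cosign verify --key cosign.pub "$IMAGE"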

Minimum viable artifact pipeline (step-by-step)

  1. Build artifacts once: wheel, model bundle, container image
  2. Attach metadata: version, git SHA, build time, metrics
  3. Verify: run tests; compute checksum; optionally scan/sign
  4. Publish: push to artifact and container registries
  5. Promote: retag or mark versions across environments
  6. Deploy: downstream pulls by exact version/tag only
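
Strung together in a CI job, the whole flow stays short. The sketch below reuses names from the examples above; the test runner (pytest), credentials, and URLs are placeholders for your own setup.

set -euo pipefail
VERSION=1.5.0
GIT_SHA=$(git rev-parse --short HEAD)
IMAGE="registry.example.com/ml/churn-infer:${VERSION}-${GIT_SHA}"

# 1-2. Build once and attach metadata
python -m build --wheel
docker build --label org.opencontainers.image.revision="$GIT_SHA" -t "$IMAGE" .

# 3. Verify: tests and integrity
pytest
(cd model_bundle && sha256sum -c sha256.txt)

# 4. Publish
twine upload --repository-url "$PYPI_URL" -u "$USER" -p "$TOKEN" dist/*
docker push "$IMAGE"

# 5. Promote (after staging passes) by retagging, never rebuilding
docker tag "$IMAGE" "registry.example.com/ml/churn-infer:${VERSION}-prod"
docker push "registry.example.com/ml/churn-infer:${VERSION}-prod"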

Common mistakes and self-check

  • Mistake: Rebuilding during promotion. Fix: Promote by retagging or marking an existing artifact only.
  • Mistake: Floating tags like latest. Fix: Require immutable tags (version+git SHA).
  • Mistake: Missing metadata. Fix: Enforce manifest fields in CI.
  • Mistake: No integrity check. Fix: Store and verify checksums/signatures.
  • Mistake: Hidden dependencies. Fix: Use lock files or pin explicit versions, and include runtime system dependencies in the image.

Self-check:

  • Can you trace a production artifact back to its commit and tests?
  • If staging passes, can you promote to production without rebuilding?
  • Can a teammate reproduce the artifact locally using the manifest?

Exercises

Try these hands-on tasks. The same exercises are listed below with solutions and expected outputs.

Exercise 1: Build and verify a Python wheel

  1. Create a minimal package (src layout) with pyproject.toml
  2. Build the wheel
  3. Install it in a fresh venv to verify import
# expected: a file like dist/featurekit-0.1.0-py3-none-any.whl
Hints
  • Use python -m build --wheel
  • Use python -m venv .venv and pip install dist/*.whl

See the Exercise 1 solution in the Exercises section below.

Exercise 2: Build and tag a Docker image with version+git SHA

  1. Set VERSION and GIT_SHA variables
  2. Build, tag, and run the Docker image locally
  3. List images and confirm the tag includes both
# expected: an image like registry.local/app:0.1.0-a1b2c3d
Hints
  • Use git rev-parse --short HEAD
  • docker build -t <image:tag> . then docker images

See the Exercise 2 solution in the Exercises section below.

Checklist: good artifacts before publish

  • [ ] Version and git SHA embedded in name or labels
  • [ ] Manifest with framework, Python version, and intended use
  • [ ] Tests passed and results archived
  • [ ] Checksum/signature generated and stored
  • [ ] Image/package scanned (if available)
  • [ ] Published to the registry with immutable tags

Practical projects

  • Project A: Create a reusable feature library as a wheel and deploy it in two pipelines (training and inference)
  • Project B: Train a model, bundle with manifest and metrics, and publish; write a small script to verify checksum and load the model
  • Project C: Build an inference image with health endpoint; implement retag-based promotion from staging to production

Learning path

  • Now: Packaging and publishing artifacts (this page)
  • Next: Environment promotion and release strategies (blue/green, canary)
  • Then: Continuous monitoring, rollbacks, and incident response
  • Security: Scanning, signing, and SBOM generation

Next steps

  • Automate the build-and-publish flow in your CI
  • Enforce immutable tags and metadata checks
  • Add integrity checks and optional signing before promotion

Mini challenge

Within your current project, pick one artifact (wheel, model bundle, or Docker image). Add missing metadata, embed git SHA in its tag, generate a checksum, and publish it once. Demonstrate promotion to the next environment without rebuilding.

Quick test

Take the test below to confirm understanding. Anyone can take it; saved progress is available to logged-in users.

Practice Exercises

2 exercises to complete

Exercise 1 instructions

  1. Create a directory featurekit with src/featurekit/__init__.py containing a simple function, e.g., add(a, b)
  2. Add pyproject.toml specifying name=featurekit and version=0.1.0
  3. Run python -m build --wheel to build
  4. Create a fresh virtual environment, install the wheel, and import featurekit to call add(2,3)
# sample src/featurekit/__init__.py
def add(a, b):
    return a + b

Expected Output
A wheel file dist/featurekit-0.1.0-py3-none-any.whl and a successful import with add(2,3) == 5.
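
One way to complete Exercise 1 end to end (a sketch; it assumes the pyproject.toml and src layout described in the instructions):

# Build the wheel
pip install build
python -m build --wheel
# Verify in a fresh virtual environment
python -m venv .venv
. .venv/bin/activate
pip install dist/featurekit-0.1.0-py3-none-any.whl
python -c "import featurekit; print(featurekit.add(2, 3))"   # prints 5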

Packaging And Publishing Artifacts — Quick Test

Test your knowledge with 10 questions. Pass with 70% or higher.

