Mental model
Think of your project as a graph:
- Nodes: code commits, dataset snapshots, trained models, metrics.
- Edges: how each model was created (code + data + params) and evaluated (metrics).
Versioning adds stable labels (like tags) to important nodes, plus metadata that records edges (lineage).
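A minimal sketch of that graph as plain Python dictionaries (the node IDs reuse the worked examples later in this section; the eval run ID is made up for illustration, and nothing here assumes a particular tool):

```python
# Illustrative lineage graph: nodes are versioned things,
# edges record how one node was produced from others.
nodes = {
    "code:9f3a1c4":  {"type": "code",    "label": "git commit"},
    "data:DS-1.1.1": {"type": "dataset", "label": "dataset snapshot"},
    "cfg:C-23c":     {"type": "config",  "label": "training config"},
    "model:v1.5.0":  {"type": "model",   "label": "trained detector"},
    "eval:run-042":  {"type": "metrics", "label": "evaluation on the test split"},
}

edges = [
    # (source, relation, target) -- the lineage of each model and metric
    ("code:9f3a1c4",  "trained",       "model:v1.5.0"),
    ("data:DS-1.1.1", "trained",       "model:v1.5.0"),
    ("cfg:C-23c",     "parameterized", "model:v1.5.0"),
    ("model:v1.5.0",  "evaluated_as",  "eval:run-042"),
]

def lineage_of(node_id: str) -> list[str]:
    """Return every node that directly contributed to `node_id`."""
    return [src for src, _, dst in edges if dst == node_id]

print(lineage_of("model:v1.5.0"))
# ['code:9f3a1c4', 'data:DS-1.1.1', 'cfg:C-23c']
```

Stable labels are the keys; lineage is just the set of incoming edges for a model or metric node.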
Core building blocks
- Immutable snapshots: do not change old versions. Create a new one with a new ID.
- Unique IDs: use commit hashes for code, content hashes or snapshot IDs for data, and a model version like MAJOR.MINOR.PATCH.
- Manifests: text files that list exact file hashes/paths for a dataset version and key metadata (e.g., class map, split seeds); see the hashing sketch after this list.
- Configs: store parameters in versioned files (e.g., YAML/JSON) and reference them in training logs.
- Artifacts: package trained weights, label maps, pre/post-processing code, and metrics together.
- Lineage: automatically record which data + code + params produced each model.
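A minimal sketch of content hashing and manifest building using only the Python standard library. The field names and the data/raw and data/manifests paths are illustrative, and the target directories are assumed to exist:

```python
import hashlib
import json
import pathlib

def sha256_of(path: pathlib.Path) -> str:
    """Content hash of a file: same bytes -> same ID, any edit -> new ID."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def build_manifest(snapshot_id: str, data_dir: str,
                   class_map: list[str], split_seed: int) -> dict:
    """Collect per-file hashes plus key metadata into one manifest dict."""
    files = sorted(pathlib.Path(data_dir).rglob("*"))
    return {
        "snapshot_id": snapshot_id,
        "class_map": class_map,
        "split_seed": split_seed,
        "files": {str(p): sha256_of(p) for p in files if p.is_file()},
    }

manifest = build_manifest("DS-1.0.0", "data/raw",
                          ["person", "cart", "shelf"], split_seed=42)
pathlib.Path("data/manifests/DS-1.0.0.json").write_text(json.dumps(manifest, indent=2))
```

Once written, treat the manifest file as read-only; a new snapshot gets a new manifest with a new ID.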
What to version in vision projects
- Raw images and videos (or pointers to them).
- Annotations: boxes, masks, keypoints, and their schema.
- Data splits: train/val/test indexes and seeds (see the split sketch after this list).
- Preprocessing: resizing, normalization, augmentations.
- Training configs and random seeds.
- Trained artifacts: model weights, label encoder, thresholds.
- Evaluation outputs: metrics and per-class breakdowns.
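For the splits item above, a seeded shuffle is enough to make train/val/test reproducible. A minimal sketch using only the standard library, with the 70/20/10 ratios from the worked example below:

```python
import random

def split_indices(n_items: int, seed: int = 42, ratios=(0.7, 0.2, 0.1)) -> dict:
    """Deterministic train/val/test split: same n_items + seed -> same indexes."""
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)  # seeded RNG, independent of global state
    n_train = int(ratios[0] * n_items)
    n_val = int(ratios[1] * n_items)
    return {
        "seed": seed,
        "train": sorted(idx[:n_train]),
        "val": sorted(idx[n_train:n_train + n_val]),
        "test": sorted(idx[n_train + n_val:]),
    }

splits = split_indices(n_items=10_000, seed=42)
# Store both the seed and the exact indexes alongside the dataset manifest.
```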
Worked examples
Example 1: Dataset versioning for object detection
- Create a manifest for DS-1.0.0 with:
- images/ and labels/ file hashes
- classes: [person, cart, shelf]
- split seed: 42, train: 70%, val: 20%, test: 10%
- notes: initial retail dataset from 5 stores
- Update labels and add new images; produce DS-1.1.0 (minor: backward compatible changes).
- Discover and fix 80 mislabeled boxes; create DS-1.1.1 (patch: fixes only).
Now you can train and compare models on DS-1.0.0 vs DS-1.1.1 to see the effect of label quality.
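To see exactly what changed between two snapshots (e.g., DS-1.0.0 vs DS-1.1.1), you can diff their manifests. A minimal sketch that assumes manifests shaped like the earlier hashing example:

```python
import json
import pathlib

def diff_manifests(old_path: str, new_path: str) -> dict:
    """Compare two dataset manifests by file hash: added, removed, modified."""
    old = json.loads(pathlib.Path(old_path).read_text())["files"]
    new = json.loads(pathlib.Path(new_path).read_text())["files"]
    return {
        "added":    sorted(set(new) - set(old)),
        "removed":  sorted(set(old) - set(new)),
        "modified": sorted(p for p in set(old) & set(new) if old[p] != new[p]),
    }

changes = diff_manifests("data/manifests/DS-1.0.0.json",
                         "data/manifests/DS-1.1.1.json")
print(len(changes["modified"]), "files changed between DS-1.0.0 and DS-1.1.1")
```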
Example 2: Model versioning with semantic versions
- v1.4.2: trained on DS-1.1.0 with config C-23b; AP50=0.71.
- v1.5.0: same architecture, retrained on DS-1.1.1 (label fixes); AP50=0.75.
Rule of thumb:
- MAJOR (X.y.z): architecture or output format changes.
- MINOR (x.Y.z): architecture same but behavior improved (e.g., better data or hyperparams).
- PATCH (x.y.Z): bug fixes; no expected accuracy shift.
Release note for v1.5.0 should include dataset version, commit hash, config ID, metrics, and known caveats.
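One way to keep that release note machine-readable is to store it as a small file next to the artifacts. A sketch with illustrative field names and paths; the commit and config IDs are borrowed from Example 3 for illustration and are assumptions here:

```python
import json
import pathlib

release_note = {
    "model_version": "v1.5.0",
    "dataset": "DS-1.1.1",
    "code_commit": "9f3a1c4",   # reused from Example 3 for illustration
    "config_id": "C-23c",       # assumed ID; record whichever config you actually used
    "metrics": {"AP50": 0.75},
    "caveats": [],              # list known failure modes, e.g. underrepresented conditions
}

path = pathlib.Path("models/v1.5.0/RELEASE.json")
path.parent.mkdir(parents=True, exist_ok=True)
path.write_text(json.dumps(release_note, indent=2))
```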
Example 3: Reproducible training run
- Pin code to commit 9f3a1c4.
- Pin dataset to DS-1.1.1 and splits to seed 42.
- Pin config to C-23c with lr=0.001, epochs=50, seed=2024.
- Train and record:
  - run_id=2024-08-15T10-30Z
  - code=9f3a1c4
  - data=DS-1.1.1
  - config=C-23c
  - metrics: AP50=0.752, AP75=0.612, latency=23ms
  - artifacts: model.pt (sha256:...), labelmap.json (sha256:...)
- Anyone can reconstruct the run by checking out the commit, pulling dataset DS-1.1.1, and using config C-23c.
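A small logger that produces records like the one above could look like this; it assumes the run starts inside a Git checkout, and the artifact paths are illustrative:

```python
import datetime
import hashlib
import json
import subprocess

def sha256_of(path: str) -> str:
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def log_run(data_snapshot: str, config_id: str, metrics: dict,
            artifacts: list[str], out: str = "runs.jsonl") -> None:
    """Append one fully pinned run record: code + data + config + metrics + artifact hashes."""
    record = {
        "run_id": datetime.datetime.now(datetime.timezone.utc).strftime("%Y-%m-%dT%H-%MZ"),
        "code": subprocess.check_output(["git", "rev-parse", "--short", "HEAD"]).decode().strip(),
        "data": data_snapshot,
        "config": config_id,
        "metrics": metrics,
        "artifacts": {p: sha256_of(p) for p in artifacts},
    }
    with open(out, "a") as f:
        f.write(json.dumps(record) + "\n")

log_run("DS-1.1.1", "C-23c",
        {"AP50": 0.752, "AP75": 0.612},
        ["models/model.pt", "models/labelmap.json"])
```

Appending to a JSON-lines file keeps every run record immutable and easy to grep or load later.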
Choosing tools (vendor-neutral)
- Data snapshots and manifests: use any content-addressed storage or dataset tool that can store hashes and metadata.
- Model registry: any system that stores artifacts, metadata, and version tags.
- Experiment tracking: any tool that logs code commit, data version, params, metrics, and artifacts.
- For small teams: Git + large-file storage + a simple manifest can be enough to start.
Step-by-step: set up versioning in a new vision repo
- Create repo structure:
  data/
    raw/
    labels/
    manifests/
  configs/
  models/
  src/
- Define dataset manifest fields: snapshot_id, class_map, file list with hashes, split seed, notes.
- Create DS-0.1.0 manifest and freeze it (read-only).
- Add a training config C-01 (lr, batch, image size, augmentations, seed).
- Write a small script that logs: code commit, data snapshot_id, config ID, metrics, and artifact hashes.
- Train a baseline model; tag it v0.1.0 and store artifacts + metrics.
- Change only one variable at a time (e.g., dataset from DS-0.1.0 to DS-0.2.0) to keep comparisons clear.
- Automate: add a pre-train check that refuses to start if code/data/config are unpinned.
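A minimal preflight gate for that last step. The manifest and config paths follow the repo layout above, and the .yaml extension for configs is an assumption:

```python
import pathlib
import subprocess
import sys

def preflight(snapshot_id: str, config_id: str) -> None:
    """Refuse to start training unless code, data, and config are all pinned."""
    errors = []

    # 1. Code: working tree must be clean (no uncommitted or untracked changes).
    if subprocess.check_output(["git", "status", "--porcelain"]).strip():
        errors.append("Git working tree is dirty; commit or stash changes first.")

    # 2. Data: a frozen manifest for the requested snapshot must exist.
    if not pathlib.Path(f"data/manifests/{snapshot_id}.json").exists():
        errors.append(f"No manifest found for dataset snapshot {snapshot_id}.")

    # 3. Config: the referenced config file must be present in the repo.
    if not pathlib.Path(f"configs/{config_id}.yaml").exists():
        errors.append(f"Config {config_id} not found under configs/.")

    if errors:
        sys.exit("Preflight failed:\n  - " + "\n  - ".join(errors))

preflight(snapshot_id="DS-0.1.0", config_id="C-01")
```

Run it as the first step of every training command so an unpinned run fails fast instead of producing an untraceable model.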
Quick checklist before every training run
- Code is on a clean commit (no uncommitted or untracked changes).
- Dataset snapshot_id is selected and manifests exist.
- Config file is committed and referenced by ID.
- Random seeds fixed for training and splits (see the seeding sketch after this checklist).
- Expected output path and model version are reserved.
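A common way to fix seeds across the usual RNG sources. This sketch assumes NumPy is installed; the PyTorch calls are optional and only run if torch is available, and full GPU determinism may need additional framework settings:

```python
import random

import numpy as np

def fix_seeds(seed: int = 2024) -> None:
    """Seed the common RNG sources so a run is repeatable."""
    random.seed(seed)
    np.random.seed(seed)
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
    except ImportError:
        pass  # PyTorch not installed; framework-specific seeding is skipped

fix_seeds(2024)  # same seed as config C-23c in the worked example
```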
Exercises
Exercise 1 (ex1): Design a versioning plan
Create a versioning plan for a semantic segmentation project with 3 classes. Include dataset versioning, model versioning rules, and what metadata you will store per version.
What to hand in
- Dataset version scheme and first two versions.
- Model version scheme and first baseline version.
- A short manifest example with key fields.
Exercise 2 (ex2): Reproduce a model from tags
Given: code commit a1b2c3d, dataset DS-1.2.0, config C-17, expected AP50=0.68. Describe steps to reproduce and how you will verify that your result matches.
Self-check checklist
- Did you pin code to a commit?
- Did you specify exact dataset snapshot and split seed?
- Did you use the same config and random seed?
- Did you verify artifact hashes and metrics tolerances?
Common mistakes and how to self-check
- Overwriting datasets in place. Fix: make a new snapshot ID; never edit old versions.
- Untracked config tweaks. Fix: keep configs in version control and reference by ID.
- Missing split reproducibility. Fix: log split seed and exact indexes.
- Artifacts without metadata. Fix: bundle weights with label map, pre/post-processing, and versions.
- Comparing apples to oranges. Fix: change one factor at a time and record it.
Self-audit in 60 seconds
- Can you rebuild last week’s best model without guessing?
- Can you list which dataset snapshot and config produced your current production model?
- If a label is fixed today, will it create a new dataset version?
Practical projects
- Build a small object detection pipeline with two dataset snapshots and compare models across them using a consistent registry.
- Create a labeling correction workflow that produces patch dataset versions and a changelog.
- Automate a training preflight script that blocks runs unless code/data/config are pinned.
Who this is for
- Computer Vision Engineers bringing models to production.
- Data Scientists who need reproducible experiments and audits.
- MLOps/ML Engineers setting up pipelines and registries.
Prerequisites
- Basic Git usage and branching.
- Familiarity with model training loops and configs.
- Understanding of your task type (classification, detection, segmentation).
Learning path
- Start with immutable dataset snapshots and manifests.
- Add experiment tracking for code, data, config, metrics, and artifacts.
- Introduce a model registry with semantic versioning.
- Automate preflight checks and CI validations.
Next steps
- Run the exercises to design and test your plan.
- Take the quick test to validate understanding.
- Apply versioning to your current vision project this week.
Mini challenge
Your production detector regressed on night-time images after a hotfix. Outline the 5 fastest steps to identify whether code, data, or parameters changed—and how you would roll back safely with versioned artifacts.