Who this is for
Machine Learning Engineers and Data Scientists moving code from notebooks to production. If you hand code to others (data platform, backend, DevOps) or deploy batch/online ML, this is for you.
Prerequisites
- Comfortable with Python basics (functions, modules, virtual environments).
- Basic familiarity with pandas, scikit-learn, and reading/writing files.
- Know how to run a script from the command line.
Why this matters
- Fewer bugs and outages: clear names, types, and tests make failures obvious.
- Faster reviews: consistent style means reviewers focus on logic, not formatting.
- Reproducibility: deterministic, well-logged code is easier to rerun and debug.
- Handoffs: platform and backend teams can integrate your code without guessing intent.
Real tasks where this shows up
- Turning a notebook feature engineering block into a reusable module used by training and inference.
- Writing a batch inference script that ops runs nightly with logs and clear exit codes.
- Fixing a production bug quickly because logs, types, and small functions isolate the issue.
Concept explained simply
Production code style is the set of readable, repeatable rules that make your Python code predictable for humans and safe for systems. Think of it as traffic rules for your project: consistent lanes (formatting), clear signs (names, docstrings), seatbelts (types, tests), and a dashboard (logging).
Mental model: Code should be easy to read first, then easy to change, and finally easy to run. If someone new can guess what a function does without running it, you’re doing it right.
Core guidelines to follow
- Structure: imports at top, one purpose per module, short functions. Keep data IO and business logic separated.
- Naming: descriptive and consistent. Use lower_snake_case for functions/variables, UpperCamelCase for classes, UPPER_SNAKE_CASE for constants.
- Docstrings: explain intent, inputs, outputs, and errors. Keep them close to the code and stick to one style (Google or NumPy) consistently.
- Type hints: annotate function boundaries and tricky variables. They communicate contracts and prevent common mistakes.
- Logging: use logging instead of print. Include context (counts, paths, parameters) and choose sensible levels: DEBUG, INFO, WARNING, ERROR.
- Errors: fail fast with clear exceptions. Validate inputs at boundaries (see the short sketch after this list).
- Tooling: auto-format (e.g., Black) and lint (e.g., Ruff). Keep imports ordered and unused code out.
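The worked examples below apply these rules end to end. As a quick first taste, here is a minimal sketch combining a module-level logger, a named constant, fail-fast validation, and type hints; the function name, column name, and threshold are invented for illustration:

import logging

import pandas as pd

logger = logging.getLogger(__name__)  # module-level logger instead of print

MIN_ROWS = 1  # named constant instead of a magic value


def drop_missing_amounts(df: pd.DataFrame, *, column: str = "amount") -> pd.DataFrame:
    """Return a copy of df without rows where `column` is missing."""
    if column not in df.columns:  # validate at the boundary, fail fast
        raise KeyError(f"Missing required column: {column!r}")
    out = df.dropna(subset=[column]).copy()
    if len(out) < MIN_ROWS:
        logger.warning("No rows left after dropping missing %r values", column)
    return out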
Worked examples
Example 1 — Refactor a messy feature function
Before:
def prep(df):
    df=df.copy()
    df["age"]=df["age"].fillna(0)
    df['country']=df['country'].fillna('UNK')
    from sklearn.preprocessing import StandardScaler
    s=StandardScaler()
    df['x_norm'] = s.fit_transform(df[['x']])
    return df
After (production-style):
from __future__ import annotations

import logging

import pandas as pd
from sklearn.preprocessing import StandardScaler

logger = logging.getLogger(__name__)


def prepare_features(
    df: pd.DataFrame,
    *,
    scaler: StandardScaler | None = None,
) -> tuple[pd.DataFrame, StandardScaler]:
    """Clean and transform features for model input.

    Args:
        df: DataFrame with columns ['age', 'country', 'x'].
        scaler: Optional fitted StandardScaler to reuse.

    Returns:
        (df_out, scaler): df_out includes 'x_norm'.

    Raises:
        KeyError: if required columns are missing.
    """
    required = {"age", "country", "x"}
    missing = required.difference(df.columns)
    if missing:
        raise KeyError(f"Missing required columns: {sorted(missing)}")

    out = df.copy()
    out["age"] = out["age"].fillna(0)
    out["country"] = out["country"].fillna("UNK")

    sc = scaler or StandardScaler()
    out["x_norm"] = (
        sc.fit_transform(out[["x"]]) if scaler is None else sc.transform(out[["x"]])
    )
    logger.debug("Prepared %d rows", len(out))
    return out, sc
- Imports at top
- Type hints and docstring
- Input validation and logging
- Pure transformation (no prints, no hidden globals)
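A quick usage sketch (train_df and score_df are hypothetical DataFrames with the required columns). Returning the fitted scaler is what lets training and inference share the exact same transformation:

# Fit the scaler on training data, then reuse it at inference time.
train_features, fitted_scaler = prepare_features(train_df)
score_features, _ = prepare_features(score_df, scaler=fitted_scaler)  # transform only, no refit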
Example 2 — CLI predict script with logging
from __future__ import annotations

import argparse
import logging
import sys
from pathlib import Path

import pandas as pd
from joblib import load

LOGGER_NAME = "predict_cli"


def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Batch predict")
    parser.add_argument("--model", type=Path, required=True)
    parser.add_argument("--data", type=Path, required=True)
    parser.add_argument("--out", type=Path, required=True)
    parser.add_argument(
        "--log-level", default="INFO", choices=["DEBUG", "INFO", "WARNING", "ERROR"]
    )
    return parser.parse_args()


def configure_logging(level: str) -> None:
    logging.basicConfig(
        level=getattr(logging, level),
        format="%(asctime)s %(levelname)s %(name)s - %(message)s",
    )


def main() -> int:
    args = parse_args()
    configure_logging(args.log_level)
    logger = logging.getLogger(LOGGER_NAME)

    logger.info("Loading model: %s", args.model)
    model = load(args.model)

    logger.info("Reading data: %s", args.data)
    df = pd.read_csv(args.data)

    preds = model.predict(df)
    args.out.write_text("\n".join(map(str, preds)))
    logger.info("Wrote %d predictions to %s", len(preds), args.out)
    return 0


if __name__ == "__main__":
    sys.exit(main())
- Clear entrypoint and exit code
- Configurable logging level
- Path-safe IO and no prints
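One optional hardening step, not shown in the script above: catch predictable failures inside main() so a nightly run ends with an ERROR log and a nonzero exit code instead of a raw traceback. A sketch of the shape this could take (the exit code 2 is an arbitrary choice):

def main() -> int:
    args = parse_args()
    configure_logging(args.log_level)
    logger = logging.getLogger(LOGGER_NAME)
    try:
        ...  # load the model, read the data, predict, and write output as in the script above
    except FileNotFoundError as exc:
        logger.error("Input file missing: %s", exc)
        return 2  # nonzero exit code tells the scheduler the run failed
    return 0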
Example 3 — Lightweight project layout and imports
my_project/
    pyproject.toml
    README.md
    src/
        my_project/
            __init__.py
            features.py
            predict.py
    tests/
        test_features.py
"""Feature engineering utilities."""
from __future__ import annotations
from dataclasses import dataclass
from typing import Iterable
import numpy as np
import pandas as pd
@dataclass(frozen=True)
class Binner:
bins: list[float]
def transform(self, x: pd.Series) -> pd.Series:
"""Bucketize a numeric series into integer bins."""
return pd.cut(x, bins=self.bins, labels=False, include_lowest=True)
def to_float(s: pd.Series) -> pd.Series:
"""Convert a series to float, coercing errors to NaN."""
return pd.to_numeric(s, errors="coerce").astype(float)
- Imports grouped: stdlib, third-party, local
- Short, focused modules
- Docstrings on public API
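To make the tests/ directory concrete, here is one possible tests/test_features.py. It assumes the package is importable (for example via an editable install), and the expected values are small enough to verify by hand:

import pandas as pd

from my_project.features import Binner, to_float


def test_binner_buckets_values():
    binner = Binner(bins=[0, 10, 20])
    result = binner.transform(pd.Series([5, 15]))
    assert list(result) == [0, 1]  # 5 lands in the first bin, 15 in the second


def test_to_float_coerces_bad_values():
    result = to_float(pd.Series(["1.5", "oops"]))
    assert result.iloc[0] == 1.5
    assert pd.isna(result.iloc[1])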
Exercises
Do these locally or in a notebook cell. Then compare with solutions below. Use the checklist to self-review.
Exercise 1 (mirrors Example 1)
Refactor a function to production quality:
# Given
import pandas as pd

def bad_fn(df):
    if 'x' not in df: print('no x!')
    df['y']=df['x']*2
    return df

# Task:
# - Rename to something descriptive
# - Add type hints and a docstring
# - Validate input and raise a clear exception if 'x' is missing
# - Make a copy, avoid mutating caller data
# - Add logging at DEBUG level with row count
Hint:
- Use logging.getLogger(__name__) to create a module logger.
- Return the transformed DataFrame; avoid prints.
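If you want to check your result, here is one possible shape of a solution (the function name and log message are just suggestions):

import logging

import pandas as pd

logger = logging.getLogger(__name__)


def add_doubled_x(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with a 'y' column equal to twice 'x'.

    Raises:
        KeyError: if the 'x' column is missing.
    """
    if "x" not in df.columns:
        raise KeyError("Missing required column: 'x'")
    out = df.copy()
    out["y"] = out["x"] * 2
    logger.debug("Doubled 'x' for %d rows", len(out))
    return out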
Exercise 2 (mirrors Example 2)
Create a minimal CLI that reads a CSV, selects a column, and writes its mean to a text file, with logging and a proper exit code.
# Requirements
# - argparse for --data, --column, --out, --log-level
# - logging with a basicConfig format including level and name
# - Validate that the column exists; raise SystemExit(2) on error
# - Write a single float to the output path
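If you get stuck, here is one possible skeleton that satisfies these requirements (names like column_mean and the log format are arbitrary choices):

from __future__ import annotations

import argparse
import logging
import sys
from pathlib import Path

import pandas as pd


def main() -> int:
    parser = argparse.ArgumentParser(description="Write the mean of one CSV column")
    parser.add_argument("--data", type=Path, required=True)
    parser.add_argument("--column", required=True)
    parser.add_argument("--out", type=Path, required=True)
    parser.add_argument("--log-level", default="INFO", choices=["DEBUG", "INFO", "WARNING", "ERROR"])
    args = parser.parse_args()

    logging.basicConfig(
        level=getattr(logging, args.log_level),
        format="%(asctime)s %(levelname)s %(name)s %(message)s",
    )
    logger = logging.getLogger("column_mean")

    df = pd.read_csv(args.data)
    if args.column not in df.columns:
        logger.error("Column %r not found in %s", args.column, args.data)
        raise SystemExit(2)  # clear, nonzero exit code for the caller

    args.out.write_text(str(float(df[args.column].mean())))
    logger.info("Wrote mean of %r to %s", args.column, args.out)
    return 0


if __name__ == "__main__":
    sys.exit(main())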
Checklist (tick as you finish)
- Imports at top, grouped by standard/third-party/local
- Descriptive names and constants for magic values
- Docstrings on public functions
- Type hints on function signatures
- Logging instead of print
- Clear exceptions and validation
- No hidden side effects; functions return values
Common mistakes and self-check
- Prints in library code. Self-check: “Could a scheduled job parse these messages?” Fix: use logging with levels.
- Mutating input DataFrames. Self-check: “Does the caller’s df change?” Fix: always copy when transforming (see the small sketch after this list).
- Long, mixed-responsibility functions. Self-check: “Can I name this in 5 words?” Fix: split into smaller functions.
- Hidden imports inside functions. Self-check: “Are imports at top?” Fix: move to module top for speed and clarity.
- Ambiguous names. Self-check: “Could a new teammate guess the purpose?” Fix: rename to business vocabulary.
- Silent failures. Self-check: “Do I raise on invalid input?” Fix: validate and raise clear exceptions.
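A small before/after of the mutation pitfall (the column names are made up):

import pandas as pd


# Mutates the caller's DataFrame: the caller's df silently gains a column.
def add_ratio(df):
    df["ratio"] = df["clicks"] / df["views"]
    return df


# Safer: transform a copy and return it; the caller's df is untouched.
def add_ratio_safe(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    out["ratio"] = out["clicks"] / out["views"]
    return out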
Practical projects
- Turn your notebook feature block into a reusable module with docstrings, types, and unit tests.
- Write a batch inference CLI with logging and exit codes that your scheduler can run nightly.
- Create a small "utils" package inside src/ with clear API and import order, plus tests/ for it.
Learning path
- Before this: Python functions and modules, pandas basics.
- Now: Production style (this lesson) — focus on readability, logging, and deterministic functions.
- Next: Packaging, tests, and CI basics; config management; data and model versioning.
Quick Test
Take the short test below to check your understanding. Everyone can take it for free. If you log in, your score and progress will be saved.
Mini challenge
Pick one of your recent notebooks. Extract two reusable functions into a module with types, docstrings, and logging. Replace the notebook cells with calls to your new module. Timebox to 45 minutes.
Optional stretch goal
- Add a CLI that runs the notebook’s preprocessing and writes a features.parquet file.
- Write one unit test for each extracted function.
Next steps
- Finish Exercises 1–2 and ensure every checklist item is ticked.
- Take the Quick Test to confirm mastery, then move to the next subskill.