Request/Response Schemas and Validation for Model Serving APIs

Why this matters

Clear request/response schemas and strong validation make your ML APIs reliable, safe, and easy to integrate. As a Machine Learning Engineer, you will: ship models behind HTTP endpoints, prevent bad inputs from crashing models, return predictable outputs for clients, and evolve schemas without breaking existing users.

Real tasks you will face: define JSON for prediction endpoints; validate inputs (types, ranges, enums, formats); handle errors with clear messages; version schemas; ensure backward compatibility.

Concept explained simply

A schema is a contract for your API: what fields exist, their types, constraints, and examples. Validation checks incoming requests against this contract before the model runs, and ensures responses match the documented shape.

Mental model

Think of your API like a form with a bouncer. The form (schema) lists exactly what to fill in. The bouncer (validator) checks every field. If anything is missing or malformed, the bouncer stops it at the door and explains why. Only valid inputs reach your model, and outputs are formatted before they leave.

Core parts of ML API schemas

Request schema: inputs, types, constraints (e.g., text length, image size), optional vs required, mutually exclusive fields.
Response schema: predictions, confidences, class labels, metadata (model_version, latency_ms), and error format.
Validation: type checks, ranges, regex/format (email, uuid), content limits (max bytes), custom rules (probabilities sum to 1 on response).
Compatibility: version fields, default values, deprecation strategy.
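
The response-side custom rule mentioned above (probabilities sum to 1) can be sketched in a few lines of standard-library Python. The function name check_probabilities and the tolerance value are assumptions made for illustration, not part of any framework.

```python
import math

def check_probabilities(probs, tol=1e-6):
    """Return a list of problems; an empty list means the response passes."""
    problems = []
    for label, p in probs.items():
        # bool is a subclass of int in Python, so exclude it explicitly
        if isinstance(p, bool) or not isinstance(p, (int, float)):
            problems.append(f"{label}: not a number")
        elif not 0.0 <= p <= 1.0:  # NaN also fails this comparison
            problems.append(f"{label}: {p} is outside [0, 1]")
    # Only check the sum once every individual value is sane
    if not problems and not math.isclose(sum(probs.values()), 1.0, abs_tol=tol):
        problems.append("probabilities do not sum to 1")
    return problems
```

Running a check like this just before serialization catches a model or post-processing bug on the server instead of in a client.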

Typical status codes

200: success
400: bad request (malformed JSON, wrong content-type)
422: validation failed (well-formed JSON but invalid fields)
500: unexpected server error
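
A minimal sketch of how these four codes might be assigned, using only the standard library. The names check_request, handle, and predict are hypothetical, and the single "text" rule stands in for a full schema. Some services answer a wrong content-type with 415 instead; this sketch follows the list above.

```python
import json

def _error(code, message):
    return {"error": {"code": code, "message": message}}

def check_request(content_type, raw_body):
    """Return (status, payload_or_error) before the model is touched."""
    # Ignore parameters such as "; charset=utf-8"
    if content_type.split(";")[0].strip().lower() != "application/json":
        return 400, _error("bad_request", "expected application/json")
    try:
        payload = json.loads(raw_body)
    except ValueError:  # json.JSONDecodeError is a ValueError subclass
        return 400, _error("bad_request", "body is not valid JSON")
    # Well-formed JSON from here on, so failures are validation errors
    if not isinstance(payload, dict):
        return 422, _error("validation_error", "body must be a JSON object")
    text = payload.get("text")
    if not isinstance(text, str) or not text:
        return 422, _error("validation_error", "text must be a non-empty string")
    return 200, payload

def handle(content_type, raw_body, predict):
    """Validate, call the model, and map unexpected failures to 500."""
    status, result = check_request(content_type, raw_body)
    if status != 200:
        return status, result
    try:
        return 200, predict(result)
    except Exception:
        return 500, _error("internal_error", "unexpected server error")
```

For example, handle("application/json", '{"text": ""}', predict) stops at 422 without ever calling predict.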

Worked examples

Example 1 — Text classification endpoint

{
  "request": {
    "text": "I love this product!",
    "language": "en",
    "max_alternatives": 1
  },
  "constraints": {
    "text": {"type": "string", "minLength": 1, "maxLength": 10000},
    "language": {"enum": ["en", "es", "fr"], "default": "en"},
    "max_alternatives": {"type": "integer", "minimum": 1, "maximum": 5, "default": 1}
  },
  "response": {
    "label": "positive",
    "confidence": 0.97,
    "alternatives": [{"label": "neutral", "confidence": 0.02}],
    "model_version": "tc-1.3.2",
    "latency_ms": 35
  }
}

Notes: use 422 if text is empty or too long; respond with floats in [0,1].
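
The constraints in Example 1 could be enforced with a hand-written validator like the sketch below. It assumes the JSON body has already been parsed into a dict; the function name and the error wording are illustrative only.

```python
ALLOWED_LANGUAGES = ("en", "es", "fr")

def validate_text_request(payload):
    """Return (cleaned_request, errors); errors maps field name to a reason."""
    errors = {}
    text = payload.get("text")
    if not isinstance(text, str):
        errors["text"] = "required string"
    elif not 1 <= len(text) <= 10000:
        errors["text"] = "length must be 1..10000"

    language = payload.get("language", "en")  # default from the constraints
    if language not in ALLOWED_LANGUAGES:
        errors["language"] = "must be one of en, es, fr"

    k = payload.get("max_alternatives", 1)  # default from the constraints
    # bool is a subclass of int, so reject True/False explicitly
    if isinstance(k, bool) or not isinstance(k, int) or not 1 <= k <= 5:
        errors["max_alternatives"] = "must be an integer 1..5"

    if errors:
        return None, errors
    return {"text": text, "language": language, "max_alternatives": k}, {}
```

A caller would return 422 with the errors map when it is non-empty, and pass the cleaned request (defaults filled in) to the model otherwise.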

Example 2 — Image inference with mutually exclusive inputs

Request accepts exactly one of: image_url OR image_base64
{
  "image_url": "https://.../cat.jpg",
  "image_base64": null,
  "top_k": 3
}
Constraints:
- image_url: string uri, optional
- image_base64: base64 string, optional
- exactly one must be provided
- top_k: integer 1..5, default 3, must be ≤ number of classes
Response:
{
  "predictions": [
    {"label": "cat", "score": 0.92},
    {"label": "lynx", "score": 0.04},
    {"label": "dog", "score": 0.03}
  ],
  "model_version": "resnet50-2.1",
  "latency_ms": 48
}
Error example (422):
{
  "error": {
    "code": "validation_error",
    "message": "Provide exactly one of image_url or image_base64",
    "fields": {"image_url": "present", "image_base64": "present"}
  }
}
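
One way to enforce the exactly-one-of rule from Example 2 is sketched below. The num_classes parameter, the "missing" marker, and the top_k message are assumptions added for illustration; the exclusivity error mirrors the example above.

```python
def validate_image_request(payload, num_classes=1000):
    """Return an error object, or None when the request is valid."""
    sources = ("image_url", "image_base64")
    # A field set to null counts as not provided
    present = [name for name in sources if payload.get(name) is not None]
    if len(present) != 1:
        return {"error": {
            "code": "validation_error",
            "message": "Provide exactly one of image_url or image_base64",
            "fields": {n: ("present" if n in present else "missing") for n in sources},
        }}

    top_k = payload.get("top_k", 3)  # default from the constraints
    upper = min(5, num_classes)
    if isinstance(top_k, bool) or not isinstance(top_k, int) or not 1 <= top_k <= upper:
        return {"error": {
            "code": "validation_error",
            "message": f"top_k must be an integer between 1 and {upper}",
            "fields": {"top_k": "out_of_range"},
        }}
    return None
```

Treating null as absent lets the request shown above (image_base64 set to null) pass, while a request carrying both sources or neither gets the same error shape.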

Example 3 — Tabular regression with strong typing

Request:
{
  "features": {
    "age": 42,
    "income_usd": 72000.5,
    "employment_type": "full_time",
    "zip": "94103"
  }
}
Constraints:
- age: integer 0..120
- income_usd: number 0..1e7
- employment_type: enum ["full_time", "part_time", "contract", "unemployed"]
- zip: string pattern ^\d{5}$
Response:
{
  "prediction": 350000.22,
  "prediction_interval": {"low": 310000.10, "high": 390000.34},
  "model_version": "house-reg-0.9.0",
  "latency_ms": 12
}
Error example (422):
{
  "error": {
    "code": "validation_error",
    "message": "zip must match ^\\d{5}$",
    "fields": {"zip": "pattern_mismatch"}
  }
}
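
The typed constraints in Example 3 might be checked as in the following sketch. Only "pattern_mismatch" comes from the example; the other reason codes ("wrong_type", "out_of_range", "not_in_enum") and the function names are made up for illustration.

```python
import re

# re.ASCII limits \d to 0-9; fullmatch also rejects a trailing newline
ZIP_RE = re.compile(r"^\d{5}$", re.ASCII)
EMPLOYMENT_TYPES = ("full_time", "part_time", "contract", "unemployed")

def _is_int(v):
    return isinstance(v, int) and not isinstance(v, bool)

def _is_number(v):
    return isinstance(v, (int, float)) and not isinstance(v, bool)

def validate_features(features):
    """Return a fields map (name -> reason code); empty means valid."""
    fields = {}
    age = features.get("age")
    if not _is_int(age):
        fields["age"] = "wrong_type"
    elif not 0 <= age <= 120:
        fields["age"] = "out_of_range"

    income = features.get("income_usd")
    if not _is_number(income):
        fields["income_usd"] = "wrong_type"
    elif not 0 <= income <= 1e7:  # NaN fails this comparison as well
        fields["income_usd"] = "out_of_range"

    if features.get("employment_type") not in EMPLOYMENT_TYPES:
        fields["employment_type"] = "not_in_enum"

    zip_code = features.get("zip")
    if not isinstance(zip_code, str) or not ZIP_RE.fullmatch(zip_code):
        fields["zip"] = "pattern_mismatch"
    return fields
```

Keeping zip as a string matters here: a numeric zip would silently lose leading zeros, which is why the schema types it as a string with a pattern.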

Practical patterns and rules

Keep required fields minimal; make everything else optional with safe defaults.
Keep responses stable; only add fields, and put any other change behind a version bump.
Include model_version and latency_ms for traceability.
Return consistent error objects with code, message, and per-field details.
Limit payloads (text length, image size bytes) to protect compute.
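
Two of these rules, the consistent error object and the traceability fields, can be captured in small helpers. This is a sketch under assumptions: MODEL_VERSION is hard-coded from Example 1, and error_body and timed_predict are hypothetical names.

```python
import time

MODEL_VERSION = "tc-1.3.2"  # value borrowed from Example 1

def error_body(code, message, fields=None):
    """Build the one error shape every endpoint reuses."""
    return {"error": {"code": code, "message": message, "fields": fields or {}}}

def timed_predict(predict, payload):
    """Call the model and attach model_version and latency_ms to its output."""
    start = time.perf_counter()
    result = predict(payload)
    latency_ms = round((time.perf_counter() - start) * 1000)
    return {**result, "model_version": MODEL_VERSION, "latency_ms": latency_ms}
```

Usage might look like timed_predict(model_fn, cleaned_request) on success and error_body("validation_error", "text is required", {"text": "missing"}) on failure, so clients only ever parse two shapes.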

Simple JSON Schema snippet

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "text": {"type": "string", "minLength": 1, "maxLength": 10000},
    "language": {"type": "string", "enum": ["en", "es", "fr"]},
    "max_alternatives": {"type": "integer", "minimum": 1, "maximum": 5, "default": 1}
  },
  "required": ["text"],
  "additionalProperties": false
}
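
To make the keywords in this snippet concrete, here is a deliberately tiny checker that understands only the keywords the snippet uses. It is a teaching sketch, not a JSON Schema implementation; a real service would normally rely on a maintained validator library. Note that "default" is an annotation in JSON Schema, so validating does not fill in missing values, and this sketch does not either.

```python
SCHEMA = {
    "type": "object",
    "properties": {
        "text": {"type": "string", "minLength": 1, "maxLength": 10000},
        "language": {"type": "string", "enum": ["en", "es", "fr"]},
        "max_alternatives": {"type": "integer", "minimum": 1, "maximum": 5, "default": 1},
    },
    "required": ["text"],
    "additionalProperties": False,
}
PY_TYPES = {"string": str, "integer": int}  # simplification: 1.0 is not accepted as an integer

def check(instance, schema=SCHEMA):
    """Return a list of violations for the keyword subset used above."""
    if not isinstance(instance, dict):
        return ["body: expected object"]
    props = schema["properties"]
    errors = [f"{name}: required" for name in schema.get("required", []) if name not in instance]
    if schema.get("additionalProperties") is False:
        errors += [f"{name}: unknown field" for name in instance if name not in props]
    for name, rules in props.items():
        if name not in instance:
            continue
        value = instance[name]
        if isinstance(value, bool) or not isinstance(value, PY_TYPES[rules["type"]]):
            errors.append(f"{name}: expected {rules['type']}")
            continue
        if "enum" in rules and value not in rules["enum"]:
            errors.append(f"{name}: not in enum")
        if isinstance(value, str):
            if len(value) < rules.get("minLength", 0):
                errors.append(f"{name}: too short")
            if len(value) > rules.get("maxLength", len(value)):
                errors.append(f"{name}: too long")
        else:
            if value < rules.get("minimum", value):
                errors.append(f"{name}: below minimum")
            if value > rules.get("maximum", value):
                errors.append(f"{name}: above maximum")
    return errors
```

With additionalProperties set to false, a misspelled field such as "lang" is reported instead of being silently ignored, which is usually what you want for a prediction endpoint.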

Versioning and compatibility

Start with response field model_version (string) and optionally schema_version (semver).
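Backward compatible changes: add optional fields, widen request enums, raise max limits; avoid renames/removals. Treat new response enum values with care, since clients that switch on them may not handle an unknown label.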
Breaking changes: bump schema_version and expose a new route or header-based version; keep old version until clients migrate.
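
Keeping an old version alive next to a new one can be as simple as one serializer per version over the same internal prediction. In this sketch the v2 change (replacing label/confidence with a predictions list) is a hypothetical breaking change, and render, to_v1, and to_v2 are illustrative names.

```python
def to_v1(pred):
    """Original shape: a single label and confidence."""
    label, score = pred["ranked"][0]
    return {
        "schema_version": "1.0.0",
        "label": label,
        "confidence": score,
        "model_version": pred["model_version"],
    }

def to_v2(pred):
    """Hypothetical breaking change: 'label' is replaced by a 'predictions' list."""
    return {
        "schema_version": "2.0.0",
        "predictions": [{"label": l, "score": s} for l, s in pred["ranked"]],
        "model_version": pred["model_version"],
    }

SERIALIZERS = {"1": to_v1, "2": to_v2}

def render(pred, api_version="1"):
    """Pick the serializer for the version taken from the route or a header."""
    try:
        serializer = SERIALIZERS[api_version]
    except KeyError:
        raise ValueError(f"unsupported API version: {api_version}") from None
    return serializer(pred)
```

Because both versions read from the same internal structure, the model code does not change when v1 is eventually retired; only the entry in SERIALIZERS is removed.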

Security and limits

Enforce content-type (application/json) and size limits early.
Reject executable content; for base64 images, cap decoded bytes and validate mime type.
Never echo raw user inputs into logs; include request_id instead.
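
For base64 images, the size cap can be applied before decoding, because every 4 encoded characters yield at most 3 bytes. The sketch below assumes a 5 MB cap and accepts only PNG and JPEG by their leading magic bytes; the function name and reason strings are illustrative.

```python
import base64
import binascii

MAX_DECODED_BYTES = 5 * 1024 * 1024  # assumed cap of 5 MB
MAGIC = {b"\x89PNG\r\n\x1a\n": "image/png", b"\xff\xd8\xff": "image/jpeg"}

def decode_image(b64_text, max_bytes=MAX_DECODED_BYTES):
    """Return (data, mime_type) or raise ValueError with a short reason."""
    # Reject oversize input from its encoded length, before spending memory on decoding
    if len(b64_text) > 4 * ((max_bytes + 2) // 3):
        raise ValueError("image_too_large")
    try:
        # validate=True rejects characters outside the base64 alphabet
        data = base64.b64decode(b64_text, validate=True)
    except (binascii.Error, ValueError):
        raise ValueError("invalid_base64") from None
    if len(data) > max_bytes:
        raise ValueError("image_too_large")
    # Trust the bytes, not a client-supplied content type
    for magic, mime in MAGIC.items():
        if data.startswith(magic):
            return data, mime
    raise ValueError("unsupported_image_type")
```

The reason string can go straight into the fields map of a 422 error, for example {"image_base64": "image_too_large"}.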

Who this is for

Machine Learning Engineers serving models via HTTP.
Data/ML practitioners building internal or external inference APIs.

Prerequisites

Basic HTTP and JSON.
One server framework (e.g., FastAPI, Flask, or similar).
Familiarity with your model inputs/outputs.

Learning path

Define minimal request/response for one model.
Add validation rules (types, ranges, enums, exclusivity).
Design error objects and status codes.
Introduce versioning and defaults for compatibility.
Add payload limits and performance-friendly constraints.

Exercises

Do these now.

Exercise 1 — Toxicity classifier schema

Create request/response schemas for a toxicity classifier that takes text and optional language. Add constraints and a clear error format.

Required: text
Optional: language in ["en","es"], default "en"
Limit: text length 1..5000
Response: label in ["toxic","non_toxic"], confidence [0,1], model_version, latency_ms

Self-check

What happens if language is "de"?
What if text is an empty string?

Exercise 2 — Image endpoint with mutual exclusivity

Design request validation that accepts exactly one of image_url or image_base64. Cap decoded image size at 5 MB. Include top_k 1..5 with default 3. Define 422 error for violations.

Self-check

What 422 message do you return if both are provided?
How do you report too-large image size?

Exercise checklist

Requests have types, ranges, and enums.
Exactly-one-of rule enforced where needed.
Responses include model_version and latency_ms.
Errors include code, message, and fields map.
No breaking change introduced without versioning.

Common mistakes and how to self-check

Too many required fields: make non-critical inputs optional with defaults. Self-check: can a minimal request succeed?
Inconsistent error shapes: define one error object and reuse. Self-check: compare errors across endpoints.
Forgetting payload limits: add max length/size. Self-check: send a huge input and confirm the API rejects it with 413 or 422.
Enum drift: document allowed labels and version changes. Self-check: add a new label without breaking old clients.
Silent truncation of floats: round explicitly in responses. Self-check: inspect number formatting.

Practical projects

Wrap a sentiment model behind a /predict endpoint with full validation and error handling.
Add schema_version and a v2 route that widens an enum; keep v1 functional.
Implement size-limited image scoring (URL or base64) with top_k and latency reporting.

Next steps

Instrument your API with request_id and latency tracking.
Add input/output JSON Schema files and validate them in CI.
Document examples for success and common errors.

Mini challenge

Redesign a model response to add explanations (saliency per token) without breaking existing clients. Keep original fields stable, add a new optional explanations object, and bump schema_version if necessary.

Hint

Place explanations under an optional key, include type and version, and keep top-level prediction fields unchanged.
