Schema Registry Basics

Learn Schema Registry Basics for free with explanations, exercises, and a quick test (for Data Platform Engineers).

Published: January 11, 2026 | Updated: January 11, 2026

Why this matters

On real streaming platforms, schemas define the shape of your messages so producers and consumers agree on data. A schema registry stores and versions those schemas, enforces compatibility, and prevents silent data corruption. As a Data Platform Engineer, you will: configure compatibility modes, guide teams to evolve schemas safely, debug serialization issues, and keep topics healthy during rolling deployments.

  • Ensure backward/forward compatibility during deploys
  • Standardize Avro/Protobuf/JSON Schema usage
  • Avoid breaking consumers when adding fields
  • Debug serializer/deserializer (SerDe) errors quickly

Who this is for

  • Data Platform Engineers enabling Kafka/Kinesis/Pub-Sub style platforms
  • Backend/Streaming engineers integrating producers and consumers
  • SREs supporting streaming reliability

Prerequisites

  • Basic understanding of topics, producers, consumers
  • Familiarity with one schema format (Avro, Protobuf, or JSON Schema)
  • Know what serialization/deserialization (SerDe) means

Concept explained simply

A schema registry is a catalog of message schemas with versioning and rules. Producers register schemas and write messages that reference a schema ID. Consumers use that ID to fetch the writer schema and read safely. The registry enforces compatibility so new schemas do not break existing readers or historical data.
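
The register-then-look-up flow described above can be sketched with a toy in-memory registry. This is only an illustration: the class name `InMemoryRegistry`, its methods, and the dict-shaped message are hypothetical, and a real registry is a network service with its own client libraries.

```python
import json


class InMemoryRegistry:
    """Toy stand-in for a schema registry (hypothetical, for illustration only)."""

    def __init__(self):
        self._by_id = {}     # schema ID -> schema
        self._ids = {}       # canonical schema text -> schema ID
        self._subjects = {}  # subject -> list of schema IDs (position + 1 = version)

    def register(self, subject, schema):
        """Store the schema under the subject and return its ID."""
        canonical = json.dumps(schema, sort_keys=True)
        schema_id = self._ids.get(canonical)
        if schema_id is None:
            schema_id = len(self._by_id) + 1
            self._ids[canonical] = schema_id
            self._by_id[schema_id] = schema
        versions = self._subjects.setdefault(subject, [])
        if schema_id not in versions:
            versions.append(schema_id)
        return schema_id

    def get_by_id(self, schema_id):
        """Return the writer schema that a message's ID points to."""
        return self._by_id[schema_id]

    def versions(self, subject):
        """Return the version numbers registered under a subject."""
        return list(range(1, len(self._subjects.get(subject, [])) + 1))


registry = InMemoryRegistry()
user_v1 = {
    "type": "record",
    "name": "User",
    "namespace": "demo.v1",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "name", "type": "string"},
    ],
}

# Producer side: register the schema, then reference its ID in the message.
schema_id = registry.register("users-value", user_v1)
message = {"schema_id": schema_id, "payload": {"id": 1, "name": "Ada"}}

# Consumer side: use the ID to fetch the writer schema before decoding.
writer_schema = registry.get_by_id(message["schema_id"])
```

Registering the same schema twice returns the same ID and does not create a new version, which is the behavior this sketch assumes.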

Mental model

Think of it like an API contract library for events. Each topic has one or more subjects (e.g., topic-name-key, topic-name-value). Each subject has versions. Compatibility rules are the guardrails that let teams change contracts without breaking others.

Key components and terms

  • Schema formats: Avro, Protobuf, JSON Schema
  • Subject: a named stream of schema versions (often bound to a topic's key or value)
  • Compatibility modes: NONE, BACKWARD, FORWARD, FULL (+ TRANSITIVE variants)
  • Wire format: messages carry a small schema identifier so consumers can fetch the correct writer schema
  • SerDes: serializers/deserializers that talk to the registry
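
To make the wire format concrete, here is a minimal sketch that assumes Confluent-style framing: one magic byte (0) followed by a 4-byte big-endian schema ID, then the serialized payload. Other registries may frame messages differently, and the function names `frame` and `unframe` are made up for this example.

```python
import struct

MAGIC_BYTE = 0  # assumption: Confluent-style framing


def frame(schema_id, payload):
    """Prefix a serialized payload with the magic byte and schema ID."""
    return struct.pack(">bI", MAGIC_BYTE, schema_id) + payload


def unframe(message):
    """Split a framed message into (schema_id, payload)."""
    if len(message) < 5:
        raise ValueError("message too short to contain a schema ID header")
    magic, schema_id = struct.unpack(">bI", message[:5])
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, message[5:]


framed = frame(42, b"abc")
schema_id, payload = unframe(framed)
```

A deserializer would use the recovered ID to fetch the writer schema from the registry before decoding the payload bytes.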

Compatibility in one breath

  • BACKWARD: new readers (new schema) can read old data
  • FORWARD: old readers (old schema) can read new data
  • FULL: both backward and forward, checked against the latest registered version
  • TRANSITIVE: apply the rule across all previous versions, not just the latest
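
The four modes can be expressed as a small checker. This is a deliberately simplified sketch: it looks only at field names and defaults in Avro-style record schemas, and ignores type changes, unions, and aliases that a real registry also validates. The function names are hypothetical.

```python
def can_read(reader, writer):
    """True if `reader` can decode data written with `writer` (names and defaults only)."""
    writer_fields = {f["name"] for f in writer["fields"]}
    for field in reader["fields"]:
        if field["name"] not in writer_fields and "default" not in field:
            return False
    return True


def is_compatible(mode, new, previous_versions):
    """Check `new` against earlier versions (oldest first) under the given mode."""
    if mode == "NONE" or not previous_versions:
        return True
    base = mode.replace("_TRANSITIVE", "")
    if base not in {"BACKWARD", "FORWARD", "FULL"}:
        raise ValueError(f"unknown compatibility mode: {mode}")
    # Transitive modes check every earlier version; the others only the latest.
    targets = previous_versions if mode.endswith("_TRANSITIVE") else previous_versions[-1:]
    for old in targets:
        backward = can_read(new, old)  # new reader, old data
        forward = can_read(old, new)   # old reader, new data
        if base in {"BACKWARD", "FULL"} and not backward:
            return False
        if base in {"FORWARD", "FULL"} and not forward:
            return False
    return True


v1 = {"fields": [{"name": "id", "type": "long"}]}
v2 = {"fields": [{"name": "id", "type": "long"},
                 {"name": "note", "type": "string", "default": ""}]}
v3 = {"fields": [{"name": "id", "type": "long"},
                 {"name": "note", "type": "string"}]}  # default dropped

latest_only = is_compatible("BACKWARD", v3, [v1, v2])
all_history = is_compatible("BACKWARD_TRANSITIVE", v3, [v1, v2])
```

Here `v3` passes the plain BACKWARD check because it is only compared with `v2`, but fails the transitive check because a `v3` reader has no default to fall back on for data written with `v1`.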

Worked examples

Example 1 — Add an optional field (safe with BACKWARD/FORWARD depending on deploy order)

Old Avro schema:

{
  "type": "record",
  "name": "User",
  "namespace": "demo.v1",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "name", "type": "string"}
  ]
}

New Avro schema (adds optional email with default null):

{
  "type": "record",
  "name": "User",
  "namespace": "demo.v1",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "name", "type": "string"},
    {"name": "email", "type": ["null","string"], "default": null}
  ]
}

  • Deploy consumers first? Use BACKWARD compatibility: new readers fill in the default when they read old data.
  • Deploy producers first? Use FORWARD: old readers ignore the new field.
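
Both directions can be sketched with a simplified version of schema resolution that works on already-decoded records (plain dicts). The helper `resolve` is hypothetical and only mimics the two rules that matter here: fill in defaults for fields the writer did not send, and drop fields the reader does not know.

```python
def resolve(record, reader_schema):
    """Project a decoded record onto the reader schema."""
    out = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in record:
            out[name] = record[name]
        elif "default" in field:
            out[name] = field["default"]
        else:
            raise ValueError(f"no value or default for field {name!r}")
    return out


old_schema = {"fields": [
    {"name": "id", "type": "long"},
    {"name": "name", "type": "string"},
]}
new_schema = {"fields": [
    {"name": "id", "type": "long"},
    {"name": "name", "type": "string"},
    {"name": "email", "type": ["null", "string"], "default": None},
]}

# BACKWARD: a new reader decodes old data and gets the default for email.
new_reads_old = resolve({"id": 1, "name": "Ada"}, new_schema)

# FORWARD: an old reader decodes new data and ignores the extra field.
old_reads_new = resolve({"id": 2, "name": "Lin", "email": "lin@example.com"}, old_schema)
```

In this sketch `new_reads_old` contains `email` set to `None`, and `old_reads_new` contains only `id` and `name`.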

Example 2 — Rename a field without breaking consumers

You want to rename name to full_name. Do not simply change the field name. Use aliases:

{
  "type": "record",
  "name": "User",
  "namespace": "demo.v1",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "full_name", "type": "string", "aliases": ["name"]}
  ]
}

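Avro applies aliases on the reader side: a reader that uses this schema can still resolve data written with the old field name. Readers that are still on the old schema get no help from the alias, so upgrade readers before producers start writing full_name. After all producers and readers migrate, you can remove the alias in a later version (with care if using transitive rules).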
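
A simplified sketch of reader-side alias resolution, again on already-decoded records. The helper `resolve_with_aliases` is hypothetical; real Avro libraries do this during decoding, and this version only handles top-level field names.

```python
def resolve_with_aliases(record, reader_schema):
    """Map writer field names onto reader fields, honoring the reader's aliases."""
    out = {}
    for field in reader_schema["fields"]:
        candidates = [field["name"], *field.get("aliases", [])]
        for key in candidates:
            if key in record:
                out[field["name"]] = record[key]
                break
        else:
            # No matching writer field: fall back to the default, if any.
            if "default" not in field:
                raise ValueError(f"no value or default for field {field['name']!r}")
            out[field["name"]] = field["default"]
    return out


old_schema = {"fields": [
    {"name": "id", "type": "long"},
    {"name": "name", "type": "string"},
]}
new_schema = {"fields": [
    {"name": "id", "type": "long"},
    {"name": "full_name", "type": "string", "aliases": ["name"]},
]}

# A reader on the new schema finds the value under the old name.
new_reads_old = resolve_with_aliases({"id": 1, "name": "Ada"}, new_schema)

# A reader on the old schema has no alias and no default, so resolution fails.
try:
    resolve_with_aliases({"id": 2, "full_name": "Lin"}, old_schema)
    old_reader_ok = True
except ValueError:
    old_reader_ok = False
```

The asymmetry is the point: the alias lives in the reader's schema, so it only helps readers that have already adopted the new version.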

Example 3 — Subject naming strategies

  • topic-name-key / topic-name-value: separate subjects for keys and values per topic; common default
  • record-name: subject per record type; useful when the same record appears in multiple topics
  • topic-record-name: combines both

Pick the one that fits your reuse patterns and governance model. topic-name-* is simplest to start with.
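
The three strategies amount to three ways of building the subject string. The sketch below assumes the record-based strategies use the fully qualified record name (namespace plus name), as Confluent's implementations do; the function and the strategy labels are hypothetical names taken from the list above.

```python
def subject_name(strategy, topic, record_fullname, is_key=False):
    """Build the subject a schema would be registered under."""
    if strategy == "topic-name":
        return f"{topic}-{'key' if is_key else 'value'}"
    if strategy == "record-name":
        return record_fullname
    if strategy == "topic-record-name":
        return f"{topic}-{record_fullname}"
    raise ValueError(f"unknown strategy: {strategy}")


by_topic = subject_name("topic-name", "users", "demo.v1.User")
by_record = subject_name("record-name", "users", "demo.v1.User")
by_both = subject_name("topic-record-name", "users", "demo.v1.User")
```

With record-name, the same `demo.v1.User` subject is shared by every topic that carries the record, so one compatibility history governs all of them.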

Step-by-step: Register and evolve a schema

  1. Model the record: write Avro/Protobuf/JSON Schema capturing required vs optional fields.
  2. Choose subject: usually topic-name-value for message values.
  3. Set compatibility: start with BACKWARD or FULL; use TRANSITIVE if you need checks across history.
  4. Register v1: producers start using the schema; messages include the schema ID in the wire format.
  5. Evolve safely: when adding fields, provide defaults or make them optional; for renames, use aliases (Avro) or keep field numbers stable (Protobuf).
  6. Deploy in order: consumer-first favors BACKWARD; producer-first favors FORWARD; FULL covers both, checked against the latest registered version.
  7. Monitor: watch registry validations and SerDe errors in logs.
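
Steps 3 and 4 can be sketched as HTTP requests, assuming a Confluent-compatible REST API (`PUT /config/{subject}` and `POST /subjects/{subject}/versions`). The base URL, subject name, and helper names are assumptions for illustration. The code only builds the requests; sending them requires a running registry.

```python
import json
import urllib.request

REGISTRY_URL = "http://localhost:8081"  # assumption: local dev registry
CONTENT_TYPE = "application/vnd.schemaregistry.v1+json"


def set_compatibility_request(subject, level):
    """Build the request that sets a subject's compatibility mode (step 3)."""
    body = json.dumps({"compatibility": level}).encode("utf-8")
    return urllib.request.Request(
        f"{REGISTRY_URL}/config/{subject}",
        data=body,
        method="PUT",
        headers={"Content-Type": CONTENT_TYPE},
    )


def register_request(subject, schema):
    """Build the request that registers a new schema version (step 4)."""
    # The schema travels as a JSON string nested inside the JSON body.
    body = json.dumps({"schema": json.dumps(schema)}).encode("utf-8")
    return urllib.request.Request(
        f"{REGISTRY_URL}/subjects/{subject}/versions",
        data=body,
        method="POST",
        headers={"Content-Type": CONTENT_TYPE},
    )


user_v1 = {
    "type": "record",
    "name": "User",
    "namespace": "demo.v1",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "name", "type": "string"},
    ],
}

config_req = set_compatibility_request("users-value", "BACKWARD")
register_req = register_request("users-value", user_v1)

# To send against a running registry:
#   with urllib.request.urlopen(register_req) as resp:
#       print(json.load(resp))
```

In practice the serializer in a producer client usually performs the registration call for you; building the requests by hand is mainly useful for scripts and debugging.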

Exercises

These exercises mirror the tasks below. Try them here, then open the solutions only if needed.

Exercise 1 — Pick the right compatibility mode

Scenario: Many consumers still use the previous schema. You must deploy producers first to add an optional field with a safe default. Which compatibility mode should the subject use now?

Type your answer, then compare with the solution in the Exercises section below.

Exercise 2 — Evolve an Avro schema

Given v1:

{
  "type": "record",
  "name": "Order",
  "namespace": "shop.v1",
  "fields": [
    {"name": "order_id", "type": "string"},
    {"name": "amount", "type": "double"}
  ]
}

Create v2 that:

  • Adds optional currency with default "USD"
  • Renames amount to total using aliases

Write the full v2 schema, then compare with the solution in the Exercises section below.

  • I can explain BACKWARD vs FORWARD vs FULL to a teammate
  • I know how to add a field without breaking old consumers
  • I can safely rename a field (aliases or stable field numbers)
  • I understand subject naming strategies

Common mistakes and self-check

  • Changing a field's type without a migration path. Fix: use unions (Avro), new fields, or transformation steps.
  • Removing required fields immediately. Fix: deprecate first, stop producing the field, then remove after readers migrate.
  • No defaults for new fields. Fix: provide sensible defaults or make fields nullable.
  • Confusing deploy order and compatibility. Fix: consumer-first -> BACKWARD; producer-first -> FORWARD; uncertainty -> FULL.
  • Reusing Protobuf field numbers. Fix: never reuse; reserve/remove carefully.
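
The Protobuf rule in the last item can be checked mechanically. The sketch below models a message as a plain `{field_name: field_number}` dict rather than parsing a .proto file, and both helper names are hypothetical.

```python
def numbers_to_reserve(old_fields, new_fields):
    """Field numbers that disappeared between versions and should be reserved."""
    return sorted(set(old_fields.values()) - set(new_fields.values()))


def reserved_violations(fields, reserved):
    """Names of fields that use a reserved (previously removed) number."""
    return sorted(name for name, number in fields.items() if number in reserved)


v1 = {"id": 1, "name": 2, "purchase_note": 3}
v2 = {"id": 1, "name": 2}               # purchase_note removed
v3 = {"id": 1, "name": 2, "coupon": 3}  # number 3 reused for a new field

reserved = numbers_to_reserve(v1, v2)
violations = reserved_violations(v3, reserved)
```

A check like this could run in CI against the running list of reserved numbers, so that a removed field's number never comes back with a different meaning.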

Self-check prompt

Given your last schema change, could the oldest active consumer still read messages after your deploy? If not, which compatibility or change would fix it?

Practical projects

  • Single-topic evolution: Start with a simple User schema, register v1, then add email and rename name to full_name using aliases. Validate compatibility at each step.
  • Cross-topic reuse: Use record-name strategy for a shared Address record consumed by two services. Evolve it by adding an optional field with default.
  • Producer-first rollout: Simulate old consumers reading new messages. Configure FORWARD compatibility and verify old readers keep working.

Mini challenge

You need to drop a field purchase_note that only 10% of consumers still read. Design a two-release plan that avoids breakage. Mention compatibility mode(s), producer/consumer deploy order, and when to actually remove the field.

Learning path

  • Start: Understand formats (Avro/Protobuf/JSON Schema) and registry concepts
  • Next: Practice compatibility by evolving schemas with defaults and aliases
  • Then: Choose subject naming strategies for your org
  • Finally: Automate validations in CI and monitor registry/SerDe errors
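
For the CI step, one option is to ask the registry whether a candidate schema is compatible before merging. This sketch assumes a Confluent-compatible endpoint (`POST /compatibility/subjects/{subject}/versions/latest`, which answers with an `is_compatible` flag). The base URL, subject, and helper names are assumptions, and the response shown is a canned example rather than real output.

```python
import json
import urllib.request

CONTENT_TYPE = "application/vnd.schemaregistry.v1+json"


def compatibility_request(base_url, subject, schema):
    """Build a request that tests a candidate schema against the latest version."""
    body = json.dumps({"schema": json.dumps(schema)}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/compatibility/subjects/{subject}/versions/latest",
        data=body,
        method="POST",
        headers={"Content-Type": CONTENT_TYPE},
    )


def is_compatible(response_body):
    """Interpret the registry's JSON answer; treat a missing flag as a failure."""
    return bool(json.loads(response_body).get("is_compatible", False))


candidate = {
    "type": "record",
    "name": "User",
    "namespace": "demo.v1",
    "fields": [{"name": "id", "type": "long"}],
}
req = compatibility_request("http://localhost:8081", "users-value", candidate)

# Canned response for illustration; in CI you would read it from urlopen(req).
ci_should_pass = is_compatible(b'{"is_compatible": false}')
```

A CI job would fail the build when `is_compatible` returns `False`, which catches breaking changes before any producer ships them.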

Next steps

  • Do the hands-on exercises below
  • Take the Quick Test at the end of this page
  • Apply these patterns to a real topic in your environment (in a dev cluster)

Practice Exercises

2 exercises to complete

Instructions

Scenario: You must deploy a new producer that writes an extra optional field with a safe default. Many existing consumers still run the old schema and cannot be updated yet. Which single compatibility mode should you set on the subject to avoid breaking those old consumers?

Answer with one of: NONE, BACKWARD, FORWARD, FULL. Explain briefly why.

Expected Output
FORWARD (ensures old readers can read new data).

Schema Registry Basics — Quick Test

Test your knowledge with 8 questions. Pass with 70% or higher.
