Why this matters
Analytics Engineers are often surprised by silent upstream changes: a column disappears, an enum gains a new value, or timestamps arrive late. A data contract prevents these surprises by making the agreement explicit: what the data will look like, how reliable it will be, and what happens when change is needed. With contracts, you can:
- Stop schema-breaking deploys before they hit production.
- Set clear freshness and completeness expectations with producers.
- Automate monitors that map directly to those expectations.
- Reduce incident time-to-detect and time-to-recover.
Concept explained simply
A data contract is a handshake between data producers and data consumers. It contains three things:
- The spec: schema, definitions, acceptable values, and SLAs (freshness, volume, uniqueness).
- The monitors: tests and alerts that verify the spec continuously.
- The change process: versioning rules, deprecation windows, and who to contact.
Mental model
Think of it like a service-level agreement for data. The producer promises: "We will deliver this shape of data, at this cadence, with these quality guarantees." The consumer promises: "We will test and alert, give feedback early, and follow the agreed change process."
Definitions cheat sheet
- Freshness SLA: Maximum allowed delay between event occurrence and availability in analytics.
- Completeness SLA: Minimum percentage of expected rows delivered in a time window.
- Schema stability: Rules for adding, updating, or removing fields.
- Semantics: What a field means in the business context (not just its type).
- Breaking change: Any change that would break downstream logic or assumptions.
Core components of a data contract
- Scope & ownership: Producer team, consumer teams, business purpose, single owner contact.
- Entities & fields: Name, type, description, nullability, constraints, PII classification, enumerations.
- Quality SLAs: Freshness, completeness, uniqueness, allowed duplicates, volume bands, referential integrity.
- Backfills & historical corrections: If/when backfills occur and how they are communicated.
- Versioning & change management: SemVer, deprecation periods, migration windows, change notice timelines.
- Monitoring & alerts: Tests mapped to each SLA, severity levels, run frequency, alert routing.
- Incident handling: How to raise, triage, rollback, and communicate status.
Worked examples
Example 1: Product signup event
entity: user_signup_event
owner: growth-eng@company
purpose: Measure signup funnel and activation
schema:
  - name: event_id
    type: string
    constraints: [not_null, unique]
  - name: user_id
    type: string
    constraints: [not_null]
  - name: occurred_at
    type: timestamp
    constraints: [not_null]
  - name: signup_method
    type: string
    allowed_values: [email, google, apple]
  - name: country
    type: string
    nullable: true
quality_sla:
  freshness_minutes: 15
  completeness_last_24h_pct: >= 99
  duplicate_rate_pct: <= 0.1
change_mgmt:
  version: 1.2.0
  add_field: allowed with 7-day notice
  breaking_change: 30-day deprecation + parallel fields
monitoring:
  tests: [not_null(event_id), unique(event_id), enum(signup_method), freshness(15m)]
  severity: {freshness: high, uniqueness: high, enum: medium}
  alerts: #growth-eng-slack, oncall pager
Notes: adding a new signup_method requires notice; a missing event_id triggers a high-severity alert.
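To see how one SLA line becomes a monitor, the duplicate_rate_pct target above could be checked with a query like the following (a Postgres-style sketch; one reasonable definition of the rate is surplus duplicate rows as a share of all rows, and interval syntax varies by warehouse):
-- Duplicate rate over the trailing 24h
select
  100.0 * (count(*) - count(distinct event_id)) / nullif(count(*), 0) as duplicate_rate_pct
from user_signup_event
where occurred_at >= now() - interval '24 hours';
-- alert when duplicate_rate_pct > 0.1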
Example 2: Payments table
entity: payments
owner: billing-platform@company
purpose: Revenue reporting and refunds
schema:
  - name: payment_id
    type: string
    constraints: [primary_key]
  - name: order_id
    type: string
    constraints: [not_null, foreign_key(orders.order_id)]
  - name: amount_cents
    type: integer
    constraints: [not_null, >= 0]
  - name: currency
    type: string
    allowed_values: [USD, EUR, GBP]
  - name: status
    type: string
    allowed_values: [pending, captured, refunded, failed]
  - name: processed_at
    type: timestamp
    constraints: [not_null]
quality_sla:
  freshness_minutes: 30
  completeness_last_hour_pct: >= 99.5
  referential_integrity_errors_per_day: 0
change_mgmt:
  version: 2.0.0
  breaking_change: requires ADR and 45-day window
monitoring:
  tests: [pk_unique(payment_id), not_null(order_id), fk_orders, enum(status), freshness(30m), volume(0.5x..2x 7d avg)]
Notes: A volume band test guards against duplicate ingestion or missing batches.
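The fk_orders test above maps to a simple anti-join (a sketch that assumes the orders table named in the foreign_key constraint):
-- Referential integrity: payments whose order_id has no matching order
select p.payment_id, p.order_id
from payments p
left join orders o on o.order_id = p.order_id
where o.order_id is null;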
Example 3: Vendor CSV drop
entity: ad_spend_daily
owner: marketing-ops@company
source: s3://vendor-bucket/spend/YYYY-MM-DD.csv
schema:
  - date: date not_null
  - channel: string enum[search, social, display]
  - spend_usd: numeric >= 0
quality_sla:
  delivery_deadline_utc: 06:00
  backfill_policy: vendor may correct last 7 days; notification required
monitoring:
  tests: [freshness(before 06:15 UTC), enum(channel), spend_nonnegative, completeness(daily row per channel)]
Notes: Delivery time is the key SLA in file-based feeds.
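The completeness(daily row per channel) test can be written as an anti-join against the allowed channels (a Postgres-style sketch using a VALUES list; run it after the 06:15 UTC freshness deadline):
-- Completeness: flag channels with no row for today's file
select expected.channel
from (values ('search'), ('social'), ('display')) as expected(channel)
left join ad_spend_daily a
  on a.channel = expected.channel and a.date = current_date
where a.channel is null;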
Tracking and monitoring
- Translate each SLA to a test: freshness check, volume band, uniqueness, not_null, enum/regex, referential integrity.
- Automate tests in your pipeline tool (for example: warehouse SQL tests, dbt tests, Great Expectations, Soda). Pick one and standardize.
- Classify severities: high (fails the pipeline), medium (alerts), low (logs only).
- Add pre-production gates: validate samples in staging and block deploys on failing contract tests (see the sketch after this list).
- Route alerts to owners with clear runbooks: when to rollback, re-run, or escalate.
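As one way to implement the pre-production gate above, a CI step can run the contract's enum check against a staging sample and fail the deploy on a non-zero count (a sketch; staging.payments is a hypothetical staging relation):
-- Pre-prod gate: any enum_violations should block the deploy
select count(*) as enum_violations
from staging.payments
where status not in ('pending', 'captured', 'refunded', 'failed');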
Sample SQL monitors
-- Uniqueness
select payment_id from payments
group by 1 having count(*) > 1;
-- Freshness (minutes since last row)
select extract(epoch from (now() - max(processed_at)))/60 as freshness_min from payments;
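-- Volume band (a sketch: today's row count vs. the trailing 7-day average,
-- per the contract's volume(0.5x..2x 7d avg) test; Postgres-style casts
-- and date math, adjust for your warehouse)
with daily as (
  select processed_at::date as d, count(*) as n
  from payments
  where processed_at >= current_date - 7
  group by 1
)
select
  (select n from daily where d = current_date) as today_n,
  avg(n) as trailing_avg_n
from daily
where d < current_date;
-- alert when today_n < 0.5 * trailing_avg_n or today_n > 2 * trailing_avg_n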
-- Enum guard
select status from payments
where status not in ('pending','captured','refunded','failed')
limit 1;
Step-by-step: create a contract
- Identify critical questions. What decisions rely on this dataset? What breaks if shape or timeliness changes?
- Draft the spec. List fields, definitions, types, nullability, enums, constraints. Propose freshness and completeness SLAs.
- Negotiate with producers. Confirm feasibility, choose owners, and agree on change windows.
- Implement tests. Map each SLA to a monitor. Add pre-prod checks.
- Alerting & runbooks. Define severity thresholds and who responds.
- Versioning. Use semantic versioning; schedule deprecations; document migrations.
- Review cadence. Quarterly review SLAs and incidents; adjust as needed.
Mini task: pick SLAs fast
Start with defaults: freshness 30m, completeness 99% daily, uniqueness per primary key, allowed enums documented. Adjust after a week of observations.
Exercises (practice)
Do these in your notes or editor, then check your work against the checklist below each exercise.
Exercise 1 — Draft a contract for an orders table
ID: ex1
You ingest an orders table used by Finance and BI. Define a contract with: purpose, owner, schema (at least: order_id, user_id, created_at, status, total_cents, currency), SLAs (freshness, completeness, uniqueness), change policy, and monitoring tests.
Checklist
- Owner and purpose included
- Primary key uniqueness
- Enum for status
- Currency allowed values
- Freshness SLA
- Completeness metric
- Change management rules
- Tests mapped to SLAs
Exercise 2 — Write monitors for the contract
ID: ex2
Based on your orders contract, write SQL (or pseudo-config) for: uniqueness on order_id, enum guard on status, freshness check (max allowed 20 minutes), and a 7-day volume band test (0.6x–1.6x of trailing average).
Checklist
- Uniqueness query returns zero rows on success
- Enum guard catches any unexpected status
- Freshness expressed in minutes
- Volume band compares today vs. trailing 7 days
Common mistakes (and self-check)
- Only schema, no semantics. Self-check: does each field have a clear business definition?
- Too strict SLAs. Self-check: do SLAs reflect realistic producer capabilities?
- No owner or pager. Self-check: can someone respond to alerts within minutes?
- Ignoring backfills. Self-check: do you document how historical corrections are handled?
- Alert fatigue. Self-check: are low-severity issues logged but not paged?
- No versioning policy. Self-check: do breaking changes have a deprecation window?
Practical projects
- Create a contract for your top-3 critical datasets and implement monitors for each SLA.
- Set up a pre-production contract test that blocks deploys when enums are violated.
- Simulate a breaking change (rename a column) and walk through your deprecation plan.
- Build a simple dashboard that shows SLA compliance: freshness, volume, uniqueness over time.
Who this is for
- Analytics Engineers who own modeling and testing in the warehouse.
- Data Engineers building ingestion and transformation pipelines.
- BI Developers who rely on stable, well-defined datasets.
Prerequisites
- Comfortable with SQL and warehouse concepts.
- Basic understanding of data modeling and dimensional design.
- Familiarity with testing in your stack (for example, dbt tests, Great Expectations, or SQL-based checks).
Learning path
- Before this: data modeling fundamentals, basic data quality tests.
- This lesson: define contracts, map SLAs to monitors, plan change management.
- Next: incident response, SLA dashboarding, and producer-validated schemas.
Next steps
- Pick one critical dataset and ship a minimal contract this week.
- Add two monitors tied to real SLAs (freshness + uniqueness).
- Schedule a 30-minute review with the producer team to agree on change windows.
Mini challenge (15–20 min)
Choose any event stream you use (e.g., page_view or checkout). Draft a one-page contract: five fields with definitions, freshness 10–20 minutes, completeness 99%, and a change policy. Then write two SQL checks to enforce it.