Who this is for
- Data engineers setting up reliable pipelines in orchestrators (Airflow, Prefect, Dagster, etc.).
- Platform/MLOps engineers who need safe promotion from dev to production.
- Analytics engineers who run jobs that touch sensitive or business-critical data.
Prerequisites
- Basic understanding of job orchestration (DAGs/flows, schedules, retries).
- Familiarity with configuration files (YAML/JSON), environment variables, and secrets.
- Git basics: branches, tags, pull requests, and CI/CD concepts.
Why this matters
In real teams, pipelines evolve constantly. A single bad change can break dashboards, alerts, or payments. Separating environments (dev, stage, prod) reduces blast radius, enables safe testing with realistic data, and creates a repeatable promotion path.
Typical professional tasks:
- Develop a new DAG in dev with small sample data and mocked services.
- Validate schema changes and data quality in stage with production-like schemas and masked data.
- Promote to prod with a controlled rollout, monitoring, and rollback plan.
Concept explained simply
Environment separation means you run the same pipeline in multiple places with different risk levels. Dev is for building and quick iteration. Stage (a.k.a. test/preprod) is for realistic checks. Prod is for real business data and users.
Mental model
Think of three lanes on a highway:
- Dev lane: slow, short distance, easy to pull over. Lots of debugging.
- Stage lane: medium speed, almost the same road conditions as the highway, but still safe to stop.
- Prod lane: fast, high traffic, crashes are costly—so you enter only after passing the first two lanes.
The car (your code artifact) should stay the same when you change lanes; you only change the road signs (configuration and connections).
Key principles and patterns
- Same artifact, different config: build once (e.g., container image, wheel) and promote; inject environment-specific config at runtime.
- Isolate resources: separate secrets, storage, queues, and databases per environment.
- Least privilege: service accounts and connections with minimal rights per environment.
- Data protection: stage uses production-like schemas with masked/anonymized data.
- Deterministic promotion: pass checks in dev, then stage, then prod with gates (tests, approvals).
- Observable rollouts: metrics, logs, and alerts per environment; tag runs with env labels.
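The "same artifact, different config" principle can be sketched in a few lines: the code is identical everywhere, and only a runtime environment variable selects which settings are injected. The variable name DEPLOY_ENV, the connection IDs, and the bucket names below are illustrative assumptions, not fixed conventions.

```python
# Minimal sketch: one codebase, per-environment settings injected at runtime.
import os

CONFIG = {
    "dev":   {"src_conn": "postgres_dev",   "bucket": "raw-data-dev",   "retries": 0},
    "stage": {"src_conn": "postgres_stage", "bucket": "raw-data-stage", "retries": 1},
    "prod":  {"src_conn": "postgres_prod",  "bucket": "raw-data-prod",  "retries": 3},
}

def load_settings(env=None):
    """Return per-environment settings; fail fast on an unknown environment."""
    env = env or os.getenv("DEPLOY_ENV", "dev")
    if env not in CONFIG:
        raise ValueError(f"Unknown DEPLOY_ENV: {env!r}")
    return CONFIG[env]
```

Failing fast on an unrecognized environment is itself a guardrail: a typo in DEPLOY_ENV should stop the run rather than silently fall back to prod resources.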
Worked examples
Example 1: Environment-aware Airflow DAG
Goal: One DAG definition runs in dev, stage, prod with different connections and parameters.
# Pseudocode (the concept applies to most orchestrators)
import os
from datetime import timedelta

ENV = os.getenv("DEPLOY_ENV", "dev")  # one of: dev | stage | prod

# Use env-specific connection IDs (created separately in each Airflow environment)
SRC_CONN = f"postgres_{ENV}"
DST_CONN = f"warehouse_{ENV}"

# Pick buckets/tables by suffix
RAW_BUCKET = f"raw-data-{ENV}"
TABLE_SUFFIX = f"_{ENV}"

# Guardrails: stricter settings in prod
if ENV == "prod":
    retries = 3
    sla = timedelta(minutes=30)  # Airflow SLAs are timedeltas, not strings
else:
    retries = 0
    sla = None

# DAG tasks then reference SRC_CONN/DST_CONN and the suffixes above
Outcome: one codebase; the differences live in connections, orchestrator variables, and runtime environment variables.
Example 2: dbt targets for dev/stage/prod
# profiles.yml excerpt
my_project:
  target: dev
  outputs:
    dev:
      type: postgres
      host: dev.db
      schema: analytics_dev
      user: svc_dev
    stage:
      type: postgres
      host: stage.db
      schema: analytics_stage
      user: svc_stage
    prod:
      type: postgres
      host: prod.db
      schema: analytics
      user: svc_prod
Promotion flow: run dbt in stage with masked prod-like data and full tests. Only promote to prod if tests pass.
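The promotion flow above can be sketched as a small script that builds the dbt CLI invocations per target and blocks prod if the stage checks fail. This assumes the dbt CLI is installed and that the targets match the profiles.yml excerpt; the function names are illustrative.

```python
# Sketch of a gated dbt promotion: stage must pass before prod runs.
import subprocess

def dbt_commands(target):
    # `--target` selects the matching output from profiles.yml
    return [
        ["dbt", "run", "--target", target],
        ["dbt", "test", "--target", target],
    ]

def promote(run=subprocess.run):
    for cmd in dbt_commands("stage"):
        if run(cmd).returncode != 0:
            raise SystemExit("stage checks failed; prod promotion blocked")
    for cmd in dbt_commands("prod"):
        run(cmd)
```

In a real setup this gate usually lives in CI/CD rather than a local script, but the logic is the same: prod commands only execute after every stage command exits cleanly.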
Example 3: Secrets, accounts, and naming
- Secrets store: separate paths per env, e.g., secret/data/dev/... vs secret/data/prod/...
- Service accounts: sa-etl-dev, sa-etl-stage, sa-etl-prod, each with least privilege.
- Naming strategy for buckets/tables: raw_dev, raw_stage, raw.
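These naming rules are easy to centralize in small helpers so no pipeline hand-builds a path. The exact secret paths and prefixes below are assumptions following Example 3, not a fixed standard.

```python
# Sketch of env-scoped naming helpers: secret paths, service accounts,
# and the "prod drops the suffix" table convention from Example 3.
def secret_path(env, name):
    return f"secret/data/{env}/{name}"

def service_account(env):
    return f"sa-etl-{env}"

def table_name(base, env):
    # prod uses the bare name; other environments get an explicit suffix
    return base if env == "prod" else f"{base}_{env}"
```

Centralizing these rules means a rename (or a new environment) is a one-line change instead of a hunt through every DAG.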
Deployment readiness checklist
- Config injection: environment variables or config files select connections and destinations.
- Secrets: stored outside code; rotated per environment.
- Data: stage has production-like schemas and masked records.
- Observability: logs, metrics, and alerts labeled by environment.
- Rollback: a clear path to revert to the previous artifact/tag in prod.
Exercises
Exercise 1 — Design environment-aware config
Create a minimal config that parameterizes connections, storage, and output tables for dev, stage, and prod. Use keys: src_conn, dst_conn, bucket, table_suffix, and retries.
When done, self-check:
- The artifact (code) remains identical across environments.
- No secret values appear in the file; only references or connection IDs.
- Prod has stricter settings (more retries, SLAs).
Exercise 2 — Plan a safe promotion
Draft a promotion plan from dev to stage to prod for a new pipeline that loads orders data. Include: gates (tests/approvals), observability checks, and rollback steps.
When done, self-check:
- Stage validation uses production-like schemas and masked data.
- Promotion uses a single built artifact (tag) through all envs.
- Rollback is specific and quick (revert tag, disable new DAG run).
Common mistakes and how to self-check
- Branching drift: building different binaries for each environment. Fix: build once; promote the same tag.
- Hard-coded secrets: credentials inside code. Fix: use secret manager or orchestrator connections.
- Stage unlike prod: toy schemas or volumes in stage. Fix: mirror schemas; mask or subset data carefully.
- Skipping gates: promoting without tests. Fix: enforce CI/CD checks and approvals.
- No rollback: unclear reversion steps. Fix: document revert-to-previous-tag and disable/stop procedures.
Practical projects
- Project 1: Convert a single-environment DAG into a dev/stage/prod setup with per-env connections and buckets.
- Project 2: Implement a canary release for a transformation job (run in prod on 5% of partitions before 100%).
- Project 3: Add data quality tests in stage and block promotion on failure; include rollback to previous artifact.
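For Project 2, one way to pick the canary subset is a deterministic hash of each partition key, so reruns always select the same ~5% before widening to 100%. The threshold and partition naming below are assumptions for illustration.

```python
# Sketch: deterministic canary selection over partitions via a stable hash.
import hashlib

def in_canary(partition_key, percent=5):
    digest = hashlib.sha256(partition_key.encode()).hexdigest()
    return int(digest, 16) % 100 < percent

# Example: daily partitions; only the canary subset runs first
partitions = [f"2024-01-{day:02d}" for day in range(1, 31)]
canary = [p for p in partitions if in_canary(p)]
```

Hashing beats random sampling here because the selection is reproducible: the same partitions are canaries on every run, which makes monitoring and rollback comparisons meaningful.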
Learning path
- Introduce runtime configuration (env vars, YAML) and secrets separation.
- Add per-environment connections and resource names.
- Create stage with production-like schemas and masked data.
- Set up CI checks: unit tests, linting, dbt/SQL tests, and dry-runs.
- Add promotion gates, approvals, and rollback playbooks.
- Introduce canary/blue-green rollouts and runtime monitoring.
Next steps
- Automate your promotion pipeline so that passing stage tests automatically opens a prod deployment that still requires approval.
- Standardize naming for resources across environments.
- Document your rollback procedures and test them quarterly.
Mini challenge
Your team wants to change a table schema (add a nullable column). Describe, in 5–7 bullet points, how you would roll this out across dev, stage, and prod with zero downtime and a rollback plan.