Who this is for
ETL Developers and Data Engineers who need reliable Dev/Stage/Prod environments to build, test, and release data pipelines without breaking production.
Prerequisites
- Basic ETL pipeline knowledge (sources, transforms, targets)
- Comfort using a code repo, environment variables, and a secrets store
- Familiarity with one orchestration tool (e.g., Airflow, ADF, Prefect) and one warehouse (e.g., Snowflake, BigQuery, Redshift)
Why this matters
Real tasks you will face:
- Spin up a Dev environment to safely test a new transformation.
- Promote a pipeline to Stage with realistic data volumes and credentials.
- Schedule in Prod with different windows, quotas, and access controls.
- Rotate secrets without code changes or outages.
- Roll back quickly if a release causes bad data.
Concept explained simply
Dev, Stage, Prod are separate lanes for the same car (your code artifact). The car stays the same; only the road signs change: credentials, endpoints, resource sizes, schedules, and feature flags. Your goal: same code, different configuration per lane.
Mental model
- Immutable code artifact: built once, promoted across environments.
- Mutable configuration: injected per environment at deploy/run time.
- Strict separation: no Dev secrets in Stage/Prod; no Prod data paths in Dev.
Pro tip: The 12-factor-style approach
- Store configuration in the environment or config files, not in code.
- Keep secrets in a managed secrets store.
- Build once, promote the same artifact.
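A minimal sketch of this pattern in Python. Names such as APP_ENV and get_secret are illustrative stand-ins, not a specific library:

import os

# The active environment is chosen at deploy/run time; the code never changes.
APP_ENV = os.environ.get("APP_ENV", "DEV")  # DEV | STAGE | PROD

# Non-secret settings come from a per-environment mapping or config file.
TARGET_SCHEMA = {
    "DEV": "analytics_dev",
    "STAGE": "analytics_stage",
    "PROD": "analytics",
}[APP_ENV]

# Secrets are referenced by key and resolved from a managed store at runtime.
# get_secret() stands in for your secrets-store client (Vault, AWS Secrets
# Manager, Azure Key Vault, ...); here it simply reads an injected env var.
def get_secret(key: str) -> str:
    return os.environ.get(key, "")  # empty if not injected (sketch only)

warehouse_password = get_secret("SNOWFLAKE_PASSWORD")
print(f"env={APP_ENV}, target schema={TARGET_SCHEMA}")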
Six core rules for reliable environments
- One codebase, many configs: parameterize everything that varies by environment.
- No shared credentials across environments: unique service principals/keys.
- Separate data paths: different schemas/buckets for each environment.
- Consistent DAG/job naming pattern with environment suffixes/prefixes (e.g., ingest_customers_dev).
- Feature flags for risky or expensive logic (e.g., enable_backfill=false in Dev).
- Automated promotion: deploy the same artifact to Stage, then Prod, with approvals and smoke tests.
What to parameterize per environment
- Connection strings and secrets (sources, warehouse, message queues)
- Data locations (buckets, containers, database schemas)
- Schedules and concurrency limits
- Resource sizes (cluster size, node type, worker count)
- Feature flags (backfills, alerts, optional transforms)
- Access control (roles, service accounts)
Worked examples
Example 1: Airflow DAG using env-specific Variables and Connections
- Store the active environment in an Airflow Variable (e.g., env = DEV/STAGE/PROD).
- Look up connection IDs and schema names from a config mapping keyed by that environment.
- Toggle a backfill flag to prevent heavy Dev runs (see the DAG sketch after the mapping below).
Sample configuration mapping:
{
  "DEV": {
    "src_postgres_conn_id": "pg_dev",
    "warehouse_conn_id": "snowflake_dev",
    "target_schema": "analytics_dev",
    "enable_backfill": false,
    "schedule": "@hourly"
  },
  "STAGE": {
    "src_postgres_conn_id": "pg_stage",
    "warehouse_conn_id": "snowflake_stage",
    "target_schema": "analytics_stage",
    "enable_backfill": true,
    "schedule": "0 * * * *"
  },
  "PROD": {
    "src_postgres_conn_id": "pg_prod",
    "warehouse_conn_id": "snowflake_prod",
    "target_schema": "analytics",
    "enable_backfill": true,
    "schedule": "*/15 * * * *"
  }
}
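A minimal DAG sketch in Python, assuming Airflow 2.4+ and that the mapping above is stored as a JSON Airflow Variable named pipeline_config, with the active environment in a Variable named env. Names and the task body are illustrative:

import pendulum
from airflow import DAG
from airflow.models import Variable
from airflow.operators.python import PythonOperator

# Fetched at DAG-parse time; acceptable for a small, rarely changing config.
ENV = Variable.get("env", default_var="DEV")  # DEV | STAGE | PROD
CONFIG = Variable.get("pipeline_config", deserialize_json=True)[ENV]

def extract_and_load():
    # Placeholder: real code would use CONFIG["src_postgres_conn_id"] and
    # CONFIG["warehouse_conn_id"] via hooks to move data into the target schema.
    print(f"Loading into {CONFIG['target_schema']} via {CONFIG['warehouse_conn_id']}")

with DAG(
    dag_id=f"ingest_customers_{ENV.lower()}",
    schedule=CONFIG["schedule"],
    start_date=pendulum.datetime(2024, 1, 1, tz="UTC"),
    catchup=CONFIG["enable_backfill"],  # backfill flag gates catch-up runs
    tags=[ENV.lower()],
) as dag:
    PythonOperator(task_id="extract_and_load", python_callable=extract_and_load)

The same file deploys unchanged to every environment; only the two Variables differ per Airflow instance.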
Example 2: dbt profiles.yml with per-environment targets
- One profile with targets dev, stage, and prod.
- Use dbt's env_var() function to inject secrets from environment variables at runtime.
- Schema naming: analytics_dev, analytics_stage, and analytics (prod).
Sample profiles.yml:
my_project:
  target: dev
  outputs:
    dev:
      type: snowflake
      account: "{{ env_var('SNOWFLAKE_ACCOUNT') }}"
      user: "{{ env_var('SNOWFLAKE_USER') }}"
      password: "{{ env_var('SNOWFLAKE_PASSWORD') }}"
      role: ANALYTICS_DEV
      database: ANALYTICS
      warehouse: DEV_WH
      schema: analytics_dev
    stage:
      type: snowflake
      account: "{{ env_var('SNOWFLAKE_ACCOUNT') }}"
      user: "{{ env_var('SNOWFLAKE_USER') }}"
      password: "{{ env_var('SNOWFLAKE_PASSWORD') }}"
      role: ANALYTICS_STAGE
      database: ANALYTICS
      warehouse: STAGE_WH
      schema: analytics_stage
    prod:
      type: snowflake
      account: "{{ env_var('SNOWFLAKE_ACCOUNT') }}"
      user: "{{ env_var('SNOWFLAKE_USER') }}"
      password: "{{ env_var('SNOWFLAKE_PASSWORD') }}"
      role: ANALYTICS
      database: ANALYTICS
      warehouse: PROD_WH
      schema: analytics
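At run time, export the SNOWFLAKE_* variables and select the environment with dbt's --target flag (for example, dbt run --target stage); the same project code then builds into the stage schema with stage credentials, with no edits to the models.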
Example 3: Azure Data Factory with Key Vault per environment
- Create three ADF instances (one per environment), or a single factory with parameterized Linked Services.
- Use separate Key Vaults: kv-dev, kv-stage, kv-prod.
- Parameterize Linked Service JSON to pull secrets from the active vault.
Sample linked service parameterization:
{
  "name": "AzureSqlDatabase_ls",
  "properties": {
    "type": "AzureSqlDatabase",
    "parameters": {
      "db_name": { "type": "String" },
      "kv_name": { "type": "String" }
    },
    "typeProperties": {
      "connectionString": "...;Initial Catalog=@{linkedService().db_name};...",
      "password": {
        "type": "AzureKeyVaultSecret",
        "store": {
          "referenceName": "@{linkedService().kv_name}",
          "type": "LinkedServiceReference"
        },
        "secretName": "sql-password"
      }
    }
  }
}
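For promotion, a common pattern is to export the factory as an ARM template and supply environment-specific values (the Key Vault name, database name, and similar) through per-environment parameter files at deployment time, so the pipeline and linked service JSON stay identical from Stage to Prod.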
Design steps: from blank slate to reliable environments
- Inventory what varies: list all endpoints, credentials, paths, schedules, resource sizes.
- Choose a config mechanism: environment variables, config files (YAML/JSON), or orchestrator variables.
- Centralize secrets in a managed store; reference them in config by key.
- Define naming patterns for schemas/buckets and job IDs per environment.
- Add feature flags for risky or costly tasks (e.g., enable_backfill, enable_alerts).
- Create promotion checks: automated tests, data smoke tests, and rollbacks.
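A minimal smoke-test sketch in Python, assuming a DB-API style connection (e.g., snowflake-connector or psycopg2) and an illustrative loaded_at timestamp column stored in UTC; table names and thresholds are examples:

from datetime import datetime, timedelta, timezone

def smoke_test(conn, schema: str, table: str, min_rows: int = 1, max_lag_hours: int = 2) -> None:
    """Fail fast if a freshly deployed pipeline produced no data or stale data."""
    cur = conn.cursor()

    # 1) Row-count check: the table should not be empty after the first run.
    cur.execute(f"SELECT COUNT(*) FROM {schema}.{table}")
    row_count = cur.fetchone()[0]
    assert row_count >= min_rows, f"{schema}.{table}: expected >= {min_rows} rows, got {row_count}"

    # 2) Freshness check: the newest record should be recent enough.
    cur.execute(f"SELECT MAX(loaded_at) FROM {schema}.{table}")
    latest = cur.fetchone()[0]
    if latest.tzinfo is None:  # assume UTC if the column is stored without a timezone
        latest = latest.replace(tzinfo=timezone.utc)
    lag = datetime.now(timezone.utc) - latest
    assert lag <= timedelta(hours=max_lag_hours), f"{schema}.{table}: data is {lag} old"

    print(f"Smoke test passed for {schema}.{table}: {row_count} rows, lag {lag}")

Run it right after a deployment to Stage (for example, smoke_test(conn, "analytics_stage", "customers")) and block promotion to Prod until it passes.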
Checklist: Ready for Stage?
- Code artifact is the same as Dev build.
- Stage credentials and roles are distinct and valid.
- Data paths point to stage-specific storage/schema.
- Schedules adjusted to moderate load.
- Smoke tests defined: row counts, freshness, null checks.
- Rollback plan documented.
Common mistakes and self-check
- Hardcoding endpoints in code. Self-check: Can you switch environments without editing code?
- Reusing Prod credentials in Dev. Self-check: Do service principals and keys differ per environment?
- Shared data paths. Self-check: Do schemas/buckets include env suffix/prefix?
- Promotion by rebuilding code. Self-check: Is the artifact identical across Stage/Prod?
- No smoke tests. Self-check: Do you have 2–3 automatic data validations per job?
- Lack of rollback. Self-check: Can you disable a release and revert config in minutes?
Practical projects
- Project 1: Convert an existing pipeline to use environment-based config with a secrets store. Add at least two feature flags.
- Project 2: Implement a promote-to-stage workflow with a smoke test DAG/job that validates row counts and freshness.
- Project 3: Build a blue/green config toggle for a target schema and practice flipping between them safely.
Practice: Exercises
Do these now. Your answers can be simple text/YAML. A sample solution is available for each.
Exercise 1: Author a three-environment config
Create a single config file that defines Dev/Stage/Prod parameters for a pipeline that reads Postgres and writes to Snowflake. Include: connection names, target schema, schedule, and a backfill flag. Keep secrets referenced via keys, not hardcoded.
Tip: Include naming patterns
- Schemas: analytics_dev, analytics_stage, analytics
- Connections: pg_dev/pg_stage/pg_prod and snowflake_dev/snowflake_stage/snowflake_prod
Exercise 2: Promotion and rollback plan
Write a short, step-by-step plan for promoting the same artifact from Dev to Stage, then Prod. Include: approvals, smoke tests, feature flags to toggle, and a rollback path.
Tip: Keep it concise
- 3–7 steps per environment
- Clearly state success criteria to proceed
Exercise checklist
- Parameters vary by environment without code edits
- No secrets are stored in the file; only references/keys
- Promotion steps include smoke tests and rollback
Learning path
- Start: Solidify parameterization with config files/env vars.
- Then: Add secrets management and rotate a secret without code changes.
- Next: Implement automated smoke tests and feature flags.
- Finally: Create an approval-based promotion flow and rollback routine.
Next steps
- Apply the exercises to one real pipeline at work or in a lab project.
- Add one more environment-specific constraint (e.g., stricter concurrency in Prod).
- Document your environment matrix in your repo to help teammates.
Mini challenge
In your current or sample project, add a single toggle enable_backfill. Keep it off in Dev/Stage and on in Prod. Prove it works by showing the scheduled run configuration per environment.