
Reusable Modules And Standards

Learn Reusable Modules And Standards for free with explanations, exercises, and a quick test (for Data Platform Engineers).

Published: January 11, 2026 | Updated: January 11, 2026

Why this matters

As a Data Platform Engineer, you repeatedly provision similar building blocks: VPCs, IAM roles, S3 data lakes, Kafka topics, Databricks workspaces, Airflow clusters, and monitoring. Without reusable modules and standards, each team does it differently—leading to drift, security gaps, and slow delivery. Reusable modules let you ship secure, consistent infrastructure quickly across environments (dev, stage, prod) and projects.

  • Real tasks: create a secure S3 data lake with encryption and lifecycle; standardize Kafka topics; enforce tags for cost and lineage; roll out a new data platform to multiple regions.
  • Outcome: faster provisioning, fewer mistakes, easier audits, and predictable upgrades.

Note: The Quick Test is available to everyone; only logged-in users get saved progress.

Concept explained simply

Think of IaC modules as Lego blocks. Each block does one thing well (e.g., a secure S3 bucket). Standards are the rules that make blocks compatible: naming, tags, variables, outputs, and versioning. With consistent blocks and rules, anyone can assemble a reliable platform.

Mental model

  • Interface: clear inputs (variables) and outputs (references) like a function signature.
  • Contract: documented behavior, defaults, and constraints.
  • Versioned: changes are tracked using semantic versioning (MAJOR.MINOR.PATCH).
  • Portable: environment differences handled via inputs, not copy-paste.

Standards to adopt

  • Naming: predictable resource names, e.g., <org>-<platform>-<env>-<component>. Use lowercase letters and hyphens (see the validation sketch after this list).
  • Tags/labels: enforce in modules (owner, cost_center, env, data_classification, system).
  • Security defaults: encryption at rest, least-privilege IAM, private networking where possible.
  • Inputs/outputs: minimal, explicit variables; sensible secure defaults; clear outputs.
  • Versioning: pin module and provider versions; use semantic versioning.
  • Structure: keep modules/ (reusable) separate from stacks/ or envs/ (instantiations).
  • Testing: validate, lint, and plan examples before release.
  • Docs: README with inputs, outputs, examples, and change log.
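
To make the naming and tag standards enforceable rather than advisory, a module can validate its own inputs. A minimal sketch using Terraform variable validation (assuming a recent Terraform version); the regex and the list of required tag keys are illustrative, not a fixed policy:

# modules/s3_data_bucket/variables.tf (illustrative validation additions)
variable "name" {
  type        = string
  description = "Resource name following <org>-<platform>-<env>-<component>"

  validation {
    condition     = can(regex("^[a-z0-9]+(-[a-z0-9]+){3,}$", var.name))
    error_message = "Name must be lowercase and hyphen-separated, e.g., acme-data-dev-raw."
  }
}

variable "tags" {
  type = map(string)

  validation {
    condition = alltrue([
      for key in ["owner", "cost_center", "env", "data_classification", "system"] :
      contains(keys(var.tags), key)
    ])
    error_message = "Tags must include owner, cost_center, env, data_classification, and system."
  }
}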

Worked examples

Example 1: Terraform module for a secure S3 data bucket

# modules/s3_data_bucket/variables.tf
variable "name" { type = string }
variable "tags" { type = map(string) }
variable "versioning" { type = bool default = true }
variable "lifecycle_days_to_glacier" { type = number default = 90 }

# modules/s3_data_bucket/main.tf
resource "aws_s3_bucket" "this" {
  bucket = var.name
  tags   = var.tags
}

resource "aws_s3_bucket_versioning" "this" {
  bucket = aws_s3_bucket.this.id
  versioning_configuration { status = var.versioning ? "Enabled" : "Suspended" }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "this" {
  bucket = aws_s3_bucket.this.id
  rule { apply_server_side_encryption_by_default { sse_algorithm = "AES256" } }
}

resource "aws_s3_bucket_lifecycle_configuration" "this" {
  bucket = aws_s3_bucket.this.id
  rule {
    id     = "transition-to-glacier"
    status = "Enabled"
    transition { days = var.lifecycle_days_to_glacier storage_class = "GLACIER" }
  }
}

# modules/s3_data_bucket/outputs.tf
output "bucket_id" { value = aws_s3_bucket.this.id }
# envs/dev/data_bucket.tf
module "data_bucket" {
  source = "../modules/s3_data_bucket"
  name   = "acme-data-dev-raw"
  tags = {
    owner = "data-platform"
    env   = "dev"
    system = "data-lake"
    cost_center = "dwh"
    data_classification = "internal"
  }
}

# envs/prod/data_bucket.tf
module "data_bucket" {
  source = "../modules/s3_data_bucket"
  name   = "acme-data-prod-raw"
  tags = {
    owner = "data-platform"
    env   = "prod"
    system = "data-lake"
    cost_center = "dwh"
    data_classification = "confidential"
  }
  lifecycle_days_to_glacier = 30
}

Example 2: Enforcing standard tags via locals

# modules/_standards/tags.tf
variable "env" { type = string }
variable "owner" { type = string default = "data-platform" }
variable "system" { type = string }
variable "extra_tags" { type = map(string) default = {} }

locals {
  required = {
    owner  = var.owner
    env    = var.env
    system = var.system
  }
  tags = merge(local.required, var.extra_tags)
}

output "tags" { value = local.tags }
# usage inside another module
module "std_tags" {
  source = "../_standards"
  env    = var.env
  system = "data-lake"
  extra_tags = { cost_center = "dwh", data_classification = "internal" }
}

resource "aws_kms_key" "lake" {
  description = "Data lake key"
  tags        = module.std_tags.tags
}

Example 3: Reusable Kafka topic module

# modules/kafka_topic/variables.tf
variable "name" { type = string }
variable "partitions" { type = number default = 3 }
variable "replication" { type = number default = 3 }
variable "retention_ms" { type = number default = 604800000 } # 7 days

# modules/kafka_topic/main.tf
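# Assumes a Kafka provider that exposes a kafka_topic resource
# (for example, a community Kafka provider); adjust to the provider you actually use.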
resource "kafka_topic" "this" {
  name               = var.name
  partitions         = var.partitions
  replication_factor = var.replication
  config = {
    "retention.ms" = tostring(var.retention_ms)
    "cleanup.policy" = "delete"
  }
}

output "topic_name" { value = kafka_topic.this.name }
# envs/prod/streaming.tf
module "orders_topic" {
  source     = "../modules/kafka_topic"
  name       = "acme-orders-prod"
  partitions = 12
  retention_ms = 2592000000 # 30 days
}

How to structure your repo

iac/
  modules/
    s3_data_bucket/
      main.tf
      variables.tf
      outputs.tf
      README.md
    kafka_topic/
      main.tf
      variables.tf
      outputs.tf
      README.md
    _standards/
      tags.tf
  envs/
    dev/
      main.tf
      data_bucket.tf
    prod/
      main.tf
      data_bucket.tf
  providers.tf
  versions.tf

  • modules/: reusable building blocks
  • envs/: instantiations per environment
  • Pin provider and module versions in versions.tf to ensure repeatable builds (see the sketch below)
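
A minimal versions.tf sketch; the version constraints shown are illustrative placeholders, not recommendations for specific releases:

# versions.tf
terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}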

Versioning and compatibility

  • MAJOR: breaking changes (rename variables, remove outputs, different default behavior that breaks plans)
  • MINOR: backward-compatible features (new optional vars, new outputs)
  • PATCH: fixes with no interface change

Rules: avoid breaking outputs/variables; when unavoidable, release a new major and provide a migration note. Pin versions in envs to avoid surprise upgrades.
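
How you pin a module version depends on how it is distributed: registry-hosted modules take a version argument with a constraint, while Git-hosted modules are pinned with a ?ref= tag. A sketch assuming a Git-hosted module; the repository URL and tag are placeholders:

# envs/prod/data_bucket.tf (pinned module source)
module "data_bucket" {
  # Pinned to a release tag so upgrades are explicit; URL and tag are placeholders.
  source = "git::https://example.com/acme/terraform-modules.git//s3_data_bucket?ref=v1.2.0"

  name = "acme-data-prod-raw"
  tags = {
    owner               = "data-platform"
    env                 = "prod"
    system              = "data-lake"
    cost_center         = "dwh"
    data_classification = "confidential"
  }
}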

Testing and validation of modules

  • Validate: terraform validate on modules and examples
  • Lint: static checks for naming, deprecated fields, and style
  • Plan examples: keep an examples/ folder per module; run terraform plan before releasing (see the sketch after the checklist below)
  • Smoke deploy: for critical modules, deploy to a sandbox and destroy

Release checklist
  • Update README with inputs/outputs/examples
  • Run validate and lint
  • Plan examples with pinned providers
  • Tag version (e.g., v0.3.0)
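
A minimal sketch of a per-module example configuration that terraform validate and terraform plan can run against before each release; paths and values are illustrative:

# modules/s3_data_bucket/examples/basic/main.tf
module "example" {
  source = "../.."

  name = "acme-data-dev-example"
  tags = {
    owner               = "data-platform"
    env                 = "dev"
    system              = "data-lake"
    cost_center         = "dwh"
    data_classification = "internal"
  }
}

Running terraform init, validate, and plan inside examples/basic exercises the module's interface exactly as a consumer would.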

Security and policy integration

  • Make secure the default: encryption, private networking, least privilege
  • Policy-as-code: design modules to pass organization policies (e.g., required tags, blocked public buckets)
  • Expose safe toggles only; avoid exposing raw, risky flags by default
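
One way to make "secure by default" concrete in the S3 module from Example 1 is to bake the control into the module and expose no toggle for it. A sketch assuming the AWS provider's public access block resource; the file name is illustrative:

# modules/s3_data_bucket/security.tf (illustrative)
# Public access is blocked unconditionally inside the module rather than
# exposed as an input, so consumers cannot accidentally open the bucket.
resource "aws_s3_bucket_public_access_block" "this" {
  bucket = aws_s3_bucket.this.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}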

Who this is for

  • Data Platform Engineers building repeatable data infrastructure
  • Data Engineers owning pipelines but needing consistent infra patterns
  • Platform/SRE partners standardizing cloud resources

Prerequisites

  • Basic Terraform or equivalent IaC knowledge (resources, variables, outputs)
  • Familiarity with your cloud provider’s core services (networking, storage, IAM)
  • CLI access to a sandbox account

Learning path

  1. Wrap a single resource into a minimal module
  2. Add standards (naming, tags, security defaults)
  3. Introduce versioning and examples
  4. Create environment stacks and pin versions
  5. Add tests and validation to your workflow

Exercises

Complete these in a sandbox account. Keep your code under version control.

Exercise 1 — Secure S3 data bucket module

Goal: Create a reusable module that provisions a secure S3 bucket with versioning, encryption, lifecycle, and standard tags. Instantiate it for dev and prod.

  • Requirements:
    • Inputs: name, env, data_classification, extra_tags (map)
    • Defaults: versioning on, encryption on (AES256), lifecycle to GLACIER after 90 days
    • Outputs: bucket_id
    • Enforce tags: owner=data-platform, env, system=data-lake, plus extra_tags
  • Deliverables:
    • Module code
    • Two env instantiations (dev, prod) with different names and data_classification

Starter checklist
  • Create modules/s3_data_bucket with variables.tf, main.tf, outputs.tf
  • Add a small README listing inputs/outputs
  • Run terraform validate and plan

Exercise 2 — Standard interface and version pinning

Goal: Publish a simple standards module that returns required tags and show how to pin and use a module version from envs.

  • Requirements:
    • Create modules/_standards that composes required and extra tags
    • Expose inputs: env, system, owner (default), extra_tags
    • Tag a version (e.g., v0.1.0); reference the module locally via a source path and note the intended tag in a comment
  • Deliverables:
    • One env stack that calls _standards and applies tags to a resource
    • versions.tf pinning provider version

Exercise tips
  • Keep module inputs minimal and clear
  • Provide sensible defaults for security
  • Use locals to merge tags

Common mistakes

  • Too many inputs: leads to confusion. Self-check: can I remove or default any input?
  • Leaking provider details: keep interfaces cloud-agnostic when possible
  • No version pinning: unexpected upgrades. Self-check: do envs pin both provider and module versions?
  • Copy-paste per environment: prefer inputs and small deltas via variables (see the tfvars sketch after this list)
  • Missing required tags: enforce in modules, not in envs
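
One way to avoid per-environment copy-paste is a single stack definition plus small per-environment variable files, combined with separate state per environment (for example, Terraform workspaces or per-environment backends). A sketch; the stack path, variable names, and values are illustrative:

# envs/stack/main.tf (one definition, parameterized per environment)
variable "env" { type = string }
variable "data_classification" { type = string }
variable "lifecycle_days_to_glacier" {
  type    = number
  default = 90
}

module "data_bucket" {
  source = "../../modules/s3_data_bucket"
  name   = "acme-data-${var.env}-raw"
  tags = {
    owner               = "data-platform"
    env                 = var.env
    system              = "data-lake"
    cost_center         = "dwh"
    data_classification = var.data_classification
  }
  lifecycle_days_to_glacier = var.lifecycle_days_to_glacier
}

# envs/stack/prod.tfvars
env                       = "prod"
data_classification       = "confidential"
lifecycle_days_to_glacier = 30

Running terraform plan -var-file=prod.tfvars then applies the prod delta without duplicating the stack.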

Practical projects

  • Data lake foundation: modules for raw/curated buckets, KMS keys, access roles, and Athena/Glue configuration
  • Streaming backbone: modules for Kafka topics with standardized retention and compaction policies
  • Workspace bootstrap: module for a Databricks/Airflow environment with IAM roles, logs, metrics, and tags

Mini challenge

Create a module bundle that provisions a data ingestion path: source bucket + KMS + IAM role with restricted access. Expose only three inputs: env, system, and data_classification. Ensure all resources inherit your standard tags and security defaults. Run validate and plan for dev and prod.

Next steps

  • Harden modules by adding validation rules and preconditions (see the sketch after this list)
  • Add example folders and run plans before every release
  • Extend standards to include logging, metrics, and backup policies
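
A sketch of hardening the bucket resource from Example 1 with a precondition, assuming Terraform 1.5+ (for preconditions and strcontains); the specific check, that the bucket name contains the env tag, is illustrative:

# modules/s3_data_bucket/main.tf (illustrative addition)
resource "aws_s3_bucket" "this" {
  bucket = var.name
  tags   = var.tags

  lifecycle {
    precondition {
      condition     = strcontains(var.name, lookup(var.tags, "env", ""))
      error_message = "Bucket name must include the env tag, e.g., acme-data-dev-raw for env = dev."
    }
  }
}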

Practice Exercises

2 exercises to complete

Instructions

Create modules/s3_data_bucket with variables for name, env, data_classification, and extra_tags. Defaults: versioning on, AES256 encryption, lifecycle to GLACIER after 90 days. Output bucket_id. Instantiate in envs/dev and envs/prod with different names and classifications. Enforce standard tags inside the module using env/system/owner plus extra_tags.

  • Run terraform validate and plan for both envs
  • Keep code DRY and documented

Expected Output
A reusable module that provisions an S3 bucket with encryption, versioning, lifecycle, and enforced tags; two successful plans for dev and prod with different names and tag values.

Reusable Modules And Standards — Quick Test

Test your knowledge with 8 questions. Pass with 70% or higher.

