Infrastructure As Code

Learn Infrastructure as Code for Platform Engineers for free: roadmap, examples, subskills, and a skill exam.

Published: January 23, 2026 | Updated: January 23, 2026

What you’ll learn and why it matters

Infrastructure as Code (IaC) lets Platform Engineers define, version, review, and automate cloud resources using code. It reduces manual errors, speeds up delivery, and makes environments reproducible across dev, stage, and prod.

  • Spin up consistent environments on demand.
  • Use pull requests, code reviews, and CI/CD for infra changes.
  • Bake in security and compliance with policies.
  • Detect and remediate drift quickly.
  • Enable teams with reusable, standards-compliant modules.

Who this is for

  • Platform Engineers building and maintaining shared cloud platforms.
  • Backend Engineers owning service infrastructure.
  • SREs seeking predictable, automated environments.

Prerequisites

  • Basic cloud knowledge (e.g., compute, networking, IAM concepts).
  • Git fundamentals: branching, PRs, code review.
  • CLI comfort (shell, environment variables).
  • Optional: CI/CD basics to run plans and applies safely.

Learning path

1) Terraform core

  • Install Terraform; learn providers, resources, variables, outputs, state.
  • Use workspaces or directory layout for environments.
  • Run init/plan/apply/destroy and interpret outputs.

2) Reusable modules and standards

  • Create modules with clear inputs/outputs.
  • Adopt naming, tagging, and file structure conventions.
  • Version modules; add examples and READMEs.
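
Tagging and naming conventions are easier to keep when they are set once rather than repeated per resource. A minimal sketch, assuming the AWS provider; the `name_prefix` local, the region, and the bucket name are illustrative assumptions, not required conventions:

```hcl
variable "env" {
  type = string
}

variable "owner" {
  type = string
}

locals {
  # Hypothetical naming convention: <org>-<env>-<purpose>
  name_prefix = "acme-${var.env}"
}

provider "aws" {
  region = "us-east-1" # assumed region for this sketch

  # default_tags are applied to resources from this provider that support tags.
  default_tags {
    tags = {
      env   = var.env
      owner = var.owner
    }
  }
}

resource "aws_s3_bucket" "artifacts" {
  bucket = "${local.name_prefix}-artifacts"
}
```

Resource-level `tags` can still add purpose-specific keys on top of the defaults.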

3) Environments: dev, stage, prod

  • Separate state and config per environment.
  • Use variable files or HCP Terraform (formerly Terraform Cloud) workspaces.
  • Promote changes from dev → stage → prod via PRs.
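
Separate state per environment usually comes down to one backend definition per environment directory. A minimal sketch, assuming an S3 backend; the file path, bucket name, and key are hypothetical:

```hcl
# envs/dev/backend.tf (hypothetical path, bucket, and key)
terraform {
  backend "s3" {
    bucket  = "acme-tf-state"
    key     = "dev/platform/terraform.tfstate" # prod would use its own key or bucket
    region  = "us-east-1"
    encrypt = true
  }
}
```

Backend blocks cannot reference variables, so each environment either carries its own file like this one or supplies the differing values at init time with `terraform init -backend-config=...`.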

4) Networking and IAM as code

  • Model VPCs, subnets, routes, SGs, and peering.
  • Write least-privilege IAM roles/policies for workloads and CI.

5) Secrets and configuration

  • Keep secrets out of state when possible; mark sensitive variables.
  • Integrate with secret stores (e.g., SSM Parameter Store, Vault).
  • Template app configs with environment-specific values.
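
For non-secret settings, one way to template environment-specific values is a lookup map rendered into a parameter. A minimal sketch, assuming the AWS provider; the settings, their values, and the parameter path are hypothetical:

```hcl
variable "env" {
  type = string
}

locals {
  # Hypothetical per-environment settings; keys must match the allowed env names.
  app_settings = {
    dev   = { log_level = "debug", replicas = 1 }
    stage = { log_level = "info", replicas = 2 }
    prod  = { log_level = "info", replicas = 3 }
  }
}

resource "aws_ssm_parameter" "app_config" {
  name  = "/acme/${var.env}/app_config"
  type  = "String" # plain config only; secrets belong in SecureString or a vault
  value = jsonencode(local.app_settings[var.env])
}
```

The application reads the parameter at startup, so the same module produces a different config in each environment.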

6) Policy as Code

  • Write policies to enforce tagging, regions, and encryption.
  • Fail plans that violate guardrails before they reach prod.
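
Before adopting a dedicated policy engine, simple guardrails can be expressed with Terraform's own variable validation. This is a lighter substitute, not a replacement for policy as code. A minimal sketch; the approved regions and required tag keys are assumptions:

```hcl
variable "region" {
  type = string

  validation {
    condition     = contains(["us-east-1", "eu-west-1"], var.region)
    error_message = "Region must be one of the approved regions: us-east-1, eu-west-1."
  }
}

variable "tags" {
  type = map(string)

  validation {
    # Every required key must be present in the supplied tag map.
    condition     = alltrue([for k in ["env", "owner"] : contains(keys(var.tags), k)])
    error_message = "Tags must include the keys env and owner."
  }
}
```

A plan with a disallowed region or a missing tag key fails during validation, before any resource is evaluated.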

7) Drift detection and remediation

  • Detect drift using plans; alert on differences.
  • Codify desired state; remove manual changes.

8) Change management

  • PR-based plans with mandatory review and policy checks.
  • Apply gates: approvals, maintenance windows, change freeze rules.

Worked examples

1) Terraform basics: versioned S3 bucket with outputs
# main.tf
terraform {
  required_version = ">= 1.5.0"
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 5.0"
    }
  }
}

provider "aws" {
  region = var.region
}

resource "aws_s3_bucket" "logs" {
  bucket = var.bucket_name
  tags = {
    env     = var.env
    owner   = var.owner
    purpose = "access-logs"
  }
}

resource "aws_s3_bucket_versioning" "logs" {
  bucket = aws_s3_bucket.logs.id
  versioning_configuration { status = "Enabled" }
}

output "bucket_arn" {
  value = aws_s3_bucket.logs.arn
}

# variables.tf
variable "region" { type = string }
variable "bucket_name" { type = string }
variable "env" { type = string }
variable "owner" { type = string }

# commands
# terraform init
# terraform plan -out=tfplan -var="region=us-east-1" -var="bucket_name=acme-logs-dev" -var="env=dev" -var="owner=platform"
# terraform apply tfplan   # applies exactly the saved plan; no variables or approval prompt needed

Result: a versioned bucket with consistent tags and an output you can reuse in other modules.
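
One way to reuse the `bucket_arn` output from a separate configuration is to read it through remote state. A minimal sketch, assuming the bucket configuration stores its state in an S3 backend; the state bucket, key, and policy name are hypothetical:

```hcl
# Read outputs published by the configuration that owns the logs bucket.
data "terraform_remote_state" "logs" {
  backend = "s3"
  config = {
    bucket = "acme-tf-state"
    key    = "dev/logs/terraform.tfstate"
    region = "us-east-1"
  }
}

# Grant write access to objects in that bucket without hardcoding its ARN.
resource "aws_iam_policy" "log_writer" {
  name = "acme-dev-log-writer"
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["s3:PutObject"]
      Resource = "${data.terraform_remote_state.logs.outputs.bucket_arn}/*"
    }]
  })
}
```

Within a single configuration, the same value is available directly as a module output, for example `module.<name>.bucket_arn`.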

2) Reusable VPC module (usage example)
# modules/vpc/variables.tf
variable "name" { type = string }
variable "cidr" { type = string }
variable "az_count" { type = number }

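# modules/vpc/main.tf (simplified)
# Look up the AZs available in the provider's region so subnets spread across them.
data "aws_availability_zones" "available" {
  state = "available"
}

resource "aws_vpc" "this" {
  cidr_block = var.cidr
  tags       = { Name = var.name }
}

resource "aws_subnet" "private" {
  count                   = var.az_count
  vpc_id                  = aws_vpc.this.id
  availability_zone       = data.aws_availability_zones.available.names[count.index]
  cidr_block              = cidrsubnet(var.cidr, 4, count.index)
  map_public_ip_on_launch = false
  tags                    = { Tier = "private", Name = "${var.name}-priv-${count.index}" }
}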

output "vpc_id" { value = aws_vpc.this.id }
output "private_subnet_ids" { value = aws_subnet.private[*].id }

# envs/dev/main.tf
module "vpc" {
  source   = "../../modules/vpc"
  name     = "acme-dev"
  cidr     = "10.10.0.0/16"
  az_count = 2
}

Result: a reusable foundation you can version and promote across environments.
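
Promotion works best when each environment pins the module to a released version instead of a local path. A minimal sketch, assuming the module lives in a Git repository; the repository URL, tag, and stage values are hypothetical:

```hcl
# envs/stage/main.tf
module "vpc" {
  # ?ref pins the module to a tag, so stage and prod can upgrade independently.
  source   = "git::https://example.com/acme/terraform-modules.git//vpc?ref=v1.2.0"
  name     = "acme-stage"
  cidr     = "10.20.0.0/16"
  az_count = 2
}
```

Bumping `ref` in a PR makes the module upgrade itself a reviewable change.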

3) Least-privilege IAM role for CI to run Terraform
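# GitHub Actions authenticates through OIDC federation, not a service principal.
# var.github_oidc_provider_arn and var.github_repo (e.g. "acme/infra") are assumed inputs.
resource "aws_iam_role" "tf_ci" {
  name = "tf-ci-role"
  assume_role_policy = jsonencode({
    Version = "2012-10-17",
    Statement = [{
      Effect    = "Allow",
      Principal = { Federated = var.github_oidc_provider_arn },
      Action    = "sts:AssumeRoleWithWebIdentity",
      Condition = {
        StringEquals = { "token.actions.githubusercontent.com:aud" = "sts.amazonaws.com" },
        StringLike   = { "token.actions.githubusercontent.com:sub" = "repo:${var.github_repo}:*" }
      }
    }]
  })
}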

resource "aws_iam_policy" "tf_limited" {
  name   = "tf-ci-limited"
  policy = jsonencode({
    Version = "2012-10-17",
    Statement = [
      { Effect = "Allow", Action = ["ec2:Describe*", "s3:ListAllMyBuckets"], Resource = "*" },
      { Effect = "Allow", Action = ["s3:ListBucket"], Resource = ["arn:aws:s3:::my-tf-state"] },
      { Effect = "Allow", Action = ["s3:PutObject", "s3:GetObject"], Resource = ["arn:aws:s3:::my-tf-state/*"] }
    ]
  })
}

resource "aws_iam_role_policy_attachment" "attach" {
  role       = aws_iam_role.tf_ci.name
  policy_arn = aws_iam_policy.tf_limited.arn
}

Grant the minimal permissions needed for plans, state access, and read-only discovery.

4) Secrets handling with sensitive variables (keep secrets out of code and logs)
# variables.tf
variable "db_password" {
  type      = string
  sensitive = true # redacts the value in plan and apply output
}

# main.tf (store the secret in SSM Parameter Store)
# Note: the value is still written to Terraform state, so encrypt the
# state backend and restrict who can read it.
resource "aws_ssm_parameter" "db_password" {
  name  = "/acme/${var.env}/db_password"
  type  = "SecureString"
  value = var.db_password
}

# CLI usage (avoid typing the secret where it lands in terminal history)
# export TF_VAR_db_password="$(pbpaste)"  # macOS clipboard; or set in CI secret store
# terraform apply -var="env=dev"

Mark variables as sensitive and use a secret store. Avoid logging or outputting secrets.

5) Drift detection and remediation

Detect manual changes by running a plan regularly (in CI or on a schedule):

# steps
# 1) terraform init
# 2) terraform plan -detailed-exitcode
# Exit codes: 0 = no changes, 2 = changes present, 1 = error
# If exit code is 2, alert and open a PR to reconcile or revert manual changes.

Always codify the desired state. If something must be changed urgently, follow up with a PR that updates the code.

6) Policy as Code: deny untagged resources (OPA/Rego example)
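package terraform.tags

import rego.v1

# Input is the JSON plan produced by: terraform show -json tfplan
deny contains msg if {
  rc := input.resource_changes[_]
  rc.type == "aws_instance"
  rc.change.actions[_] != "delete"
  not rc.change.after.tags.env
  msg := sprintf("Instance %s missing tag 'env'", [rc.address])
}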

Run policy checks during plan to block resources without required tags. Start with simple rules (tags, regions, encryption) and expand.

Common mistakes and debugging tips

Mixing state across environments

Keep separate state backends or workspaces for dev/stage/prod. Name them clearly and restrict access.

Hardcoding values instead of variables

Use variables and tfvars per environment. Hardcoded values block reuse and promotion.
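
A per-environment variable file for the S3 bucket example might look like the sketch below; the file name and values are hypothetical:

```hcl
# dev.tfvars (hypothetical file name and values)
region      = "us-east-1"
bucket_name = "acme-logs-dev"
env         = "dev"
owner       = "platform"
```

Select it with `terraform plan -var-file="dev.tfvars"`; a `prod.tfvars` with different values reuses the same code unchanged.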

Leaking secrets into state or logs

Mark variables as sensitive, rely on secret stores, and avoid outputs that include secrets. Review CI logs.
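
Instead of outputting a secret, output a reference to where it is stored. A minimal sketch, assuming the AWS provider; the parameter path is hypothetical:

```hcl
variable "db_password" {
  type      = string
  sensitive = true
}

resource "aws_ssm_parameter" "db_password" {
  name  = "/acme/dev/db_password" # hypothetical path
  type  = "SecureString"
  value = var.db_password
}

# Expose the parameter name, not the secret; consumers fetch the value at runtime.
output "db_password_parameter" {
  value = aws_ssm_parameter.db_password.name
}
```

If an output must carry a sensitive value, Terraform requires `sensitive = true` on it. That redacts the value in CLI output but does not remove it from state.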

Overly permissive IAM policies

Start with read-only and add specific actions as needed. Validate with IAM Access Advisor and CI policy checks.
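
A narrowly scoped policy lists specific actions against specific resource ARNs. A minimal sketch, assuming the AWS provider; the account ID, region, parameter path, and policy name are hypothetical:

```hcl
data "aws_iam_policy_document" "read_config" {
  statement {
    sid     = "ReadServiceParameters"
    effect  = "Allow"
    actions = ["ssm:GetParameter", "ssm:GetParametersByPath"]

    # Scope to one service prefix rather than "*".
    resources = ["arn:aws:ssm:us-east-1:123456789012:parameter/acme/dev/*"]
  }
}

resource "aws_iam_policy" "read_config" {
  name   = "acme-dev-read-config"
  policy = data.aws_iam_policy_document.read_config.json
}
```

Using the `aws_iam_policy_document` data source lets `terraform validate` check the policy's structure before it reaches AWS.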

Ignoring plan warnings

Warnings often indicate deprecated arguments or potential destructive changes. Fix them before apply.

Manual hotfixes without code updates

Any manual change creates drift. Follow up with a PR that updates code or revert to the desired state.

Mini project: Three-environment microservice platform

  1. Create a modules folder with: vpc, app_role, service (compute + load balancer), and logging.
  2. Define envs/dev, envs/stage, envs/prod with separate state backends and tfvars.
  3. Provision:
    • VPC with private subnets and required routing.
    • Service module (container or VM) with health checks.
    • Least-privilege IAM role for the service to read from a secret store.
    • Centralized logs (e.g., to S3/CloudWatch) with retention policies.
  4. Add a policy that denies resources missing env and owner tags.
  5. Implement CI: on PR, run terraform fmt, validate, and plan. Require approval before apply.
  6. Demonstrate promotion: same module versions, different tfvars per environment.
  7. Simulate drift in dev, detect with plan, and remediate by updating code.

Stretch goals

  • Introduce module version pinning and a changelog.
  • Add cost tags and a budget alarm resource.
  • Create a rollback playbook for failed applies.

Next steps

  • Work through the subskills in order (basics → policies → change management).
  • Finish the mini project and keep it as a portfolio asset.
  • Take the skill exam below to validate your readiness.

Infrastructure As Code — Skill Exam

This exam checks practical understanding of Infrastructure as Code concepts for Platform Engineers: Terraform basics, modules, environments, IAM/networking, secrets, policy-as-code, drift detection, and change management.

Everyone can take this exam for free. If you are logged in, your progress and results will be saved; otherwise, you can still complete it without saving.

Tip: Aim for clear, safe, and reproducible approaches. Passing score is 70%.

10 questions, 70% to pass
