Who this is for
This lesson is for Platform Engineers and Security-minded Developers who design and operate infrastructure access. If you touch CI/CD, cloud accounts, Kubernetes, secrets, or production support, you need solid IAM and least privilege.
Prerequisites
- Basic understanding of cloud services or Kubernetes objects
- Comfort with YAML/JSON and command-line tools
- Familiarity with authentication (users, tokens, service accounts)
Why this matters
Real platform tasks depend on getting permissions right:
- Provision CI/CD runners with only the permissions they need to deploy
- Grant engineers temporary production access for an on-call incident
- Restrict a data pipeline to read logs but prevent deletion or writes
- Enable auditors to view configurations without changing anything
- Contain blast radius if a key leaks or a pod is compromised
Concept explained simply
IAM answers four questions: who can do what, to which resource, under which conditions. Least privilege means granting exactly the permissions a task requires, and only for as long as the task needs them.
Mental model
Think of doors and keys:
- Identities: people, services, workloads holding keys
- Actions: what the key can do (read, write, update, delete, admin)
- Resources: the doors (buckets, clusters, databases)
- Conditions: when and how a key works (time, IP, tag/label, environment)
- Decision: deny by default; allow only when identity, action, resource, and conditions all match an explicit allow
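The decision rule can be sketched in a few lines of Python. This is a toy evaluator, not any vendor's policy engine: the `Rule` type, the `is_allowed` function, and the example identity and resource names are all hypothetical. It also assumes that an explicit deny overrides any allow, which is how AWS IAM evaluates policies but is not universal (Kubernetes RBAC, for example, has no deny rules).

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Rule:
    """One hypothetical rule: who may (or may not) do what to which resource."""
    effect: str              # "allow" or "deny"
    identity: str
    actions: frozenset
    resource: str            # glob pattern, e.g. "logs/prod/app/*"
    conditions: tuple = ()   # pairs of (context key, required value)

    def matches(self, identity, action, resource, context):
        return (
            identity == self.identity
            and action in self.actions
            and fnmatchcase(resource, self.resource)
            and all(context.get(k) == v for k, v in self.conditions)
        )


def is_allowed(rules, identity, action, resource, context=None):
    """Default deny; an explicit deny wins; otherwise a matching allow grants access."""
    context = context or {}
    matching = [r for r in rules if r.matches(identity, action, resource, context)]
    if any(r.effect == "deny" for r in matching):
        return False
    return any(r.effect == "allow" for r in matching)


rules = [
    Rule("allow", "log-reader", frozenset({"read"}), "logs/prod/app/*", (("env", "prod"),)),
    Rule("deny", "log-reader", frozenset({"delete"}), "logs/*"),
]

print(is_allowed(rules, "log-reader", "read", "logs/prod/app/a.log", {"env": "prod"}))    # True
print(is_allowed(rules, "log-reader", "read", "logs/prod/db/a.log", {"env": "prod"}))     # False
print(is_allowed(rules, "log-reader", "delete", "logs/prod/app/a.log", {"env": "prod"}))  # False
```

The second call is denied only because no rule matches it: nothing has to forbid a request for it to fail.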
Core building blocks
- Identities: users, groups, service accounts, workload identities
- Authentication: SSO, MFA, workload federation
- Authorization: RBAC/ABAC and policy engines (JSON/YAML policy docs)
- Scope: resources and paths; avoid wildcards where possible
- Time: temporary elevation (just-in-time) and automatic expiry
- Boundaries: permission boundaries/constraints that cap maximum power
- Audit: centralized logs for allow/deny and changes
- Break-glass: emergency accounts with controls and monitoring
Worked examples
Example 1 — Read logs from a specific bucket prefix, deny destructive actions
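{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ListLogsPrefix",
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::company-logs",
      "Condition": {
        "StringLike": {"s3:prefix": ["prod/app/*"]}
      }
    },
    {
      "Sid": "ReadLogObjects",
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::company-logs/prod/app/*"
    },
    {
      "Sid": "DenyWrites",
      "Effect": "Deny",
      "Action": ["s3:DeleteObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::company-logs/*"
    }
  ]
}
Why this works
Listing and reading are separate statements because s3:prefix is a condition key of list requests only; attached to s3:GetObject, the condition would never match and every read would be denied. Reads are limited to objects under prod/app/ in one bucket, the explicit Deny on s3:DeleteObject and s3:PutObject overrides any Allow granted elsewhere, and no statement uses a wildcard across all buckets.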
Example 2 — Kubernetes RBAC: read pods and logs in one namespace
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: support-reader
  namespace: prod
rules:
  - apiGroups: [""]
    resources: ["pods", "pods/log"]
    verbs: ["get", "list"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: support-reader-binding
  namespace: prod
subjects:
  - kind: Group
    name: oncall-engineers
roleRef:
  kind: Role
  name: support-reader
  apiGroup: rbac.authorization.k8s.io
Why this works
The Role is scoped to one namespace, grants only read verbs, and is bound to a group rather than to individuals, which makes access reviews easier.
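The read-only property can also be checked mechanically. The sketch below assumes the Role manifest has already been loaded into a Python dict (for example by a YAML parser); `READ_ONLY_VERBS` and `non_read_only_rules` are hypothetical names and not part of any Kubernetes client library.

```python
# The standard Kubernetes read verbs; anything else (including "*") counts as a write.
READ_ONLY_VERBS = {"get", "list", "watch"}


def non_read_only_rules(role):
    """Return the rules of a Role dict that grant anything beyond read verbs."""
    offending = []
    for rule in role.get("rules", []):
        verbs = set(rule.get("verbs", []))
        if not verbs <= READ_ONLY_VERBS:
            offending.append(rule)
    return offending


support_reader = {
    "kind": "Role",
    "metadata": {"name": "support-reader", "namespace": "prod"},
    "rules": [
        {"apiGroups": [""], "resources": ["pods", "pods/log"], "verbs": ["get", "list"]},
    ],
}

print(non_read_only_rules(support_reader))  # []
```

A check like this looks only at verbs. Read access to sensitive resources such as secrets still deserves a manual review.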
Example 3 — CI runner service account with minimal build permissions
{
  "serviceAccount": "ci-runner",
  "allow": [
    {"action": "secrets.get", "resource": "secrets/ci/*"},
    {"action": "artifacts.write", "resource": "registry/projects/app/*"},
    {"action": "deploy.trigger", "resource": "deployments/app-prod"}
  ],
  "deny": [
    {"action": "iam.admin", "resource": "*"}
  ],
  "constraints": {
    "network": "build-vpc",
    "timeToLiveMinutes": 60
  }
}
Why this works
Grants only the actions the pipeline needs, adds a 60-minute TTL for tokens, and prevents escalation into IAM administration.
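The `timeToLiveMinutes` constraint can be pictured as an expiry check applied to every request. The sketch below is illustrative only: the `Grant` class and its fields are hypothetical and do not belong to any CI system's API, and real platforms enforce expiry inside the token service itself.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Grant:
    """Hypothetical time-boxed grant for a service account."""
    service_account: str
    issued_at: datetime
    ttl_minutes: int

    @property
    def expires_at(self):
        return self.issued_at + timedelta(minutes=self.ttl_minutes)

    def is_valid(self, now):
        # Valid from the issue time up to, but not including, the expiry instant.
        return self.issued_at <= now < self.expires_at


issued = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
grant = Grant("ci-runner", issued, ttl_minutes=60)

print(grant.is_valid(issued + timedelta(minutes=59)))  # True
print(grant.is_valid(issued + timedelta(minutes=60)))  # False
```

Passing `now` in explicitly, instead of reading the clock inside `is_valid`, keeps the check easy to test.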
Step-by-step: design least-privilege access
- Define the job to be done: list exact actions (read logs, deploy, rotate secret).
- Inventory resources: which namespaces, buckets, tables, projects.
- Map actions to resources: build an allow-list matrix; avoid "*" where possible.
- Add constraints: time-limited sessions, conditions (env=prod), network locations.
- Draft the policy: start with deny-by-default; add minimal allows; add explicit denies for risky actions.
- Test with a temporary session: run commands the role needs; confirm denies for anything extra.
- Log and review: verify audit logs show attempted overreach is blocked.
- Automate: template roles; use groups; schedule access reviews; enable just-in-time elevation.
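The "map actions to resources" and "draft the policy" steps can be sketched as a small generator that turns an allow-list matrix into policy statements. Everything here is an assumption made for illustration: `ALLOW_MATRIX`, `RISKY_ACTIONS`, and `build_statements` are hypothetical names, and the output only loosely follows the AWS-style JSON from Example 1.

```python
import json

# Hypothetical allow-list matrix: resource -> the exact actions the job needs.
ALLOW_MATRIX = {
    "arn:aws:s3:::company-logs/prod/app/*": ["s3:GetObject"],
}
# Risky actions to deny explicitly, even if another policy allows them.
RISKY_ACTIONS = ["s3:DeleteObject", "s3:PutObject"]


def build_statements(matrix, risky_actions, deny_resource):
    """Turn the matrix into Allow statements plus one explicit Deny."""
    statements = []
    for resource, actions in sorted(matrix.items()):
        for action in actions:
            # Refuse wildcard actions so the matrix stays an explicit allow-list.
            if action == "*" or action.endswith(":*"):
                raise ValueError(f"wildcard action not allowed: {action}")
        statements.append(
            {"Effect": "Allow", "Action": sorted(actions), "Resource": resource}
        )
    statements.append(
        {"Effect": "Deny", "Action": sorted(risky_actions), "Resource": deny_resource}
    )
    return statements


policy = {
    "Version": "2012-10-17",
    "Statement": build_statements(
        ALLOW_MATRIX, RISKY_ACTIONS, "arn:aws:s3:::company-logs/*"
    ),
}
print(json.dumps(policy, indent=2))
```

Keeping the matrix as data makes the later steps easier: it can be reviewed on its own, and roles can be templated from it.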
Hands-on exercise
Exercise 1: draft a least-privilege policy for a support engineer with these requirements:
- Can read production application logs in a specific path
- Can view pod logs in the prod namespace
- Cannot delete or modify any resource
- Access expires automatically after 2 hours
Write the policy snippet(s) and a short test plan.
Least-privilege checklist
- Deny-by-default is the starting point
- Actions are the smallest set needed (no wildcard verbs)
- Resource scope is narrow (namespace, path, project)
- Temporary access with TTL or session limits
- Explicit denies for destructive actions where helpful
- Boundaries/constraints prevent escalation
- Group-based bindings, not individual users
- Audit logs enabled and reviewed
Common mistakes and how to self-check
- Over-broad wildcards: search policies for "*" in Action or Resource; replace with explicit lists.
- Standing admin: replace permanent admin with just-in-time roles and approvals.
- Skipping conditions: add environment labels/tags or namespace scoping.
- Direct user bindings: bind groups instead; automate deprovisioning.
- No explicit denies: add them sparingly to block high-risk actions (delete, iam:*) that another policy or an inherited role might otherwise allow.
- Forgetting logs: ensure decision, principal, action, resource appear in centralized audit logs.
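The wildcard self-check can be automated. The following is a minimal sketch that assumes AWS-style JSON policies already parsed into Python dicts; `find_wildcards` is a hypothetical helper, and it flags only a bare "*" or a value ending in ":*", so partial wildcards such as "s3:Get*" still need a manual look.

```python
def find_wildcards(policy):
    """Return (sid, field, value) for each Allow entry that is '*' or ends in ':*'."""
    findings = []
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be given as an object
        statements = [statements]
    for index, stmt in enumerate(statements):
        if stmt.get("Effect") != "Allow":
            continue  # a broad Deny restricts access rather than granting it
        sid = stmt.get("Sid", f"statement[{index}]")
        for field in ("Action", "Resource"):
            values = stmt.get(field, [])
            if isinstance(values, str):
                values = [values]
            for value in values:
                if value == "*" or value.endswith(":*"):
                    findings.append((sid, field, value))
    return findings


policy = {
    "Statement": [
        {"Sid": "TooBroad", "Effect": "Allow", "Action": "s3:*", "Resource": "*"},
        {
            "Sid": "Scoped",
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::company-logs/prod/app/*",
        },
    ]
}

for finding in find_wildcards(policy):
    print(finding)
# ('TooBroad', 'Action', 's3:*')
# ('TooBroad', 'Resource', '*')
```

The scoped statement is not reported because its resource is limited to a single prefix, even though the ARN ends in "/*".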
Practical projects
- Access map: diagram identities, roles, and resources for one critical system.
- Role catalog: define standard read-only, deployer, and auditor roles with JSON/YAML templates.
- JIT elevation: implement a 1–2 hour admin elevation flow with approvals and logging.
- Break-glass drill: practice emergency access; validate alerts and post-incident review steps.
- Permissions boundary: add a global boundary that prevents iam:* and destructive actions outside pipelines.
Learning path
- Identity foundations: SSO, MFA, groups, provisioning/deprovisioning
- Policy design patterns: RBAC, ABAC, condition keys, boundaries
- Machine identities: workload federation and short-lived tokens
- Audit and monitoring: log routing, alerts on privilege escalation
- Secret management: rotation and scope alignment with IAM
Next steps
- Complete the exercise and compare to the sample solution
- Run the Quick Test to check your understanding
Mini challenge
Your analytics job currently has read access to all buckets and can delete objects. Redesign it so it can only read objects under