Encryption And Key Management Basics

Learn encryption and key management basics for free, with explanations, exercises, and a quick test (for Data Platform Engineers).

Published: January 11, 2026 | Updated: January 11, 2026

Why this matters

As a Data Platform Engineer, you move and store sensitive data. Encrypting data and managing keys correctly protects customers, keeps regulators happy, and prevents costly incidents. You will:

  • Enable TLS between services (ingestion, message brokers, warehouses).
  • Encrypt data at rest in object storage, data lakes, and databases.
  • Design key hierarchies and rotation plans that don’t break pipelines.
  • Control who/what can use keys and audit every key operation.

Concept explained simply

Encryption scrambles data so only someone with the right key can read it. There are two main kinds:

  • Symmetric encryption: one key to encrypt and decrypt (fast; used for large data).
  • Asymmetric encryption: key pair (public/private). Often used for key exchange and identity.

Key management is how you create, store, use, rotate, and retire keys safely—ideally in a managed Key Management Service (KMS) or Hardware Security Module (HSM).

Mental model: Envelopes and lockers

Think of data as a letter:

  • You put the letter in a small envelope and seal it with a Data Encryption Key (DEK). This is symmetric and fast.
  • You then lock that small envelope inside a bigger locker using a Key Encryption Key (KEK) kept in a secure vault (KMS/HSM). This is called envelope encryption.
  • To read the letter later, unlock the locker (unwrap the DEK with the KEK), then open the small envelope (decrypt with the DEK).

Core building blocks

  • Encryption in transit: TLS between clients, services, brokers, and databases. Use certificates, verify peers, disable weak ciphers.
  • Encryption at rest: Storage-level (volumes, object store) and application-level (you encrypt before writing). Envelope encryption is common for large files.
  • Key hierarchy: Customer Master Key (CMK/KEK) protects many DEKs. DEKs protect data objects/records.
  • Key lifecycle: generate, label/version, store in KMS/HSM, use (least privilege), rotate, retire, destroy.
  • Access control and audit: restrict who/what can use keys; log every encrypt/decrypt; alert on anomalies.
  • Authenticated encryption: use AEAD modes (e.g., AES-GCM) to detect tampering.
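
To make the audit bullet concrete, here is a minimal sketch that counts decrypt operations per caller and flags unusually active ones. The event shape (`principal`, `operation`) and the fixed threshold are assumptions made for this example; real KMS audit logs use provider-specific fields, and production alerting would more likely compare against a baseline than a constant.

```python
from collections import Counter

def flag_heavy_decrypters(events, threshold):
    """Return principals with more than `threshold` decrypt calls, sorted by name."""
    counts = Counter(e["principal"] for e in events if e["operation"] == "decrypt")
    return sorted(p for p, n in counts.items() if n > threshold)

# Hypothetical audit events; the field names are illustrative only.
events = (
    [{"principal": "svc-ingest", "operation": "decrypt"}] * 3
    + [{"principal": "svc-report", "operation": "decrypt"}] * 50
    + [{"principal": "svc-ingest", "operation": "encrypt"}] * 40
)
flagged = flag_heavy_decrypters(events, threshold=10)
print(flagged)
```

Only decrypt calls are counted here, since those are the operations that release plaintext keys or data.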

Minimum safe defaults
  • Use TLS 1.2+ everywhere; prefer TLS 1.3.
  • Use AES-256-GCM for symmetric encryption, and never reuse a nonce with the same key.
  • Store keys in a managed KMS/HSM, never in code or plain env vars.
  • Rotate KEKs yearly or on exposure; use a fresh DEK per object/file or partition.
  • Log and alert on KMS usage; block wildcards in key policies.
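
The "block wildcards" default can be enforced with a small lint step before a policy is applied. The sketch below assumes a simplified, hypothetical policy shape (`statements`, each with `principals` and `actions`); real KMS policy documents differ by provider, so treat the field names as placeholders.

```python
def find_wildcards(policy):
    """Return (statement_index, field) pairs where a grant contains '*'."""
    findings = []
    for i, stmt in enumerate(policy.get("statements", [])):
        for field in ("principals", "actions"):
            values = stmt.get(field, [])
            if isinstance(values, str):
                values = [values]  # tolerate a single string instead of a list
            if any("*" in v for v in values):
                findings.append((i, field))
    return findings

# Hypothetical policy: the second statement grants everything to everyone.
policy = {
    "statements": [
        {"principals": ["svc-ingest"], "actions": ["decrypt"]},
        {"principals": ["*"], "actions": ["*"]},
    ]
}
findings = find_wildcards(policy)
print(findings)
```

A check like this could run in CI so that a wildcard grant fails review before it reaches the KMS.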

Worked examples

1) Securing data in transit with TLS

Goal: TLS from ingestion service to broker and database.

  1. Create or obtain certificates for services. Prefer automated rotation.
  2. Enable TLS on your reverse proxy/ingress. Require modern ciphers and mutual TLS if applicable.
  3. Configure the message broker to accept TLS connections and require client auth when feasible.
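  4. Force TLS to the database. For example, connection strings often include an ssl mode parameter.

# examples (pseudo-config; variable names are illustrative)
BROKER_TLS_ENABLE=true
BROKER_TLS_CLIENT_AUTH=required
# PostgreSQL-style: sslmode=require encrypts but does not verify the server certificate.
# verify-full also checks the certificate chain and hostname (needs a trusted root CA).
DB_CONN="host=db.my.internal port=5432 sslmode=verify-full"

Why this works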

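When each side validates the other's certificate, TLS authenticates endpoints and encrypts traffic, closing sniffing and man-in-the-middle gaps between pipeline components. Encryption without certificate validation still leaves the man-in-the-middle gap open.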
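
On the client side, these settings can be sketched with Python's standard `ssl` module. The helper name `make_client_context` and its parameters are made up for this example; the `ssl` calls themselves are standard-library APIs.

```python
import ssl

def make_client_context(ca_file=None, client_cert=None, client_key=None):
    """Build a client-side TLS context that verifies the server it connects to."""
    # create_default_context() turns on certificate and hostname verification.
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse TLS 1.0 and 1.1
    if client_cert:
        # Present a client certificate when the server requires mutual TLS.
        ctx.load_cert_chain(certfile=client_cert, keyfile=client_key)
    return ctx

ctx = make_client_context()
print(ctx.verify_mode, ctx.check_hostname, ctx.minimum_version)
```

The resulting context can be handed to a client library that accepts an `ssl.SSLContext`, or used directly with `ctx.wrap_socket(sock, server_hostname=...)`.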

2) Envelope encryption for object files

Goal: Encrypt parquet files before uploading to object storage.

  1. Ask KMS to generate a random DEK, or generate locally with a CSPRNG.
  2. Encrypt the file with AES-256-GCM using the DEK. Capture nonce/IV and auth tag.
  3. Ask KMS to wrap (encrypt) the DEK with a KEK (CMK). Receive wrapped_DEK and key_id/version.
  4. Store alongside the ciphertext: algorithm, key_id/version, wrapped_DEK, nonce, auth_tag, and created_at.
{
  "algo": "AES-256-GCM",
  "kek_id": "analytics-cmk",
  "kek_version": "v3",
  "wrapped_dek_b64": "...",
  "nonce_b64": "...",
  "auth_tag_b64": "...",
  "created_at": "2026-01-11T10:00:00Z"
}

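Decryption: ask KMS to unwrap the DEK with the KEK version recorded in the metadata, then decrypt the file with the DEK and nonce. AES-GCM checks the auth_tag and fails if the ciphertext was altered.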
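
The flow above can be sketched end to end in Python. Python's standard library has no AES-GCM, so `toy_seal`, `toy_open`, and `ToyKms` are hypothetical stand-ins built from SHA-256 and HMAC purely so the example runs; they are not secure replacements. In a real pipeline you would use a vetted AES-256-GCM implementation and your provider's KMS client.

```python
import base64
import hashlib
import hmac
import secrets
from datetime import datetime, timezone

def _keystream(key, nonce, n):
    # Toy keystream (SHA-256 over key, nonce, counter). Stand-in only, NOT AES-GCM.
    blocks = (hashlib.sha256(key + nonce + i.to_bytes(8, "big")).digest()
              for i in range(n // 32 + 1))
    return b"".join(blocks)[:n]

def toy_seal(key, nonce, plaintext):
    """Stand-in for AEAD encryption: returns (ciphertext, 16-byte auth tag)."""
    ct = bytes(a ^ b for a, b in zip(plaintext, _keystream(key, nonce, len(plaintext))))
    tag = hmac.new(key, nonce + ct, hashlib.sha256).digest()[:16]
    return ct, tag

def toy_open(key, nonce, ct, tag):
    """Stand-in for AEAD decryption: check the tag before releasing any plaintext."""
    expected = hmac.new(key, nonce + ct, hashlib.sha256).digest()[:16]
    if not hmac.compare_digest(expected, tag):
        raise ValueError("authentication failed: ciphertext or metadata was altered")
    return bytes(a ^ b for a, b in zip(ct, _keystream(key, nonce, len(ct))))

class ToyKms:
    """In-memory stand-in for a KMS; the KEK never leaves this object."""
    def __init__(self, kek_id, version):
        self._keks = {(kek_id, version): secrets.token_bytes(32)}

    def wrap(self, kek_id, version, dek):
        nonce = secrets.token_bytes(12)
        ct, tag = toy_seal(self._keks[(kek_id, version)], nonce, dek)
        return nonce + tag + ct

    def unwrap(self, kek_id, version, wrapped):
        nonce, tag, ct = wrapped[:12], wrapped[12:28], wrapped[28:]
        return toy_open(self._keks[(kek_id, version)], nonce, ct, tag)

def _b64(raw):
    return base64.b64encode(raw).decode("ascii")

def encrypt_object(kms, data, kek_id, version):
    dek = secrets.token_bytes(32)          # step 1: fresh 256-bit DEK from a CSPRNG
    nonce = secrets.token_bytes(12)        # 96-bit nonce, the size AES-GCM normally uses
    ct, tag = toy_seal(dek, nonce, data)   # step 2: encrypt, keep nonce and tag
    meta = {                               # steps 3-4: wrap the DEK, record metadata
        "algo": "TOY-DEMO-ONLY",           # would be "AES-256-GCM" with a real cipher
        "kek_id": kek_id,
        "kek_version": version,
        "wrapped_dek_b64": _b64(kms.wrap(kek_id, version, dek)),
        "nonce_b64": _b64(nonce),
        "auth_tag_b64": _b64(tag),
        "created_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    return ct, meta

def decrypt_object(kms, ct, meta):
    wrapped = base64.b64decode(meta["wrapped_dek_b64"])
    dek = kms.unwrap(meta["kek_id"], meta["kek_version"], wrapped)
    return toy_open(dek, base64.b64decode(meta["nonce_b64"]), ct,
                    base64.b64decode(meta["auth_tag_b64"]))

kms = ToyKms("analytics-cmk", "v3")
ciphertext, meta = encrypt_object(kms, b"parquet bytes go here", "analytics-cmk", "v3")
plaintext = decrypt_object(kms, ciphertext, meta)
```

Note that the plaintext DEK exists only inside `encrypt_object` and `decrypt_object`; what gets stored is the wrapped DEK, which is useless without access to the KEK.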

Operational notes
  • Store metadata with the object (e.g., sidecar .meta file). Without it, you cannot decrypt.
  • Limit who/what can call KMS decrypt. Grant narrow roles to specific services.
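
Because a missing or malformed sidecar makes an object unreadable, it can help to validate metadata at write time. The following is a minimal sketch, assuming the field names from the example metadata above and the usual AES-GCM sizes (96-bit nonce, 128-bit tag); `validate_metadata` is a hypothetical helper, not part of any library.

```python
import base64
import binascii
from datetime import datetime

REQUIRED = ("algo", "kek_id", "kek_version", "wrapped_dek_b64",
            "nonce_b64", "auth_tag_b64", "created_at")
# Usual AES-GCM sizes in bytes: 96-bit nonce, 128-bit tag.
EXPECTED_LEN = {"nonce_b64": 12, "auth_tag_b64": 16}

def validate_metadata(meta):
    """Return a list of problems; an empty list means the sidecar looks usable."""
    problems = [f"missing field: {name}" for name in REQUIRED if not meta.get(name)]
    for name in ("wrapped_dek_b64", "nonce_b64", "auth_tag_b64"):
        if not meta.get(name):
            continue  # already reported as missing
        try:
            raw = base64.b64decode(meta[name], validate=True)
        except (binascii.Error, ValueError):
            problems.append(f"{name} is not valid base64")
            continue
        want = EXPECTED_LEN.get(name)
        if meta.get("algo") == "AES-256-GCM" and want and len(raw) != want:
            problems.append(f"{name} is {len(raw)} bytes, expected {want}")
    if meta.get("created_at"):
        try:
            # fromisoformat() rejects a trailing "Z" before Python 3.11.
            datetime.fromisoformat(meta["created_at"].replace("Z", "+00:00"))
        except ValueError:
            problems.append("created_at is not an ISO 8601 timestamp")
    return problems

# Placeholder bytes stand in for real key material.
good = {
    "algo": "AES-256-GCM", "kek_id": "analytics-cmk", "kek_version": "v3",
    "wrapped_dek_b64": base64.b64encode(bytes(40)).decode("ascii"),
    "nonce_b64": base64.b64encode(bytes(12)).decode("ascii"),
    "auth_tag_b64": base64.b64encode(bytes(16)).decode("ascii"),
    "created_at": "2026-01-11T10:00:00Z",
}
bad = {**good, "nonce_b64": base64.b64encode(bytes(8)).decode("ascii"), "auth_tag_b64": ""}
print(validate_metadata(good))
print(validate_metadata(bad))
```

This only checks shape. It does not prove the object decrypts, so a periodic sample decrypt is still worthwhile.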

3) Rotating keys with zero downtime

Goal: Rotate KEK v1 to KEK v2 without breaking reads.

  1. Create KEK v2 in KMS. Update policies and aliases to favor v2 for new encrypt operations.
  2. Write path: start wrapping new DEKs with KEK v2. The read path must still be able to unwrap DEKs wrapped under either v1 or v2.
  3. Background rewrap: rewrap existing wrapped DEKs from v1 to v2 (no need to re-encrypt large data if using envelope encryption).
  4. Verify: sample decrypts, compare checksums, confirm logs show v1 is unused for N days.
  5. Retire: disable v1 usage, then schedule destruction per policy.

Common pitfalls
  • Re-encrypting entire datasets unnecessarily. Rewrapping DEKs is faster.
  • Cutting off reads tied to old versions too soon. Keep dual-read until metrics prove safe.
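
The rewrap, verify, and retire steps can be sketched as a small simulation. `ToyKms` below is a test double with a deliberately simplistic wrap scheme (SHA-256 pad plus HMAC); it is not real cryptography and exists only to show that rotation touches wrapped DEKs and metadata, never the bulk ciphertext. The catalog layout and function names are assumptions for this example.

```python
import hashlib
import hmac
import secrets

class ToyKms:
    """Test double for a KMS. NOT real cryptography; it only mimics wrap/unwrap."""
    def __init__(self, versions):
        self._keks = {v: secrets.token_bytes(32) for v in versions}
        self.enabled = set(versions)

    def _check(self, version):
        if version not in self.enabled:
            raise PermissionError(f"KEK {version} is disabled")

    def wrap(self, version, dek):
        self._check(version)
        salt = secrets.token_bytes(16)
        pad = hashlib.sha256(self._keks[version] + salt).digest()
        body = bytes(a ^ b for a, b in zip(dek, pad))  # fine for 32-byte DEKs only
        tag = hmac.new(self._keks[version], salt + body, hashlib.sha256).digest()
        return salt + body + tag

    def unwrap(self, version, wrapped):
        self._check(version)
        salt, body, tag = wrapped[:16], wrapped[16:-32], wrapped[-32:]
        expected = hmac.new(self._keks[version], salt + body, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            raise ValueError("wrapped DEK failed integrity check")
        pad = hashlib.sha256(self._keks[version] + salt).digest()
        return bytes(a ^ b for a, b in zip(body, pad))

def rewrap(kms, meta, new_version):
    """Rewrap one object's DEK under new_version; the data ciphertext is untouched."""
    if meta["kek_version"] == new_version:
        return meta  # already migrated, so the job is safe to re-run
    dek = kms.unwrap(meta["kek_version"], meta["wrapped_dek"])
    return {**meta, "kek_version": new_version, "wrapped_dek": kms.wrap(new_version, dek)}

kms = ToyKms(["v1", "v2"])
deks = [secrets.token_bytes(32) for _ in range(3)]
catalog = [{"object": f"part-{i}.parquet", "kek_version": "v1",
            "wrapped_dek": kms.wrap("v1", dek)} for i, dek in enumerate(deks)]

catalog = [rewrap(kms, meta, "v2") for meta in catalog]        # step 3: background rewrap
remaining_v1 = sum(m["kek_version"] == "v1" for m in catalog)  # step 4: verify
if remaining_v1 == 0:
    kms.enabled.discard("v1")                                  # step 5: retire v1
recovered = [kms.unwrap(m["kek_version"], m["wrapped_dek"]) for m in catalog]
```

Because `rewrap` skips records already on the target version, the background job can be interrupted and restarted without double-processing anything.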

One-day secure baseline

  • Morning: turn on TLS everywhere; reject weak ciphers; enable certificate validation.
  • Midday: define a KEK in KMS and implement envelope encryption in one high-value path.
  • Afternoon: tag/label keys, enable audit logs/alerts, restrict key usage to specific identities. Write a short rotation runbook.

Checklist
  • TLS on all hops.
  • Data at rest uses AEAD (e.g., AES-GCM).
  • Keys live in KMS/HSM; no keys in code.
  • Key policies least-privilege; logging enabled.
  • Rotation plan documented.

Exercises

Do these to solidify your understanding. The quick test is further below.

Exercise 1: Design envelope encryption metadata

Create a concise metadata schema for a parquet object you encrypt before upload. Include algorithm, wrapped DEK, KEK id/version, nonce, auth tag, creation time. State how you will verify integrity during decrypt.

Sample output format
{
  "algo": "...",
  "kek_id": "...",
  "kek_version": "...",
  "wrapped_dek_b64": "...",
  "nonce_b64": "...",
  "auth_tag_b64": "...",
  "created_at": "..."
}

Exercise 2: Key rotation runbook (KEK v1 -> v2)

Draft a one-page runbook: triggers, approvals, steps to introduce v2, dual-read period, rewrap plan, validation, rollback, and final retirement of v1.

Self-check after exercises
  • Your metadata includes everything needed to decrypt and verify authenticity.
  • Your runbook allows reads during rotation and has a rollback path.
  • Monitoring covers KMS usage and error rates during rewrap.

Common mistakes and how to self-check

  • Missing metadata (nonce/auth tag). Self-check: pick a sample object and attempt a full decrypt using only stored metadata.
  • Keys in environment variables or code. Self-check: search repos and CI logs for key-like patterns; rotate if found.
  • Assuming “storage provider encrypts” is enough. Self-check: confirm application-level encryption where your threat model requires it.
  • No auditing. Self-check: verify you can answer “who decrypted this object and when?”
  • Big-bang rotations. Self-check: ensure your plan supports dual-read and staged rollout.
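
For the "keys in code" self-check, a rough scan for key-like patterns might look like the sketch below. The two patterns are assumptions chosen for illustration: a PEM private-key header and a 64-character hex string. The hex pattern also matches SHA-256 digests, so expect false positives, and prefer a dedicated secret scanner for real audits.

```python
import re

# Illustrative patterns only; tune them for your own key formats.
PATTERNS = {
    "pem_private_key": re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----"),
    "hex_256_bit": re.compile(r"\b[0-9a-fA-F]{64}\b"),
}

def scan_text(text):
    """Return (line_number, pattern_name) for each line that looks like key material."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, name))
    return hits

sample = "\n".join([
    "timeout = 30",
    'DEK = "' + "ab" * 32 + '"',
    "-----BEGIN RSA PRIVATE KEY-----",
])
hits = scan_text(sample)
print(hits)
```

Anything this turns up should be treated as exposed: rotate the key first, then clean up the repository or log.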

Mini challenge

Your analytics team wants to share daily parquet files cross-account. Propose a minimal design that:

  • Encrypts files with DEKs (AES-GCM).
  • Wraps DEKs with a KEK in your KMS and permits the consumer account to decrypt via scoped policy.
  • Logs all decrypt calls and alerts on anomalies.

Write 6–8 bullet points covering metadata, policies, and rotation.

Practical projects

  • Build a CLI that envelope-encrypts files and emits a sidecar JSON metadata file.
  • Add mutual TLS between a data ingestion service and a message broker, with certificate rotation.
  • Create a key rotation simulator that rewraps N test objects and reports timing, failures, and coverage.

Who this is for

  • Data Platform Engineers shipping pipelines and storage layers.
  • SREs and security-minded developers integrating with data systems.

Prerequisites

  • Basic networking (TLS, certificates at a high level).
  • Familiarity with object storage, databases, and message brokers.
  • Comfort with JSON/YAML configuration and simple command-line tools.

Learning path

  • Start: Encryption basics and envelope encryption (this lesson).
  • Next: Secrets management and IAM for data services.
  • Then: Auditing, anomaly detection, and incident response for key events.
  • Advanced: Tokenization, field-level encryption, and differential privacy patterns.

Next steps

  • Implement TLS verification on at least one service-to-service hop today.
  • Choose a dataset and pilot envelope encryption with full metadata.
  • Draft and peer-review your key rotation runbook.

Quick test

The quick test is available to everyone; log in to save your progress.

Practice Exercises

Instructions

Create a metadata JSON for a single encrypted parquet file. Include:

  • algo (e.g., AES-256-GCM)
  • kek_id and kek_version
  • wrapped_dek_b64
  • nonce_b64 and auth_tag_b64
  • created_at (ISO)

Describe in 1–2 sentences how integrity is verified during decrypt.

Expected Output
{
  "algo": "AES-256-GCM",
  "kek_id": "analytics-cmk",
  "kek_version": "v2",
  "wrapped_dek_b64": "...",
  "nonce_b64": "...",
  "auth_tag_b64": "...",
  "created_at": "2026-01-11T10:00:00Z"
}

Integrity via AEAD tag verification before returning plaintext.

Encryption And Key Management Basics — Quick Test

Test your knowledge with 8 questions. Pass with 70% or higher.
