
Security Reviews And Threat Modeling

Learn security reviews and threat modeling for free, with explanations, exercises, and a quick test tailored to Data Architects.

Published: January 18, 2026 | Updated: January 18, 2026

Why this matters

As a Data Architect, your designs move sensitive data across ingestion, storage, processing, and analytics. A single missed threat can lead to data leaks, downtime, or compliance fines. Security reviews and threat modeling help you find risks early and bake in practical mitigations: encryption, access control, isolation, and monitoring.

  • Real tasks you will face: approving a new PII ingestion pipeline, connecting a BI tool to a warehouse, enabling cross-account data sharing, onboarding a third-party connector, or handling schema evolution in streaming.
  • Threat modeling reduces rework, clarifies responsibilities, and improves auditability.

Who this is for

  • Data Architects and Platform Engineers designing or reviewing data pipelines, lakes, warehouses, and streaming systems.
  • Tech leads who need a repeatable, lightweight review process.

Prerequisites

  • Basic understanding of data platform components (ingestion, queue/stream, storage, compute, warehouse, BI).
  • Familiarity with authentication/authorization and encryption concepts.

Concept explained simply

Threat modeling is a structured way to ask: what can go wrong, what are we doing about it, and is that enough? A security review is the meeting and documentation ritual that turns those answers into decisions and backlog items.

Mental model

Use a map-and-attack mindset:

  • Map the system: draw data flows, assets, and trust boundaries.
  • Attack the map: enumerate threats with frameworks like STRIDE (Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service, Elevation of Privilege) and LINDDUN for privacy (Linkability, Identifiability, Non-repudiation, Detectability, Information disclosure, Unawareness, Non-compliance).
  • Decide: rate risk (likelihood Ă— impact), choose mitigations, document owners and timelines.

Quick glossary
  • Trust boundary: where different levels of trust meet (e.g., internet to VPC, app to data lake).
  • Asset: something valuable (PII dataset, service account, encryption keys).
  • Control: technical or process safeguard (TLS, IAM, masking, monitoring).
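
The "attack the map" step above can be sketched as a simple checklist generator: walk every flow in the diagram and ask one question per STRIDE category. This is a minimal illustration; the flow names are hypothetical, and a real review would pair each prompt with notes and a decision.

```python
# STRIDE categories from the mental model above.
STRIDE = [
    "Spoofing", "Tampering", "Repudiation",
    "Information Disclosure", "Denial of Service",
    "Elevation of Privilege",
]

def stride_checklist(elements):
    """One (element, threat) prompt per STRIDE category, per element."""
    return [(el, threat) for el in elements for threat in STRIDE]

# Hypothetical flows from a data-platform DFD.
flows = ["API -> Ingestion", "Ingestion -> Data Lake", "ETL -> Warehouse"]
checklist = stride_checklist(flows)
# 3 flows x 6 categories = 18 prompts to walk through in a review
```

The same pattern works for LINDDUN: swap in the privacy categories and run the list against each data store.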

A simple step-by-step process

  1. Scope: define the feature, data categories (PII, PCI, health), and success criteria.
  2. Map: draw a data flow diagram (DFD). Mark trust boundaries and assets.
  3. Enumerate threats: run STRIDE for security and LINDDUN for privacy across each data flow and store.
  4. Rate risks: score likelihood and impact (e.g., Low/Med/High). Prioritize High-High first.
  5. Mitigate: select controls (encryption, IAM, network isolation, tokenization, data minimization, logging).
  6. Decide: record accepted, mitigated, or deferred risks with owners and dates.
  7. Validate: tabletop run-through; add tests/monitoring to catch regressions.

Lightweight scoring rubric (use consistently)
  • Likelihood: Low (rare skill/access), Medium (possible via misconfig), High (common misstep or public exposure).
  • Impact: Low (non-sensitive, limited blast radius), Medium (internal data/partial outage), High (PII/financial/availability for many users).
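
The rubric above can be made concrete with a few lines of code. This is a minimal sketch: the numeric mapping (Low=1, Medium=2, High=3) is an assumption for illustration, not a standard, and the example threats are hypothetical.

```python
# Assumed numeric mapping for the Low/Medium/High rubric above.
LEVELS = {"Low": 1, "Medium": 2, "High": 3}

def risk_score(likelihood, impact):
    """Likelihood x impact, per the rubric."""
    return LEVELS[likelihood] * LEVELS[impact]

def prioritize(threats):
    """threats: list of (name, likelihood, impact); highest risk first."""
    return sorted(threats, key=lambda t: risk_score(t[1], t[2]), reverse=True)

ranked = prioritize([
    ("PII in logs", "High", "High"),        # score 9
    ("Topic flood DoS", "Medium", "Medium"),  # score 4
    ("Spoofed API token", "Low", "High"),     # score 3
])
# ranked[0] is the High/High threat: address it first
```

Whatever scale you choose, the point of step 4 is consistency: the same scoring applied across reviews makes priorities comparable over time.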

Worked examples

Example 1: PII ingestion to data lake

Context: Batch ingestion from a public API -> ingestion service -> object storage (data lake) -> ETL -> warehouse.

  • Threats (STRIDE):
    • Spoofing: fake API source tokens.
    • Tampering: data altered in transit to storage.
    • Info disclosure: PII exposed in logs or non-prod copies.
    • DoS: ingestion spikes exhaust compute/quota.
    • EoP: over-privileged service role writes to all buckets.
  • Privacy (LINDDUN):
    • Identifiability/linkability via persistent identifiers across datasets.
    • Non-compliance: retention longer than policy; missing consent.
  • Mitigations:
    • Mutual TLS, signed requests, narrow IAM roles, bucket policies with encryption at rest (KMS).
    • Mask PII in logs; separate prod/non-prod datasets with scrubbed fixtures.
    • Lifecycle rules and retention policies; tokenization for BI.
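
The "mask PII in logs" mitigation can be sketched with a standard-library logging filter. This is a simplified illustration: the regex only catches email addresses, and real PII detection needs broader patterns (names, account numbers, tokens).

```python
import logging
import re

# Simplified email pattern; real detection needs more patterns than this.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

class PiiMaskFilter(logging.Filter):
    """Redact email addresses from log messages before they are emitted."""
    def filter(self, record):
        record.msg = EMAIL_RE.sub("[REDACTED]", str(record.msg))
        return True  # never drop the record, only scrub it

logger = logging.getLogger("ingestion")
logger.addFilter(PiiMaskFilter())
logger.warning("Failed row for user alice@example.com")
# The address is redacted before the record reaches any handler.
```
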

Example 2: Kafka streaming with consumer groups

Context: Producers -> Kafka -> stream processing -> feature store.

  • Threats:
    • Tampering: unauthenticated producer pushes poisoned events.
    • DoS: topic flood; consumer lag grows.
    • Info disclosure: plaintext traffic or open security groups.
    • EoP: consumer service account reads all topics.
  • Mitigations:
    • SASL authentication, ACLs per topic, network rules.
    • Quotas and retention; autoscaling consumers; DLQ for bad events.
    • TLS in transit; secret management; least-privilege roles.
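
The DLQ mitigation above boils down to: validate each event, and route failures aside instead of crashing the consumer. A minimal sketch, with a hypothetical required-field schema:

```python
# Hypothetical schema: fields every event must carry.
REQUIRED_FIELDS = {"event_id", "user_id", "timestamp"}

def route(events):
    """Split events into (valid, dead_letter) instead of failing the batch."""
    ok, dlq = [], []
    for event in events:
        if REQUIRED_FIELDS <= event.keys():
            ok.append(event)
        else:
            dlq.append({"event": event, "reason": "missing required fields"})
    return ok, dlq

valid, dead = route([
    {"event_id": 1, "user_id": "u1", "timestamp": 1700000000},
    {"event_id": 2},  # poisoned/incomplete event goes to the DLQ
])
```

In a real pipeline the dead-letter list would be a separate Kafka topic with its own retention and alerting, so bad events are inspectable without blocking consumers.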

Example 3: BI tool connected to warehouse

Context: Analysts use a BI tool with service accounts. Data includes customer tables and derived aggregates.

  • Threats:
    • Info disclosure: direct table access to raw PII.
    • Repudiation: no audit of who queried what.
    • Non-compliance: exporting full tables to spreadsheets.
  • Mitigations:
    • Row/column-level security; views over raw tables; data masking.
    • Query audit logs with alerts; just-in-time privileged access.
    • Disable exports for sensitive datasets; aggregate-only sharing.
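
Data masking for BI-facing views can be as simple as the two helpers below. These are illustrative, not a complete masking policy: in practice, masking is usually enforced in the warehouse (masked views or column policies) rather than in application code.

```python
def mask_email(email: str) -> str:
    """Keep only the domain: 'alice@example.com' -> '***@example.com'."""
    _, _, domain = email.partition("@")
    return f"***@{domain}" if domain else "***"

def mask_account(number: str) -> str:
    """Show only the last four characters of an account number."""
    return "*" * max(len(number) - 4, 0) + number[-4:]
```

Masked values stay joinable within a report (same input, same output) while keeping raw PII out of analyst hands.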

How to run a security review meeting

  • People: author (design owner), security champion, data platform rep, privacy/compliance rep, observer.
  • Inputs: one-page context, DFD with trust boundaries, data classification, draft controls.
  • Agenda (30–45 min):
    • 5 min scope and assumptions.
    • 10 min walk the DFD.
    • 15 min threat enumeration (STRIDE + LINDDUN).
    • 10 min decisions: mitigations, owners, timelines.
    • 5 min validation plan and follow-ups.

Templates you can copy

One-page review template
Context: feature/pipeline summary
Data: categories (PII/PCI/PHI), sources, destinations
Diagram: DFD with trust boundaries
Assumptions: key dependencies and out-of-scope
Threats: top 5 with rationale
Controls: selected mitigations by component
Decisions: accept/mitigate/defer with owners and dates
Validation: tests, monitoring, tabletop date

DFD quick notation
[External] -> (Service) -> [Queue/Topic] -> (ETL/Job) -> [Storage]
Trust boundary: =======
Annotate assets: PII, keys, secrets
Annotate controls: TLS, IAM, KMS, RLS/CLS, tokenization
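
The notation above can also be kept as data, which makes trust-boundary checks mechanical: assign each component a zone and flag every flow whose endpoints differ. The zone names and components here are assumptions matching the sample DFD, not a prescribed taxonomy.

```python
# Assumed trust zones for the sample DFD components above.
ZONES = {
    "External API": "internet",
    "Ingestion Service": "vpc",
    "Queue": "vpc",
    "ETL Job": "vpc",
    "Data Lake": "managed",
    "Warehouse": "managed",
}

def boundary_crossings(flows):
    """Flows that cross a trust boundary deserve extra review attention."""
    return [(src, dst) for src, dst in flows if ZONES[src] != ZONES[dst]]

flows = [
    ("External API", "Ingestion Service"),  # internet -> vpc: crossing
    ("Ingestion Service", "Queue"),         # vpc -> vpc: internal
    ("ETL Job", "Data Lake"),               # vpc -> managed: crossing
]
crossings = boundary_crossings(flows)
```
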

Exercises

Do these before the quick test. Keep your notes; you will reuse them.

  1. Exercise 1: Draw a DFD for an ingestion-to-warehouse flow and list threats using STRIDE/LINDDUN. Prioritize them and pick the top three mitigations.
  2. Exercise 2: Given a set of threats, score the risks, propose controls, and write review decisions with owners and timelines.

Self-check checklist
  • DFD shows all external actors, data stores, and trust boundaries.
  • Each flow has at least one STRIDE and one LINDDUN consideration.
  • Risks are prioritized with a clear rationale.
  • Mitigations map to specific components and are testable.
  • Decisions include owners and due dates.

Common mistakes and how to self-check

  • Missing non-production risk: ensure scrubbed data in dev/test; forbid production PII in sandboxes.
  • Over-privileged roles: review IAM policies for least privilege; rotate keys.
  • Logging leaks: verify logs and metrics do not include sensitive values.
  • Unclear ownership: every mitigation has an owner and a date.
  • No validation: add tests (e.g., automated checks for encryption at rest, RLS/CLS policies) and monitoring alerts.
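
The "automated checks" idea can be sketched as a policy test over an inventory export. The inventory format here is an assumption; in practice this data would come from your cloud provider's API or your IaC state, and the check would run in CI.

```python
def unencrypted_buckets(inventory):
    """Names of buckets missing a customer-managed encryption key."""
    return [b["name"] for b in inventory if not b.get("kms_key")]

# Hypothetical inventory export; field names are illustrative.
inventory = [
    {"name": "raw-pii", "kms_key": "arn:aws:kms:...:key/abc"},
    {"name": "scratch", "kms_key": None},  # violation: no encryption key
]
violations = unencrypted_buckets(inventory)
# A CI job would fail the build if this list is non-empty.
```
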

Practical projects

  • Secure Data Lake Starter: baseline bucket policies, encryption, access patterns, and lifecycle rules applied via IaC.
  • Streaming Guardrails: Kafka ACLs, quotas, TLS, consumer lag alerting, and DLQ pattern.
  • Warehouse Safety Kit: implement column masking, row-level security, and audit logging with example roles.

Learning path

  • Start: Threat modeling basics (STRIDE, LINDDUN) and simple DFDs.
  • Next: Cloud IAM and network isolation patterns for data platforms.
  • Then: Data privacy techniques (masking, tokenization, minimization, retention).
  • Advance: Automating security checks in CI and platform policies.

Mini challenge

You must enable data sharing with a partner for weekly aggregates. Write two options: (1) share aggregate tables only, (2) share a view with row/column filters. For each, list 3 threats and 3 mitigations, then recommend one option with rationale.

Practice Exercises

2 exercises to complete

Instructions

Draw a simple DFD for this scenario: Public API -> Ingestion Service -> Queue -> ETL Job -> Data Lake -> Warehouse -> BI Tool. Mark trust boundaries (internet to VPC; VPC to managed services; prod to non-prod). Identify assets (PII dataset, service accounts, keys).

  1. For each flow and store, list at least one threat using STRIDE.
  2. List at least three privacy threats using LINDDUN.
  3. Prioritize the top five risks (likelihood Ă— impact) and propose one mitigation each.

Expected Output
A diagram or structured list of components with at least 8 threats (security + privacy), a prioritized top-5 risk list, and five mapped mitigations tied to components.

Security Reviews And Threat Modeling — Quick Test

Test your knowledge with 8 questions. Pass with 70% or higher.
