
Transactions and Isolation Levels: ACID and Locks

Learn transactions, isolation levels, ACID, and locks for free with explanations, exercises, and a quick test (for Data Analysts).

Published: December 19, 2025 | Updated: December 19, 2025

Why this matters

As a Data Analyst, you often read from live databases, create summary tables, and sometimes run batch updates. Without understanding transactions, isolation levels, and locks, your numbers can change mid-query, you can double-count, or you can block teammates’ work. Mastering these basics gives you reliable reports and safe data changes.

  • Run consistent reports while data is changing.
  • Prevent double-counting during batch upserts.
  • Avoid long locks that slow down dashboards or apps.

Who this is for

Analysts who query production or analytics databases and want consistent results, plus anyone who writes occasional update/insert scripts safely.

Prerequisites

  • Comfort with SELECT, WHERE, GROUP BY, and JOIN.
  • Basic understanding of primary keys and indexes.
  • Ability to run multi-statement SQL in your SQL client.

Concept explained simply

A transaction is a box around one or more SQL statements that either all succeed or all fail. Isolation level decides how much your transaction is protected from others running at the same time. Locks are how the database coordinates access so data stays correct.
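
For example, a two-row transfer is the classic all-or-nothing case. A minimal sketch, reusing the accounts(id, balance) table from Exercise 2 below:

BEGIN;
UPDATE accounts SET balance = balance - 50 WHERE id = 7;
UPDATE accounts SET balance = balance + 50 WHERE id = 8;
-- If anything goes wrong before COMMIT, run ROLLBACK; to undo both updates
COMMIT;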

ACID in one minute
  • Atomicity: all-or-nothing.
  • Consistency: rules stay valid before and after.
  • Isolation: transactions don’t step on each other’s toes.
  • Durability: once committed, it stays even after crashes.
Mental model

Imagine two editors working on the same document. A transaction is a “Save Draft” session. Isolation level is how much you see the other editor’s changes while you’re drafting. Locks are the “Do Not Disturb” signs that temporarily reserve words or paragraphs.

Isolation levels at a glance

  • READ UNCOMMITTED: may see uncommitted changes (dirty reads). Rarely used; some engines (e.g., PostgreSQL) simply treat it as READ COMMITTED.
  • READ COMMITTED: no dirty reads; each statement sees committed data at its start; can get different results across statements (non-repeatable reads, phantoms possible).
  • REPEATABLE READ: rows you have already read return the same values for the whole transaction; prevents non-repeatable reads. Phantom reads may still occur depending on the engine.
  • SERIALIZABLE: behaves as if transactions ran one-by-one. Safest, but can be the slowest and may cause more contention.
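
How you check or set the level varies by engine. A PostgreSQL-flavored sketch (MySQL uses SELECT @@transaction_isolation; instead of SHOW):

-- Check the session's current isolation level (PostgreSQL)
SHOW transaction_isolation;

-- Set the level for a single transaction (the pattern used in the examples below)
BEGIN;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
-- ... your statements ...
COMMIT;
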
Common anomalies
  • Dirty read: you see uncommitted changes of others.
  • Non-repeatable read: re-reading the same row gives different values in the same transaction.
  • Phantom read: a re-run of a range query returns new rows that appeared meanwhile.
Locks in brief
  • Shared (S): for reading; can coexist with other shared locks.
  • Exclusive (X): for writing; blocks other writers, and in lock-based (non-MVCC) engines it can block readers too.
  • Granularity: row, page, partition, table. Engines try to keep locks as fine-grained as possible.
  • Note: Some engines use MVCC so readers don’t block writers; specifics vary by database.
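
To see locks in practice, most engines expose a system view. A PostgreSQL sketch (pg_locks joined to pg_stat_activity; other engines have equivalents):

-- Which transactions hold or wait for table/row locks right now?
SELECT l.locktype, l.relation::regclass AS locked_table,
       l.mode, l.granted, a.pid, a.state, a.query
FROM pg_locks l
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE l.relation IS NOT NULL
ORDER BY l.granted, a.pid;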

Worked examples

Example 1 — Consistent snapshot reporting

Goal: calculate daily revenue while new orders may be inserted.

BEGIN;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;

-- All reads below see a stable snapshot
SELECT DATE(created_at) AS d, SUM(amount) AS total
FROM orders
WHERE status = 'paid' AND created_at >= DATE '2024-12-01' AND created_at < DATE '2024-12-02'
GROUP BY DATE(created_at);

COMMIT; 

Why: REPEATABLE READ gives every SELECT in the transaction the same snapshot (in PostgreSQL, taken at the first query rather than at BEGIN). Your totals won’t shift if new orders arrive mid-run.

Example 2 — Safe upsert into summary table

Goal: compute total for the day and write it once without double-counting on retries.

BEGIN;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;

WITH s AS (
  SELECT DATE(created_at) AS d, COALESCE(SUM(amount), 0) AS total
  FROM orders
  WHERE status = 'paid' AND created_at >= DATE '2024-12-01' AND created_at < DATE '2024-12-02'
  GROUP BY DATE(created_at)
)
MERGE INTO daily_sales ds
USING s
ON ds.sales_date = s.d
WHEN MATCHED THEN UPDATE SET total_amount = s.total, updated_at = CURRENT_TIMESTAMP
WHEN NOT MATCHED THEN INSERT (sales_date, total_amount, updated_at)
VALUES (s.d, s.total, CURRENT_TIMESTAMP);

COMMIT; 
If MERGE is not available
-- Postgres example
BEGIN;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
WITH s AS (
  SELECT DATE(created_at) AS d, COALESCE(SUM(amount), 0) AS total
  FROM orders
  WHERE status = 'paid' AND created_at >= DATE '2024-12-01' AND created_at < DATE '2024-12-02'
  GROUP BY DATE(created_at)
)
INSERT INTO daily_sales (sales_date, total_amount, updated_at)
SELECT d, total, CURRENT_TIMESTAMP FROM s
ON CONFLICT (sales_date)
DO UPDATE SET total_amount = EXCLUDED.total, updated_at = CURRENT_TIMESTAMP;
COMMIT;
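
A MySQL 8.0 equivalent, as a sketch (MySQL requires SET TRANSACTION before the transaction starts, and uses ON DUPLICATE KEY UPDATE rather than ON CONFLICT):

-- MySQL example
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
START TRANSACTION;
INSERT INTO daily_sales (sales_date, total_amount, updated_at)
SELECT DATE(created_at), COALESCE(SUM(amount), 0), CURRENT_TIMESTAMP
FROM orders
WHERE status = 'paid'
  AND created_at >= '2024-12-01' AND created_at < '2024-12-02'
GROUP BY DATE(created_at)
ON DUPLICATE KEY UPDATE
  total_amount = VALUES(total_amount),
  updated_at = CURRENT_TIMESTAMP;
COMMIT;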

Example 3 — Lock rows you intend to update

Goal: adjust a user’s monthly summary after a correction, avoiding races.

BEGIN;
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

-- Lock the specific summary row to avoid concurrent updates
SELECT * FROM monthly_user_summary
WHERE user_id = 42 AND month = DATE '2024-12-01'
FOR UPDATE;

UPDATE monthly_user_summary
SET total_amount = total_amount + 15.00
WHERE user_id = 42 AND month = DATE '2024-12-01';

COMMIT;

Why: SELECT ... FOR UPDATE obtains a lock on the target rows, preventing conflicting updates until you commit.
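
If waiting on the lock is worse than failing fast, many engines (PostgreSQL, MySQL 8.0, Oracle) extend FOR UPDATE with NOWAIT or SKIP LOCKED; treat this as a sketch, since availability varies:

-- Error immediately instead of waiting if the row is already locked
SELECT * FROM monthly_user_summary
WHERE user_id = 42 AND month = DATE '2024-12-01'
FOR UPDATE NOWAIT;

-- Skip locked rows entirely (useful for queue-style processing)
SELECT * FROM monthly_user_summary
WHERE user_id = 42 AND month = DATE '2024-12-01'
FOR UPDATE SKIP LOCKED;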

Two-session walk-through (phantom read demo)
  1. Session A: BEGIN; SET TRANSACTION ISOLATION LEVEL READ COMMITTED; SELECT COUNT(*) FROM orders WHERE amount >= 100;
  2. Session B: INSERT a new order with amount 150; COMMIT;
  3. Session A: Re-run the same SELECT; the count may increase (a phantom) because READ COMMITTED takes a fresh snapshot for each statement.

Using REPEATABLE READ would keep the count stable within A’s transaction. SERIALIZABLE would also prevent conflicting phantoms but with more contention.
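
The same walk-through as paste-ready SQL for two separate connections (assuming the orders table from the examples, with an auto-generated id):

-- Session A
BEGIN;
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
SELECT COUNT(*) FROM orders WHERE amount >= 100;  -- e.g., 10

-- Session B (separate connection)
BEGIN;
INSERT INTO orders (created_at, status, amount)
VALUES (CURRENT_TIMESTAMP, 'paid', 150);
COMMIT;

-- Session A, same transaction: the count can now be 11 (a phantom)
SELECT COUNT(*) FROM orders WHERE amount >= 100;
COMMIT;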

Practical projects

  • Project 1: Build a daily sales aggregator that can be re-run safely (idempotent) using REPEATABLE READ + upsert.
  • Project 2: Create a data fix script to correct historical rows, using SELECT ... FOR UPDATE to avoid concurrent edits.
  • Project 3: Benchmark READ COMMITTED vs REPEATABLE READ for a heavy reporting query; compare stability of results and query times.

Exercises

Do both exercises. You can reveal solutions after attempting. Tip: Run them in a sandbox or read carefully if you cannot execute SQL now.

Exercise 1 — Build a safe daily sales updater

Tables:

  • orders(id, created_at, status, amount)
  • daily_sales(sales_date PRIMARY KEY, total_amount, updated_at)

Task: In a single transaction, compute the total paid sales for 2024-12-01 and upsert it into daily_sales. Use REPEATABLE READ to get a stable snapshot. Make it safe to re-run without double-counting.

Expected result: A single row for 2024-12-01 is created or updated with the correct total; reruns do not add extra amounts.

Hints
  • BEGIN; then SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
  • Use a CTE to compute SUM(amount) for the day with status='paid'.
  • Use MERGE (or UPSERT/ON CONFLICT) to write the result.
  • COMMIT when done.
Show solution
BEGIN;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;

WITH s AS (
  SELECT DATE(created_at) AS d, COALESCE(SUM(amount), 0) AS total
  FROM orders
  WHERE status = 'paid' AND created_at >= DATE '2024-12-01' AND created_at < DATE '2024-12-02'
  GROUP BY DATE(created_at)
)
MERGE INTO daily_sales ds
USING s
ON ds.sales_date = s.d
WHEN MATCHED THEN UPDATE SET total_amount = s.total, updated_at = CURRENT_TIMESTAMP
WHEN NOT MATCHED THEN INSERT (sales_date, total_amount, updated_at)
VALUES (s.d, s.total, CURRENT_TIMESTAMP);

COMMIT;

Alternative (Postgres): use INSERT ... ON CONFLICT (sales_date) DO UPDATE.

Exercise 2 — Stop non-repeatable reads in a balance check

Table: accounts(id, balance)

Task: Simulate two sessions. Session A must read the same balance for id=7 both before and after Session B updates it. Write the exact commands for Session A and Session B so that A sees consistent results within its transaction.

Expected result: Session A sees the same value twice (e.g., 100 then 100). Session B commits an update (e.g., 120). Session A only sees the new value after its own transaction ends and it reads again in a new transaction.

Hints
  • Use REPEATABLE READ for Session A.
  • Do a SELECT in A, then perform UPDATE+COMMIT in B, then SELECT again in A (before A commits).
  • A final check in A after COMMIT should show the new value.
Show solution
-- Session A
BEGIN;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT balance FROM accounts WHERE id = 7; -- suppose 100

-- Session B (separate connection)
BEGIN;
UPDATE accounts SET balance = 120 WHERE id = 7;
COMMIT;

-- Back to Session A (still same transaction)
SELECT balance FROM accounts WHERE id = 7; -- still 100 under REPEATABLE READ
COMMIT;

-- Session A, new transaction
BEGIN;
SELECT balance FROM accounts WHERE id = 7; -- now 120
COMMIT;

Exercise checklist

  • You wrapped multi-statement work in BEGIN/COMMIT.
  • You used REPEATABLE READ where a stable snapshot mattered.
  • Your upsert pattern is idempotent (can be safely re-run).
  • You avoided long-running transactions when not needed.

Common mistakes and self-check

  • Mistake: Running long ad-hoc transactions at SERIALIZABLE by default. Fix: Use the lowest isolation that still guarantees correctness (often READ COMMITTED for simple lookups, REPEATABLE READ for multi-step reporting).
  • Mistake: Forgetting COMMIT/ROLLBACK and leaving locks open. Self-check: After practice, run a quick “show active sessions/transactions” check in your DB client (see the query sketch after this list) and confirm none are idle in transaction.
  • Mistake: Recomputing and INSERTing totals without idempotency. Fix: Use MERGE/UPSERT keyed by date or dimension IDs.
  • Mistake: Assuming SELECT never blocks. Reality: Depends on engine and isolation; use appropriate isolation or MVCC settings if needed.
  • Mistake: Mixing reads and writes to the same rows in different transactions concurrently. Fix: SELECT ... FOR UPDATE for rows you intend to modify.
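
The “idle in transaction” check mentioned above, as a PostgreSQL sketch (pg_stat_activity; MySQL exposes similar data via SHOW PROCESSLIST):

-- Sessions that opened a transaction and went quiet (likely a forgotten COMMIT/ROLLBACK)
SELECT pid, usename, state, xact_start, query
FROM pg_stat_activity
WHERE state = 'idle in transaction';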

Learning path

  • Before this: SQL joins and aggregation; primary keys and indexes.
  • Now: Transactions, isolation, and locks (this lesson).
  • Next: Idempotent ETL patterns, scheduling, and monitoring; performance basics (indexes, query plans) to reduce lock time.

Next steps

  • Take the Quick Test below to validate understanding. Everyone can take it; only logged-in users have their progress saved.
  • Apply REPEATABLE READ in your next reporting job that spans multiple statements.
  • Refactor one write script to use MERGE/UPSERT and be idempotent.

Mini challenge

Write one transaction that: (1) reads all paid orders for yesterday, (2) updates a daily summary table with the exact total, and (3) cannot double-count if the job is retried. Choose the lowest isolation that still guarantees consistent results and explain why.


Transactions and Isolation Levels: ACID and Locks — Quick Test

Test your knowledge with 8 questions. Pass with 70% or higher.

