How to learn DML Insert Update Delete Merge for SQL in Data Analyst for free

Why this matters

As a Data Analyst, you often fix data quality issues, refresh reference tables, and prepare datasets for reporting. DML (Data Manipulation Language) lets you do that safely and efficiently:

Insert: load new records (e.g., new customers from a landing table).
Update: correct values (e.g., mark shipped orders, fix category mappings).
Delete: remove bad data (e.g., test rows or duplicates).
Merge: upsert in one step—insert new rows, update existing rows.

Real tasks you might do

Append monthly transactions into a fact table.
Update customer segments after a new scoring model.
Delete duplicate signups created by a bug.
Merge product price changes from a vendor feed.

Who this is for

Data Analysts who already write SELECT queries and want to change data safely.
Anyone tasked with preparing or correcting tables for dashboards, reports, or models.

Prerequisites

Comfort with SELECT, WHERE, JOIN, and GROUP BY.
Basic understanding of primary keys and constraints (unique, not null).
Know your environment: do you have permission to write to tables? Is auto-commit on?

Concept explained simply

DML are verbs for rows:

INSERT adds new rows.
UPDATE changes existing rows.
DELETE removes rows.
MERGE decides row-by-row whether to UPDATE or INSERT (and sometimes DELETE).

Almost always, you combine them with a WHERE or a join condition so you modify exactly the rows you intend.

Mental model

Imagine a spreadsheet with a unique key column. INSERT adds new lines, UPDATE edits cells for matching keys, DELETE removes lines, MERGE reads a change list and applies each row: if the key exists, edit; if not, add.

Safety first: transactions and WHERE

Preview first: run a SELECT that returns the same rows you intend to UPDATE or DELETE.

Start a transaction when possible:

BEGIN;
-- your UPDATE/DELETE
-- SELECT to verify results
ROLLBACK; -- or COMMIT when you're 100% sure

Always include a WHERE (or matching ON) when changing data. No WHERE = change all rows.
If supported, use RETURNING to see what changed.

Dialect notes

MERGE is supported in many systems. Some engines use alternatives like INSERT ... ON CONFLICT/ON DUPLICATE KEY UPDATE for upserts.
UPDATE ... FROM syntax varies slightly by database.
Always test on a small subset before running on full tables.

Core syntax cheat sheet

-- INSERT
INSERT INTO schema.table (col1, col2)
VALUES (val1, val2);

-- INSERT from SELECT
INSERT INTO target_table (col1, col2)
SELECT col1, col2 FROM source_table;

-- UPDATE (basic)
UPDATE target t
SET col = new_value
WHERE condition;

-- UPDATE with join (typical pattern)
UPDATE target t
SET col = s.new_col
FROM source s
WHERE t.key = s.key;

-- DELETE
DELETE FROM target
WHERE condition;

-- MERGE (ANSI-style)
MERGE INTO target t
USING source s
   ON t.key = s.key
WHEN MATCHED THEN
  UPDATE SET t.col = s.col
WHEN NOT MATCHED THEN
  INSERT (key, col) VALUES (s.key, s.col);

Using RETURNING (when available)

UPDATE orders
SET status = 'shipped'
WHERE shipped_at IS NOT NULL AND status <> 'shipped'
RETURNING id, status;

Worked examples

Example 1 — Insert only new customers

Goal: append customers that are not already in the table by email.

-- Sample tables
CREATE TABLE IF NOT EXISTS customers (
  id SERIAL PRIMARY KEY,
  email TEXT UNIQUE,
  name  TEXT,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE IF NOT EXISTS staging_new_customers (
  email TEXT,
  name  TEXT
);

-- Reset sample data (optional)
DELETE FROM staging_new_customers; DELETE FROM customers;
INSERT INTO customers (email, name) VALUES
 ('alice@example.com','Alice'),
 ('bob@example.com','Bob');
INSERT INTO staging_new_customers (email, name) VALUES
 ('carol@example.com','Carol'),
 ('alice@example.com','Alice A.');

-- Insert only those not present
INSERT INTO customers (email, name)
SELECT s.email, s.name
FROM staging_new_customers s
LEFT JOIN customers c ON c.email = s.email
WHERE c.email IS NULL;

-- Preview results
SELECT email, name FROM customers ORDER BY email;

Result: carol@example.com is inserted; alice@example.com is skipped.

Example 2 — Update shipped orders

Goal: set status='shipped' for orders with a shipped_at timestamp.

CREATE TABLE IF NOT EXISTS orders (
  id INT PRIMARY KEY,
  customer_id INT,
  status TEXT,
  shipped_at TIMESTAMP
);

DELETE FROM orders;
INSERT INTO orders (id, customer_id, status, shipped_at) VALUES
 (1, 1, 'pending', '2023-01-02'),
 (2, 2, 'pending', '2023-01-03'),
 (3, 1, 'pending', NULL);

UPDATE orders
SET status = 'shipped'
WHERE shipped_at IS NOT NULL
  AND status <> 'shipped';

SELECT id, status FROM orders ORDER BY id;

Result: rows 1 and 2 become shipped; row 3 remains pending.

Example 3 — Delete duplicates, keep the smallest id

Goal: remove duplicate emails, keeping the earliest id.

WITH ranked AS (
  SELECT id, email,
         ROW_NUMBER() OVER (PARTITION BY email ORDER BY id) AS rn
  FROM customers
)
DELETE FROM customers
USING ranked r
WHERE customers.id = r.id
  AND r.rn > 1;

Tip: Preview with the CTE SELECT first to see which rows have rn > 1.

Example 4 — MERGE product prices (upsert)

Goal: update existing product prices and insert new SKUs.

CREATE TABLE IF NOT EXISTS products (
  sku TEXT PRIMARY KEY,
  name TEXT,
  price NUMERIC(10,2)
);
CREATE TABLE IF NOT EXISTS staging_prices (
  sku TEXT,
  price NUMERIC(10,2)
);

DELETE FROM products; DELETE FROM staging_prices;
INSERT INTO products (sku, name, price) VALUES
 ('P1','Pen',10.00),
 ('P2','Notebook',15.00);
INSERT INTO staging_prices (sku, price) VALUES
 ('P2',17.00),
 ('P3',12.00);

MERGE INTO products p
USING staging_prices s
  ON p.sku = s.sku
WHEN MATCHED THEN
  UPDATE SET price = s.price
WHEN NOT MATCHED THEN
  INSERT (sku, name, price) VALUES (s.sku, COALESCE(p.name,'New'), s.price);

SELECT sku, price FROM products ORDER BY sku;

Result: P2 price updated to 17.00; P3 inserted at 12.00.

Exercises

Use these small schemas; run the setup in each exercise once. Then attempt the task. Check off steps as you go.

I ran the setup statements.
I previewed affected rows with SELECT before changing data.
I wrapped changes in a transaction (if supported) and verified results.

Exercise 1 — Insert new customers without duplicates (EX1)

Setup

CREATE TABLE IF NOT EXISTS customers (
  id SERIAL PRIMARY KEY,
  email TEXT UNIQUE,
  name  TEXT
);
CREATE TABLE IF NOT EXISTS staging_new_customers (
  email TEXT,
  name  TEXT
);
DELETE FROM customers; DELETE FROM staging_new_customers;
INSERT INTO customers (email, name) VALUES
 ('alice@example.com','Alice'),
 ('bob@example.com','Bob');
INSERT INTO staging_new_customers (email, name) VALUES
 ('carol@example.com','Carol'),
 ('alice@example.com','Alice New');

Task: Insert from staging_new_customers into customers but only for emails not already present. Return inserted rows if possible.

Hint

Think LEFT JOIN ... WHERE target.email IS NULL or NOT EXISTS.
RETURNING can show what was inserted.

Exercise 2 — Update order status to shipped (EX2)

Setup

CREATE TABLE IF NOT EXISTS orders (
  id INT PRIMARY KEY,
  customer_id INT,
  status TEXT,
  shipped_at TIMESTAMP
);
DELETE FROM orders;
INSERT INTO orders (id, customer_id, status, shipped_at) VALUES
 (1, 1, 'pending', '2023-01-02'),
 (2, 2, 'pending', '2023-01-03'),
 (3, 1, 'pending', NULL);

Task: Update orders so that any row with shipped_at IS NOT NULL has status='shipped', but avoid re-updating already 'shipped' rows.

Hint

WHERE shipped_at IS NOT NULL AND status <> 'shipped'
Add RETURNING if supported to verify.

Exercise 3 — MERGE product prices (EX3)

Setup

CREATE TABLE IF NOT EXISTS products (
  sku TEXT PRIMARY KEY,
  name TEXT,
  price NUMERIC(10,2)
);
CREATE TABLE IF NOT EXISTS staging_prices (
  sku TEXT,
  price NUMERIC(10,2)
);
DELETE FROM products; DELETE FROM staging_prices;
INSERT INTO products (sku, name, price) VALUES
 ('P1','Pen',10.00),
 ('P2','Notebook',15.00);
INSERT INTO staging_prices (sku, price) VALUES
 ('P2',17.00),
 ('P3',12.00);

Task: MERGE staging_prices into products to update price for matching SKUs and insert missing SKUs. Preview results with SELECT afterward.

Hint

ON p.sku = s.sku; WHEN MATCHED UPDATE; WHEN NOT MATCHED INSERT
Set a default name like 'New' when inserting.

Common mistakes and self-check

No WHERE on UPDATE/DELETE: use SELECT preview to ensure the row count is what you expect.
Wrong join in UPDATE/MERGE: verify the join keys are unique; run a SELECT count to detect duplicates.
Overwriting with NULLs: in UPDATE, use COALESCE(s.col, t.col) if the source may have NULLs you don’t want to apply.
Violating constraints: check for duplicate keys before INSERT or use upsert patterns.
Auto-commit surprises: test with a transaction on a small subset.

Self-check mini-list

I can preview affected rows with a SELECT that mimics my WHERE/join.
I can explain why my UPDATE won’t multiply rows.
I know how to undo changes (ROLLBACK) in my environment.

Practical projects

Customer dedup pipeline: load signups into a clean customers table with INSERT-from-SELECT and a NOT EXISTS filter.
Order state refresher: nightly UPDATE of order statuses based on event timestamps, with a change log table capturing RETURNING output.
Reference price upsert: weekly MERGE vendor prices into products; alert when prices change over 20%.

Mini challenge

Create a table product_categories(sku, category) with a unique constraint on sku. Build a staging table with some new categories and some existing. Write a MERGE that:

Updates category for existing SKUs only if the new category is not NULL.
Inserts rows for new SKUs when category is provided.

Hint

Use WHEN MATCHED THEN UPDATE SET category = COALESCE(s.category, t.category) and WHEN NOT MATCHED THEN INSERT ... WHERE s.category IS NOT NULL (supported in some dialects) or filter via USING subquery.

Learning path

Before: SELECT and JOIN fundamentals; constraints and keys.
This lesson: DML—INSERT, UPDATE, DELETE, MERGE; safety patterns.
Next: Transactions, isolation levels, and auditing changes; incremental ETL patterns.

Next steps

Re-run the worked examples on your own sandbox.
Complete the exercises below; then take the quick test.
Tip: The test is available to everyone; only logged-in users get saved progress.

Menu

DML Insert Update Delete Merge

Table of Contents

Why this matters

Who this is for

Prerequisites

Concept explained simply

Mental model

Safety first: transactions and WHERE

Core syntax cheat sheet

Worked examples

Exercises

Exercise 1 — Insert new customers without duplicates (EX1)

Exercise 2 — Update order status to shipped (EX2)

Exercise 3 — MERGE product prices (EX3)

Common mistakes and self-check

Practical projects

Mini challenge

Learning path

Next steps

Practice Exercises

Insert new customers without duplicates

Instructions

Expected Output

Update order status to shipped

MERGE product prices (upsert)

DML Insert Update Delete Merge — Quick Test

Have questions about DML Insert Update Delete Merge?

AI Assistant