luvv to helpDiscover the Best Free Online Tools
Topic 8 of 31

DML Insert Update Delete Merge

Learn DML Insert Update Delete Merge for free with explanations, exercises, and a quick test (for Data Analyst).

Published: December 19, 2025 | Updated: December 19, 2025

Why this matters

As a Data Analyst, you often fix data quality issues, refresh reference tables, and prepare datasets for reporting. DML (Data Manipulation Language) lets you do that safely and efficiently:

  • Insert: load new records (e.g., new customers from a landing table).
  • Update: correct values (e.g., mark shipped orders, fix category mappings).
  • Delete: remove bad data (e.g., test rows or duplicates).
  • Merge: upsert in one step—insert new rows, update existing rows.
Real tasks you might do
  • Append monthly transactions into a fact table.
  • Update customer segments after a new scoring model.
  • Delete duplicate signups created by a bug.
  • Merge product price changes from a vendor feed.

Who this is for

  • Data Analysts who already write SELECT queries and want to change data safely.
  • Anyone tasked with preparing or correcting tables for dashboards, reports, or models.

Prerequisites

  • Comfort with SELECT, WHERE, JOIN, and GROUP BY.
  • Basic understanding of primary keys and constraints (unique, not null).
  • Know your environment: do you have permission to write to tables? Is auto-commit on?

Concept explained simply

DML are verbs for rows:

  • INSERT adds new rows.
  • UPDATE changes existing rows.
  • DELETE removes rows.
  • MERGE decides row-by-row whether to UPDATE or INSERT (and sometimes DELETE).

Almost always, you combine them with a WHERE or a join condition so you modify exactly the rows you intend.

Mental model

Imagine a spreadsheet with a unique key column. INSERT adds new lines, UPDATE edits cells for matching keys, DELETE removes lines, MERGE reads a change list and applies each row: if the key exists, edit; if not, add.

Safety first: transactions and WHERE

  • Preview first: run a SELECT that returns the same rows you intend to UPDATE or DELETE.
  • Start a transaction when possible:
    BEGIN;
    -- your UPDATE/DELETE
    -- SELECT to verify results
    ROLLBACK; -- or COMMIT when you're 100% sure
  • Always include a WHERE (or matching ON) when changing data. No WHERE = change all rows.
  • If supported, use RETURNING to see what changed.
Dialect notes
  • MERGE is supported in many systems. Some engines use alternatives like INSERT ... ON CONFLICT/ON DUPLICATE KEY UPDATE for upserts.
  • UPDATE ... FROM syntax varies slightly by database.
  • Always test on a small subset before running on full tables.

Core syntax cheat sheet

-- INSERT
INSERT INTO schema.table (col1, col2)
VALUES (val1, val2);

-- INSERT from SELECT
INSERT INTO target_table (col1, col2)
SELECT col1, col2 FROM source_table;

-- UPDATE (basic)
UPDATE target t
SET col = new_value
WHERE condition;

-- UPDATE with join (typical pattern)
UPDATE target t
SET col = s.new_col
FROM source s
WHERE t.key = s.key;

-- DELETE
DELETE FROM target
WHERE condition;

-- MERGE (ANSI-style)
MERGE INTO target t
USING source s
   ON t.key = s.key
WHEN MATCHED THEN
  UPDATE SET t.col = s.col
WHEN NOT MATCHED THEN
  INSERT (key, col) VALUES (s.key, s.col);
Using RETURNING (when available)
UPDATE orders
SET status = 'shipped'
WHERE shipped_at IS NOT NULL AND status <> 'shipped'
RETURNING id, status;

Worked examples

Example 1 — Insert only new customers

Goal: append customers that are not already in the table by email.

-- Sample tables
CREATE TABLE IF NOT EXISTS customers (
  id SERIAL PRIMARY KEY,
  email TEXT UNIQUE,
  name  TEXT,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE IF NOT EXISTS staging_new_customers (
  email TEXT,
  name  TEXT
);

-- Reset sample data (optional)
DELETE FROM staging_new_customers; DELETE FROM customers;
INSERT INTO customers (email, name) VALUES
 ('alice@example.com','Alice'),
 ('bob@example.com','Bob');
INSERT INTO staging_new_customers (email, name) VALUES
 ('carol@example.com','Carol'),
 ('alice@example.com','Alice A.');

-- Insert only those not present
INSERT INTO customers (email, name)
SELECT s.email, s.name
FROM staging_new_customers s
LEFT JOIN customers c ON c.email = s.email
WHERE c.email IS NULL;

-- Preview results
SELECT email, name FROM customers ORDER BY email;

Result: carol@example.com is inserted; alice@example.com is skipped.

Example 2 — Update shipped orders

Goal: set status='shipped' for orders with a shipped_at timestamp.

CREATE TABLE IF NOT EXISTS orders (
  id INT PRIMARY KEY,
  customer_id INT,
  status TEXT,
  shipped_at TIMESTAMP
);

DELETE FROM orders;
INSERT INTO orders (id, customer_id, status, shipped_at) VALUES
 (1, 1, 'pending', '2023-01-02'),
 (2, 2, 'pending', '2023-01-03'),
 (3, 1, 'pending', NULL);

UPDATE orders
SET status = 'shipped'
WHERE shipped_at IS NOT NULL
  AND status <> 'shipped';

SELECT id, status FROM orders ORDER BY id;

Result: rows 1 and 2 become shipped; row 3 remains pending.

Example 3 — Delete duplicates, keep the smallest id

Goal: remove duplicate emails, keeping the earliest id.

WITH ranked AS (
  SELECT id, email,
         ROW_NUMBER() OVER (PARTITION BY email ORDER BY id) AS rn
  FROM customers
)
DELETE FROM customers
USING ranked r
WHERE customers.id = r.id
  AND r.rn > 1;

Tip: Preview with the CTE SELECT first to see which rows have rn > 1.

Example 4 — MERGE product prices (upsert)

Goal: update existing product prices and insert new SKUs.

CREATE TABLE IF NOT EXISTS products (
  sku TEXT PRIMARY KEY,
  name TEXT,
  price NUMERIC(10,2)
);
CREATE TABLE IF NOT EXISTS staging_prices (
  sku TEXT,
  price NUMERIC(10,2)
);

DELETE FROM products; DELETE FROM staging_prices;
INSERT INTO products (sku, name, price) VALUES
 ('P1','Pen',10.00),
 ('P2','Notebook',15.00);
INSERT INTO staging_prices (sku, price) VALUES
 ('P2',17.00),
 ('P3',12.00);

MERGE INTO products p
USING staging_prices s
  ON p.sku = s.sku
WHEN MATCHED THEN
  UPDATE SET price = s.price
WHEN NOT MATCHED THEN
  INSERT (sku, name, price) VALUES (s.sku, COALESCE(p.name,'New'), s.price);

SELECT sku, price FROM products ORDER BY sku;

Result: P2 price updated to 17.00; P3 inserted at 12.00.

Exercises

Use these small schemas; run the setup in each exercise once. Then attempt the task. Check off steps as you go.

  • I ran the setup statements.
  • I previewed affected rows with SELECT before changing data.
  • I wrapped changes in a transaction (if supported) and verified results.

Exercise 1 — Insert new customers without duplicates (EX1)

Setup
CREATE TABLE IF NOT EXISTS customers (
  id SERIAL PRIMARY KEY,
  email TEXT UNIQUE,
  name  TEXT
);
CREATE TABLE IF NOT EXISTS staging_new_customers (
  email TEXT,
  name  TEXT
);
DELETE FROM customers; DELETE FROM staging_new_customers;
INSERT INTO customers (email, name) VALUES
 ('alice@example.com','Alice'),
 ('bob@example.com','Bob');
INSERT INTO staging_new_customers (email, name) VALUES
 ('carol@example.com','Carol'),
 ('alice@example.com','Alice New');

Task: Insert from staging_new_customers into customers but only for emails not already present. Return inserted rows if possible.

Hint
  • Think LEFT JOIN ... WHERE target.email IS NULL or NOT EXISTS.
  • RETURNING can show what was inserted.

Exercise 2 — Update order status to shipped (EX2)

Setup
CREATE TABLE IF NOT EXISTS orders (
  id INT PRIMARY KEY,
  customer_id INT,
  status TEXT,
  shipped_at TIMESTAMP
);
DELETE FROM orders;
INSERT INTO orders (id, customer_id, status, shipped_at) VALUES
 (1, 1, 'pending', '2023-01-02'),
 (2, 2, 'pending', '2023-01-03'),
 (3, 1, 'pending', NULL);

Task: Update orders so that any row with shipped_at IS NOT NULL has status='shipped', but avoid re-updating already 'shipped' rows.

Hint
  • WHERE shipped_at IS NOT NULL AND status <> 'shipped'
  • Add RETURNING if supported to verify.

Exercise 3 — MERGE product prices (EX3)

Setup
CREATE TABLE IF NOT EXISTS products (
  sku TEXT PRIMARY KEY,
  name TEXT,
  price NUMERIC(10,2)
);
CREATE TABLE IF NOT EXISTS staging_prices (
  sku TEXT,
  price NUMERIC(10,2)
);
DELETE FROM products; DELETE FROM staging_prices;
INSERT INTO products (sku, name, price) VALUES
 ('P1','Pen',10.00),
 ('P2','Notebook',15.00);
INSERT INTO staging_prices (sku, price) VALUES
 ('P2',17.00),
 ('P3',12.00);

Task: MERGE staging_prices into products to update price for matching SKUs and insert missing SKUs. Preview results with SELECT afterward.

Hint
  • ON p.sku = s.sku; WHEN MATCHED UPDATE; WHEN NOT MATCHED INSERT
  • Set a default name like 'New' when inserting.

Common mistakes and self-check

  • No WHERE on UPDATE/DELETE: use SELECT preview to ensure the row count is what you expect.
  • Wrong join in UPDATE/MERGE: verify the join keys are unique; run a SELECT count to detect duplicates.
  • Overwriting with NULLs: in UPDATE, use COALESCE(s.col, t.col) if the source may have NULLs you don’t want to apply.
  • Violating constraints: check for duplicate keys before INSERT or use upsert patterns.
  • Auto-commit surprises: test with a transaction on a small subset.
Self-check mini-list
  • I can preview affected rows with a SELECT that mimics my WHERE/join.
  • I can explain why my UPDATE won’t multiply rows.
  • I know how to undo changes (ROLLBACK) in my environment.

Practical projects

  • Customer dedup pipeline: load signups into a clean customers table with INSERT-from-SELECT and a NOT EXISTS filter.
  • Order state refresher: nightly UPDATE of order statuses based on event timestamps, with a change log table capturing RETURNING output.
  • Reference price upsert: weekly MERGE vendor prices into products; alert when prices change over 20%.

Mini challenge

Create a table product_categories(sku, category) with a unique constraint on sku. Build a staging table with some new categories and some existing. Write a MERGE that:

  • Updates category for existing SKUs only if the new category is not NULL.
  • Inserts rows for new SKUs when category is provided.
Hint

Use WHEN MATCHED THEN UPDATE SET category = COALESCE(s.category, t.category) and WHEN NOT MATCHED THEN INSERT ... WHERE s.category IS NOT NULL (supported in some dialects) or filter via USING subquery.

Learning path

  • Before: SELECT and JOIN fundamentals; constraints and keys.
  • This lesson: DML—INSERT, UPDATE, DELETE, MERGE; safety patterns.
  • Next: Transactions, isolation levels, and auditing changes; incremental ETL patterns.

Next steps

  • Re-run the worked examples on your own sandbox.
  • Complete the exercises below; then take the quick test.
  • Tip: The test is available to everyone; only logged-in users get saved progress.

Practice Exercises

3 exercises to complete

Instructions

Insert only new customers from staging_new_customers into customers based on email. Preview first with a SELECT, then INSERT. If supported, use RETURNING to show inserted rows.

Setup (run once)
CREATE TABLE IF NOT EXISTS customers (
  id SERIAL PRIMARY KEY,
  email TEXT UNIQUE,
  name  TEXT
);
CREATE TABLE IF NOT EXISTS staging_new_customers (
  email TEXT,
  name  TEXT
);
DELETE FROM customers; DELETE FROM staging_new_customers;
INSERT INTO customers (email, name) VALUES
 ('alice@example.com','Alice'),
 ('bob@example.com','Bob');
INSERT INTO staging_new_customers (email, name) VALUES
 ('carol@example.com','Carol'),
 ('alice@example.com','Alice New');
Expected Output
1 row inserted: carol@example.com

DML Insert Update Delete Merge — Quick Test

Test your knowledge with 8 questions. Pass with 70% or higher.

8 questions70% to pass

Have questions about DML Insert Update Delete Merge?

AI Assistant

Ask questions about this tool