Why this matters
As a Data Analyst, you often need tables you can trust and evolve. DDL (Data Definition Language) helps you:
- Create clean staging or sandbox tables for analysis.
- Alter table structure when requirements change (rename columns, add fields, add constraints).
- Drop temporary or obsolete tables to keep the workspace tidy and fast.
Concept explained simply
DDL is how you define and change database objects such as tables. The three essentials:
CREATE: make a new table (its name, columns, types, and constraints).ALTER: change an existing table (add/rename/drop columns, modify data types, add constraints).DROP: remove a table entirely.
Mental model
- Think of
CREATEas drafting a blueprint and building the frame. ALTERis remodeling: add a new room (column), widen a door (increase column length), put rules on usage (constraints).DROPis demolition: remove the building when it’s no longer useful.
Core syntax you will use often
CREATE TABLE (with safe guards)
-- Create only if the table does not exist
CREATE TABLE IF NOT EXISTS analytics_sandbox.products (
product_id INT PRIMARY KEY,
product_name VARCHAR(100) NOT NULL,
category VARCHAR(50),
price DECIMAL(10,2) NOT NULL DEFAULT 0,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
CONSTRAINT price_nonneg CHECK (price >= 0)
);
Notes: IF NOT EXISTS avoids errors if the table is already there. Default values and checks protect data quality.
ALTER TABLE (common operations)
-- Add a column with a default
ALTER TABLE analytics_sandbox.products
ADD COLUMN is_active BOOLEAN NOT NULL DEFAULT TRUE;
-- Rename a column
-- PostgreSQL:
ALTER TABLE analytics_sandbox.products RENAME COLUMN product_name TO name;
-- Change data type (widen size)
ALTER TABLE analytics_sandbox.products
ALTER COLUMN category TYPE VARCHAR(80);
-- Add a UNIQUE constraint
ALTER TABLE analytics_sandbox.products
ADD CONSTRAINT uq_products_name UNIQUE (name);
Tip: Widening a type (e.g., VARCHAR(50) to VARCHAR(80)) is usually safe. Narrowing can fail if data does not fit.
DROP TABLE (safe patterns)
-- Safely remove a table if it exists
DROP TABLE IF EXISTS analytics_sandbox.products_backup;
-- If dependencies exist, some systems support CASCADE
-- Use carefully: it can drop related objects
-- DROP TABLE IF EXISTS analytics_sandbox.products CASCADE;
Always double-check the target table and understand dependencies before dropping.
Dialect notes (PostgreSQL, MySQL, SQL Server)
- PostgreSQL: supports
IF NOT EXISTSfor CREATE andDROP ... IF EXISTS;ALTER COLUMN TYPEis common; boolean isBOOLEAN. - MySQL: supports
IF NOT EXISTSandDROP ... IF EXISTS; boolean is stored asTINYINT(1); check constraints are enforced in recent versions. - SQL Server: use
IF OBJECT_ID('schema.table','U') IS NULLchecks before CREATE; drop withDROP TABLE IF EXISTS schema.table(supported in newer versions); change types withALTER COLUMN; boolean isBIT.
Worked examples
Example 1 — Create a clean product table
- Choose a safe schema (e.g.,
analytics_sandbox). - Run:
CREATE TABLE IF NOT EXISTS analytics_sandbox.products (
product_id INT PRIMARY KEY,
name VARCHAR(100) NOT NULL,
category VARCHAR(50),
price DECIMAL(10,2) NOT NULL DEFAULT 0,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
CONSTRAINT price_nonneg CHECK (price >= 0)
);
Result: Table exists with safe defaults and a price check.
Example 2 — Evolve structure with ALTER
- Add a new flag and widen a column to reduce future errors.
ALTER TABLE analytics_sandbox.products
ADD COLUMN is_active BOOLEAN NOT NULL DEFAULT TRUE;
ALTER TABLE analytics_sandbox.products
ALTER COLUMN category TYPE VARCHAR(80);
Result: Column added with default; category widened safely.
Example 3 — Drop a temporary table safely
- Remove yesterday's temp aggregation table.
DROP TABLE IF EXISTS analytics_sandbox.tmp_sales_agg;
Result: Temp table is gone if it existed; no error otherwise.
Safety first
- Work in a sandbox or dev environment. Never experiment on production tables.
- Prefer widening over narrowing column types.
- Use
IF NOT EXISTSandDROP ... IF EXISTSto avoid breaking runs. - Before
DROP, confirm you target the correct schema and table name. - Consider transactions if your system supports them: if an ALTER fails, you can roll back.
Quick safety checklist
- Do I have a recent backup or can I recreate this table?
- Am I in the correct schema/environment?
- Could this change break downstream queries or dashboards?
- If narrowing a type, did I check for data that might not fit?
Exercises
Tip: Everyone can do the exercises. Saved progress is available for logged-in users.
- Before you start: pick or create a sandbox schema you can safely change.
- Mirror of Exercises ex1–ex3 below. Try first, then open the solution.
Exercise ex1 — Create a table with constraints
Create a table analytics_sandbox.customers with:
customer_id INTprimary keyemail VARCHAR(120)not null and uniquesign_up_date DATEnot nullstatus VARCHAR(20)default 'active'
Show a small checklist
- Used
IF NOT EXISTS? - Included UNIQUE on email?
- Provided sensible defaults?
Exercise ex2 — Alter the table safely
Alter analytics_sandbox.customers:
- Add
last_login TIMESTAMPwith defaultNULL. - Rename
statustoaccount_status. - Widen
account_statustoVARCHAR(30).
Checklist before running
- Confirm the table name and schema.
- Rename first, then widen type (or vice versa depending on dialect support).
Exercise ex3 — Drop a temp table
Drop analytics_sandbox.tmp_customers_stage if it exists. Ensure your command does not error if it’s already gone.
Common mistakes and self-check
- Forgetting the schema: running against the default schema by accident. Self-check: fully-qualify names (e.g.,
analytics_sandbox.table). - Narrowing types prematurely: e.g.,
VARCHAR(20)for emails. Self-check: sample actual data lengths before narrowing. - Dropping the wrong table: ambiguous names. Self-check: run
SELECT * FROM information_schema.tables(or your DB’s catalog) to confirm target. - Skipping constraints: missing UNIQUE/NOT NULL leads to dirty data. Self-check: list required rules and translate them into constraints.
Practical projects
- Project 1: Build a sandbox schema. Create
orders,order_items, andcustomerstables with primary/foreign keys and basic checks. - Project 2: Model evolution. Start with a minimal
productstable. As requirements arise (tags, active flag, longer names), applyALTER TABLEchanges and keep a simple change log. - Project 3: Cleanup job. Identify obsolete temp tables older than 30 days and write safe
DROP TABLE IF EXISTSscripts to tidy up.
Who this is for
- Data Analysts who need to create and evolve reliable tables.
- BI Analysts preparing clean structures for dashboards.
- Anyone moving from ad-hoc SELECTs to managed analytical datasets.
Prerequisites
- Basic SELECT skills and comfort with schemas/databases.
- Access to a safe environment (sandbox or dev database).
- Basic understanding of common types: INT, VARCHAR, DATE/TIMESTAMP, DECIMAL.
Learning path
- This: DDL Create/Alter/Drop (table structures).
- Next: Constraints and Indexes (PK, FK, UNIQUE, CHECK, performance basics).
- Then: DML Insert/Update/Delete and CTAS (populate and reshape data).
- Later: Views and Materialized Views (share curated datasets safely).
Next steps
- Apply DDL to one real dataset you use weekly. Evolve its table safely.
- Create a short checklist you run before any
ALTERorDROP. - Optionally, keep a text file of DDL changes (a simple change log) to track evolution.
Quick Test (save progress when logged in)
Everyone can take the quick test; logged-in users will have progress saved.
Mini challenge
Your stakeholder wants to add product tags and ensure emails are unique in customers. Draft the minimal set of DDL statements to:
- Add a
tags VARCHAR(200)column toanalytics_sandbox.products(nullable is fine). - Ensure
analytics_sandbox.customers.emailis unique (add a constraint if missing). - Safely remove
analytics_sandbox.tmp_sales_agg_oldif it still exists.
Aim for safe patterns (IF EXISTS/IF NOT EXISTS) and future-proof types.