luvv to helpDiscover the Best Free Online Tools
Topic 1 of 12

Understanding Tables And Schemas

Learn Understanding Tables And Schemas for free with explanations, exercises, and a quick test (for Business Analyst).

Published: December 20, 2025 | Updated: December 20, 2025

Why this matters

Business Analysts regularly answer questions like “Which customers churned last month?” or “What is the refund rate by product?”. To do this confidently, you must know where data lives (schemas), how it is organized (tables), and how tables relate (keys and constraints). Understanding these basics helps you:

  • Find the right tables quickly without guessing.
  • Join data safely and avoid incorrect counts or duplicates.
  • Read ERDs and data dictionaries to design accurate queries.
  • Communicate clearly with engineers and analysts using shared terms.

Concept explained simply

Think of a database like a city:

  • Database = the city.
  • Schema = a district (e.g., sales, marketing, finance).
  • Table = a building for one thing (e.g., customers, orders).
  • Column = a room storing a specific kind of info (e.g., email, created_at).
  • Row = a single record (one customer, one order).

Primary Key (PK): a column (or set of columns) uniquely identifying a row. Often named id or table_name_id.

Foreign Key (FK): a column pointing to another table’s PK, linking related records (orders.customer_id → customers.customer_id).

Constraints: rules a table enforces (NOT NULL, UNIQUE, CHECK, PRIMARY KEY, FOREIGN KEY). They keep data clean and relationships sound.

Mental model

  • Schemas group related tables and control access.
  • Each table is about one concept; columns are attributes of that concept.
  • Relationships are declared with keys; most analysis requires joining across them.
  • Data types matter: dates behave differently from text; numbers from strings.
  • NULL means “unknown/missing”, not zero or empty string.
How to view schemas and tables (common approaches)

Many systems support INFORMATION_SCHEMA views:

  • List tables in a schema: SELECT table_name FROM information_schema.tables WHERE table_schema = 'sales';
  • List columns in a table: SELECT column_name, data_type, is_nullable FROM information_schema.columns WHERE table_schema = 'sales' AND table_name = 'customers';

Other handy commands by engine:

  • PostgreSQL: \dt sales.* in psql; information_schema as above.
  • MySQL: SHOW TABLES FROM your_db; SHOW COLUMNS FROM sales.customers;
  • SQL Server: SELECT name FROM sys.tables WHERE schema_id = SCHEMA_ID('sales');
  • SQLite: PRAGMA table_info(customers); SELECT name FROM sqlite_master WHERE type='table';

Worked examples

Example 1 — Find the right tables for a KPI

Goal: “Monthly orders and revenue.”

  1. Identify tables: orders (order_id, customer_id, order_date, status), order_items (order_id, product_id, quantity, unit_price).
  2. Join logic: orders.order_id = order_items.order_id.
  3. Aggregation: revenue per month = SUM(quantity * unit_price) grouped by month(order_date).
Sample SQL (general idea)
SELECT DATE_TRUNC('month', o.order_date) AS month,
       COUNT(DISTINCT o.order_id) AS orders,
       SUM(oi.quantity * oi.unit_price) AS revenue
FROM sales.orders o
JOIN sales.order_items oi ON oi.order_id = o.order_id
WHERE o.status = 'completed'
GROUP BY 1
ORDER BY 1;

If DATE_TRUNC is unavailable, group by YEAR(order_date), MONTH(order_date) or use equivalent functions.

Example 2 — Check keys before joining customers to orders

We need customer lifetime value. First, confirm keys:

  • customers.customer_id is PRIMARY KEY (unique, not null).
  • orders.customer_id is a FOREIGN KEY referencing customers.customer_id.
Schema check (using INFORMATION_SCHEMA)
-- Columns and data types
SELECT column_name, data_type, is_nullable
FROM information_schema.columns
WHERE table_schema = 'sales' AND table_name = 'customers';

-- Keys (example joining constraints; some engines vary)
SELECT tc.constraint_type, kcu.table_name, kcu.column_name, ccu.table_name AS ref_table, ccu.column_name AS ref_column
FROM information_schema.table_constraints tc
JOIN information_schema.key_column_usage kcu
  ON tc.constraint_name = kcu.constraint_name AND tc.table_schema = kcu.table_schema
LEFT JOIN information_schema.constraint_column_usage ccu
  ON ccu.constraint_name = tc.constraint_name AND ccu.table_schema = tc.table_schema
WHERE tc.table_schema = 'sales' AND kcu.table_name IN ('customers','orders');

Once verified, a safe join for “total spend by customer” is straightforward.

Example 3 — Avoid NULL pitfalls

Goal: Average discount percentage from order_items.discount_pct. If some rows have NULL (no discount stored), using AVG(discount_pct) ignores NULLs, which is usually correct. But if you need missing = 0% discount, coalesce first.

Sample SQL
-- Ignores NULLs
SELECT AVG(discount_pct) AS avg_discount
FROM sales.order_items;

-- Treat NULL as 0
SELECT AVG(COALESCE(discount_pct, 0)) AS avg_discount_including_zeros
FROM sales.order_items;

Hands-on exercises

These mirror the exercises below (Ex1 and Ex2). Run them in your environment or adapt the queries. If you don’t have the exact schema, substitute your database and schema names.

  1. Inventory scan: list all tables in the sales schema and the columns of customers.
  2. Relationship map: list primary keys and foreign keys among customers, orders, order_items, products, and payments.

Exercise checklist

  • I can list all tables in a specific schema.
  • I can list columns, data types, and nullability for a table.
  • I can identify primary keys and foreign keys programmatically.
  • I can explain the join path among core sales tables without guessing.

Common mistakes and self-check

  • Mistake: Assuming a column is unique without checking. Self-check: COUNT(*), COUNT(DISTINCT col) and look for duplicates.
  • Mistake: Using = NULL in filters. Self-check: Replace with IS NULL or IS NOT NULL.
  • Mistake: Joining on names or emails instead of IDs. Self-check: Prefer PK→FK joins; verify indexes and constraints.
  • Mistake: Ignoring schema prefixes (e.g., selecting from the wrong table with the same name in another schema). Self-check: Always qualify as schema.table.
  • Mistake: Forgetting many-to-many needs a bridge table. Self-check: Look for a junction like product_categories(product_id, category_id).

Who this is for

  • Aspiring and junior Business Analysts who need to read and write basic SQL.
  • Non-technical stakeholders who review data models and want reliable numbers.

Prerequisites

  • Very basic SQL SELECT and WHERE.
  • Familiarity with rows, columns, and simple aggregates (COUNT, SUM).

Learning path

  1. Understand schemas, tables, and columns (this lesson).
  2. Learn keys and joins (inner, left, right, full) and when to use each.
  3. Practice filtering and aggregation with GROUP BY and HAVING.
  4. Explore data types, NULL handling, and date/time functions.
  5. Build reusable queries and views for recurring business questions.

Practical projects

  • Sales model map: Diagram the core sales schema tables and their key relationships.
  • Data dictionary snippet: Generate a table/column inventory with data types and nullability for a chosen schema.
  • Join audit: Write queries that verify PK uniqueness and FK referential coverage (percent of orphaned rows).

Next steps

  • Move on to joins and filtering to answer multi-table questions confidently.
  • Automate schema checks as part of your analysis checklist.

Mini challenge

Given tables: customers(customer_id PK), subscriptions(subscription_id PK, customer_id FK, plan_id FK, status, started_at), plans(plan_id PK, plan_name). Task: Return active subscriptions by plan with unique subscriber counts.

Hint

Join subscriptions to plans on plan_id. Filter status = 'active'. Count DISTINCT customer_id by plan.

Quick Test

Anyone can take the test for free. If you are logged in, your progress will be saved automatically.

Practice Exercises

2 exercises to complete

Instructions

Goal: Inventory the sales schema and inspect the customers table.

  1. List all table names in the sales schema.
  2. List column_name, data_type, and is_nullable for sales.customers.

Use INFORMATION_SCHEMA where possible. If unavailable, use your engine’s alternative commands.

Expected Output
Tables (example): customers, orders, order_items, products, payments. Columns (example for customers): customer_id (bigint, NO), first_name (varchar, YES), last_name (varchar, YES), email (varchar, YES), created_at (timestamp, NO).

Understanding Tables And Schemas — Quick Test

Test your knowledge with 10 questions. Pass with 70% or higher.

10 questions70% to pass

Have questions about Understanding Tables And Schemas?

AI Assistant

Ask questions about this tool