Menu

Topic 24 of 31

Stored Procedures and Functions

Learn Stored Procedures and Functions for free with explanations, exercises, and a quick test (for Data Analyst).

Published: December 19, 2025 | Updated: December 19, 2025

Why this matters

As a Data Analyst, you often repeat the same transformations: cleaning text, standardizing dates, calculating KPIs, and rebuilding reporting tables. Stored procedures and functions let you package this logic once and reuse it consistently across dashboards, ad-hoc queries, and scheduled jobs. They reduce errors, speed up analysis, and make your work auditable and shareable.

  • Automate report refreshes (e.g., nightly aggregates)
  • Keep business rules consistent (e.g., revenue buckets, null handling)
  • Speed up analysts: call reusable logic instead of copying queries

Who this is for

  • Data Analysts who write SQL daily and maintain recurring reports
  • Anyone who wants reusable, tested SQL logic for consistent analytics

Prerequisites

  • Comfort with SELECT, WHERE, JOIN, GROUP BY
  • Basic data types (text, numeric, date)
  • Ability to create tables/objects in your database (or a sandbox)

Concept explained simply

  • Function: a reusable expression that returns a value (scalar) or a result set (table-valued, depending on DB). Typically used inside SELECT/WHERE/ORDER BY.
  • Stored procedure: a named routine that executes a sequence of statements; can accept parameters, use transactions, control flow, and often perform write operations.

Mental model

  • Function = calculator: input in, pure output out, often no side effects.
  • Procedure = recipe: a series of steps, can mix reads/writes, parameters, and error handling.

Note: Capabilities vary by database. Many systems restrict functions from writing data, while procedures can manage transactions and modifications.

Core patterns you must know

Pattern 1: Derive a standard date (month start)

Goal: Return the first day of a given date's month for grouping and partitioning.

-- PostgreSQL
CREATE OR REPLACE FUNCTION get_month_start(p_date date)
RETURNS date LANGUAGE sql AS $$
  SELECT date_trunc('month', p_date)::date;
$$;

-- SQL Server
CREATE OR ALTER FUNCTION dbo.get_month_start (@d date)
RETURNS date AS
BEGIN
  RETURN DATEFROMPARTS(YEAR(@d), MONTH(@d), 1);
END;

-- MySQL 8+
DELIMITER //
CREATE FUNCTION get_month_start(p_date date)
RETURNS date DETERMINISTIC
BEGIN
  RETURN DATE_FORMAT(p_date, '%Y-%m-01');
END //
DELIMITER ;

Usage example:

SELECT get_month_start(order_date) AS month_start, SUM(amount)
FROM sales
GROUP BY get_month_start(order_date);
Pattern 2: Safe math (avoid divide-by-zero)

Goal: Return numerator/denominator, but gracefully handle zero/NULL.

-- PostgreSQL
CREATE OR REPLACE FUNCTION safe_divide(n numeric, d numeric)
RETURNS numeric LANGUAGE sql AS $$
  SELECT CASE WHEN d IS NULL OR d = 0 THEN NULL ELSE n / d END;
$$;

-- SQL Server
CREATE OR ALTER FUNCTION dbo.safe_divide (@n decimal(18,6), @d decimal(18,6))
RETURNS decimal(38,6) AS
BEGIN
  RETURN CASE WHEN @d IS NULL OR @d = 0 THEN NULL ELSE @n / @d END;
END;

-- MySQL 8+
DELIMITER //
CREATE FUNCTION safe_divide(n decimal(18,6), d decimal(18,6))
RETURNS decimal(38,6) DETERMINISTIC
BEGIN
  RETURN IF(d IS NULL OR d = 0, NULL, n / d);
END //
DELIMITER ;

Usage example:

SELECT safe_divide(SUM(revenue), NULLIF(SUM(cost), 0)) AS margin_ratio FROM facts;
Pattern 3: Rebuild a reporting table with a procedure

Goal: Clear and repopulate an aggregate table in a controlled transaction.

-- PostgreSQL 11+
CREATE OR REPLACE PROCEDURE rebuild_sales_daily()
LANGUAGE plpgsql AS $$
BEGIN
  BEGIN
    -- Transaction block is implicit in many schedulers; you can also use explicit BEGIN/COMMIT around CALL
    DELETE FROM rpt_sales_daily;
    INSERT INTO rpt_sales_daily(dt, total_orders, total_revenue)
    SELECT order_date, COUNT(*), SUM(amount)
    FROM sales
    GROUP BY order_date;
  EXCEPTION WHEN OTHERS THEN
    RAISE NOTICE 'Rebuild failed: %', SQLERRM;
    RAISE;
  END;
END;$$;

-- SQL Server
CREATE OR ALTER PROCEDURE dbo.rebuild_sales_daily
AS
BEGIN
  SET NOCOUNT ON;
  BEGIN TRY
    BEGIN TRAN;
      DELETE FROM dbo.rpt_sales_daily;
      INSERT INTO dbo.rpt_sales_daily(dt, total_orders, total_revenue)
      SELECT order_date, COUNT(*), SUM(amount)
      FROM dbo.sales
      GROUP BY order_date;
    COMMIT TRAN;
  END TRY
  BEGIN CATCH
    IF @@TRANCOUNT > 0 ROLLBACK TRAN;
    THROW;
  END CATCH
END;

-- MySQL 8+
DELIMITER //
CREATE PROCEDURE rebuild_sales_daily()
BEGIN
  START TRANSACTION;
  DELETE FROM rpt_sales_daily;
  INSERT INTO rpt_sales_daily(dt, total_orders, total_revenue)
  SELECT order_date, COUNT(*), SUM(amount)
  FROM sales
  GROUP BY order_date;
  COMMIT;
END //
DELIMITER ;

Usage:

-- PostgreSQL/MySQL
CALL rebuild_sales_daily();
-- SQL Server
EXEC dbo.rebuild_sales_daily;

Quick syntax cheatsheet

  • Create scalar function: CREATE FUNCTION name(params) RETURNS type ...
  • Create procedure: CREATE PROCEDURE name(params) AS/BEGIN ... END
  • Execute function: use in SELECT/WHERE; Execute procedure: CALL/EXEC
  • Error handling: EXCEPTION (Postgres), TRY/CATCH (SQL Server), HANDLER or check conditions (MySQL)

Exercises

Do these to lock in the concepts. Solutions are hidden under toggles. The same exercises appear in the practice panel below.

Exercise 1 — CLEAN_CATEGORY() function

Write a function CLEAN_CATEGORY(text) that:

  • Trims whitespace
  • Converts to UPPERCASE
  • Returns 'UNKNOWN' if input is NULL or becomes empty after trimming

Test cases:

  • ' hardware ' → 'HARDWARE'
  • NULL → 'UNKNOWN'
  • ' ' → 'UNKNOWN'
  • 'Home Office' → 'HOME OFFICE'
Show solution
-- PostgreSQL
CREATE OR REPLACE FUNCTION clean_category(p_text text)
RETURNS text LANGUAGE sql AS $$
  SELECT COALESCE(NULLIF(UPPER(TRIM(p_text)), ''), 'UNKNOWN');
$$;

-- SQL Server
CREATE OR ALTER FUNCTION dbo.clean_category (@t varchar(200))
RETURNS varchar(200) AS
BEGIN
  DECLARE @x varchar(200) = UPPER(LTRIM(RTRIM(@t)));
  RETURN CASE WHEN @x IS NULL OR @x = '' THEN 'UNKNOWN' ELSE @x END;
END;

-- MySQL 8+
DELIMITER //
CREATE FUNCTION clean_category(p_text varchar(200))
RETURNS varchar(200) DETERMINISTIC
BEGIN
  SET @x = UPPER(TRIM(p_text));
  RETURN IF(@x IS NULL OR @x = '', 'UNKNOWN', @x);
END //
DELIMITER ;
Exercise 2 — REFRESH_TOP_PRODUCTS procedure

Create a reporting table if needed:

-- Example table
CREATE TABLE IF NOT EXISTS rpt_top_products (
  product_id int,
  total_qty bigint
);

Write a procedure REFRESH_TOP_PRODUCTS(p_days INT, p_min_qty INT) that:

  • Deletes all rows from rpt_top_products
  • Inserts the top 10 products by SUM(quantity) from table sales within the last p_days
  • Filters out rows where quantity < p_min_qty before aggregation
  • Sorts by total_qty DESC

Expected result: rpt_top_products contains up to 10 rows with columns (product_id, total_qty) for the selected window.

Show solution
-- PostgreSQL 11+
CREATE OR REPLACE PROCEDURE refresh_top_products(p_days int, p_min_qty int)
LANGUAGE plpgsql AS $$
BEGIN
  DELETE FROM rpt_top_products;
  INSERT INTO rpt_top_products(product_id, total_qty)
  SELECT product_id, SUM(quantity) AS total_qty
  FROM sales
  WHERE order_date >= CURRENT_DATE - p_days
    AND quantity >= p_min_qty
  GROUP BY product_id
  ORDER BY total_qty DESC
  LIMIT 10;
END;$$;

-- SQL Server
CREATE OR ALTER PROCEDURE dbo.refresh_top_products
  @p_days int,
  @p_min_qty int
AS
BEGIN
  SET NOCOUNT ON;
  BEGIN TRY
    BEGIN TRAN;
      DELETE FROM dbo.rpt_top_products;
      INSERT INTO dbo.rpt_top_products(product_id, total_qty)
      SELECT TOP 10 product_id, SUM(quantity) AS total_qty
      FROM dbo.sales
      WHERE order_date >= DATEADD(day, -@p_days, CONVERT(date, GETDATE()))
        AND quantity >= @p_min_qty
      GROUP BY product_id
      ORDER BY SUM(quantity) DESC;
    COMMIT TRAN;
  END TRY
  BEGIN CATCH
    IF @@TRANCOUNT > 0 ROLLBACK TRAN;
    THROW;
  END CATCH
END;

-- MySQL 8+
DELIMITER //
CREATE PROCEDURE refresh_top_products(IN p_days int, IN p_min_qty int)
BEGIN
  START TRANSACTION;
  DELETE FROM rpt_top_products;
  INSERT INTO rpt_top_products(product_id, total_qty)
  SELECT product_id, SUM(quantity) AS total_qty
  FROM sales
  WHERE order_date >= (CURRENT_DATE - INTERVAL p_days DAY)
    AND quantity >= p_min_qty
  GROUP BY product_id
  ORDER BY total_qty DESC
  LIMIT 10;
  COMMIT;
END //
DELIMITER ;
  • [ ] I created the objects successfully without errors
  • [ ] My functions return expected values for the provided cases
  • [ ] The procedure repopulates the report table and respects parameters

Common mistakes and how to self-check

  • Using a function for writes: Many databases disallow DML in functions. Self-check: Does your function try to INSERT/UPDATE/DELETE?
  • Forgetting NULL handling: TRIM/UPPER on NULL stays NULL. Self-check: Add COALESCE/NULLIF where needed; test with NULL and blank inputs.
  • Unbounded refresh: Missing date filters in procedures. Self-check: Verify WHERE clauses; print row counts before and after in a safe environment.
  • Transaction safety: Not wrapping multi-step writes. Self-check: Intentionally force an error mid-procedure in a sandbox to ensure rollback works.
  • Parameter misuse: Comparing different types. Self-check: CAST parameters to expected types and validate ranges (e.g., days >= 0).

Practical projects

  • Reusable cleaning library: Functions for trimming, normalizing case, safe divide, defaulting NULLs.
  • Daily KPI refresh: Procedure that rebuilds daily KPIs (orders, revenue, conversion), with TRY/CATCH or EXCEPTION handling.
  • Ad-hoc toolkit: Procedure that accepts a date window and populates multiple reporting tables in one transaction.

Learning path

  • 1) Write two scalar functions (text cleanup, safe math)
  • 2) Create one table-refresh procedure with parameters
  • 3) Add error handling and a transaction
  • 4) Add basic logging (e.g., insert a row into a log table with start/end times)
  • 5) Reuse functions inside your procedures

Next steps

  • Finish the exercises above and run them against test data
  • Take the quick test to check your understanding
  • Note: The test is available to everyone; only logged-in users have their progress saved

Mini challenge

Create a function ACTIVE_FLAG(last_seen_date, max_days) that returns 'ACTIVE' if last_seen_date is within max_days of today, else 'STALE'. Then write a procedure REFRESH_ACTIVE_USERS(p_days) that rebuilds a table rpt_active_users(user_id, flag) using that function.

Hint
  • Function: compare last_seen_date >= CURRENT_DATE - max_days
  • Procedure: truncate/insert pattern with a transaction

Quick test

Ready? Take the short quiz below. It takes ~3–5 minutes. The test is available to everyone; only logged-in users have their results saved.

Practice Exercises

2 exercises to complete

Instructions

Write a function CLEAN_CATEGORY(text) that:

  • Trims whitespace
  • Converts to UPPERCASE
  • Returns 'UNKNOWN' if input is NULL or becomes empty after trimming

Test with: ' hardware ' → 'HARDWARE'; NULL → 'UNKNOWN'; ' ' → 'UNKNOWN'; 'Home Office' → 'HOME OFFICE'

Expected Output
HARDWARE; UNKNOWN; UNKNOWN; HOME OFFICE

Stored Procedures and Functions — Quick Test

Test your knowledge with 8 questions. Pass with 70% or higher.

8 questions70% to pass

Have questions about Stored Procedures and Functions?

AI Assistant

Ask questions about this tool