
Data Import and Export: COPY, LOAD, UNLOAD

Learn data import and export with COPY, LOAD, and UNLOAD for free, with explanations, exercises, and a quick test (for Data Analysts).

Published: December 19, 2025 | Updated: December 19, 2025

Why this matters

As a Data Analyst, your data rarely starts inside your database in a perfect table. You need to pull in CSVs from marketing, export clean datasets for teammates, and move large files efficiently. Knowing how to import and export with COPY/LOAD/UNLOAD saves hours, reduces errors, and keeps pipelines reproducible.

  • Load weekly CSVs from vendors into staging tables
  • Export filtered results for Finance as a clean CSV
  • Bulk insert millions of rows for faster analytics
  • Unload large query results to cloud storage for sharing

Concept explained simply

Import means bringing rows from a file into a table. Export means writing rows from a table or query to a file. The database needs to know file path, format (delimiter, quote, escape), header presence, null representation, and sometimes credentials (cloud storage).

Mental model

Think of a conveyor belt feeding rows into your table. You set up the belt with rules: where the belt starts (file path), how boxes are separated (delimiter), how labels are protected (quotes/escape), and how to handle missing labels (NULL). Export is the same belt in reverse.

Quick reference by database

PostgreSQL

Server-side import/export (requires server file access):

COPY schema.table (col1, col2)
FROM '/var/load/data.csv'
WITH (FORMAT csv, HEADER true, DELIMITER ',', QUOTE '"', ESCAPE '"');

COPY (SELECT * FROM schema.table)
TO '/var/export/out.csv'
WITH (FORMAT csv, HEADER true);

Client-side via psql (works with your local files):

\copy schema.table (col1, col2) FROM 'data.csv' WITH (FORMAT csv, HEADER true);

\copy (SELECT col1, col2 FROM schema.table) TO 'out.csv' WITH (FORMAT csv, HEADER true);

Note: \copy is a psql meta-command, so each \copy command must be written on a single line.
MySQL / MariaDB
LOAD DATA [LOCAL] INFILE 'data.csv'
INTO TABLE db.table
FIELDS TERMINATED BY ',' ENCLOSED BY '"' ESCAPED BY '\\'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(col1, col2, @raw_date)
SET date_col = STR_TO_DATE(@raw_date, '%Y-%m-%d');

SELECT col1, col2
INTO OUTFILE '/var/lib/mysql-files/out.csv'
FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n'
FROM db.table;

Note: SELECT ... INTO OUTFILE writes on the server; choose a server-writable directory.
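If LOAD DATA LOCAL INFILE or INTO OUTFILE fails with a permission or path error, two server settings are usually the cause. A quick diagnostic sketch (both are standard MySQL system variables):

SHOW VARIABLES LIKE 'secure_file_priv';  -- directory the server may read/write; NULL means file import/export is disabled
SHOW VARIABLES LIKE 'local_infile';      -- must be ON for LOAD DATA LOCAL INFILE to work

The client must also allow local loads (for example, start the mysql client with --local-infile=1).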

SQL Server
BULK INSERT dbo.TableName
FROM 'C:\data\file.tsv'
WITH (
  FIELDTERMINATOR='\t',
  ROWTERMINATOR='\n',
  FIRSTROW = 2,
  TABLOCK
);

-- Export with bcp (command line)
-- bcp "SELECT col1, col2 FROM db.dbo.TableName" queryout "out.csv" -c -t, -T -S localhost
Amazon Redshift
COPY schema.table
FROM 's3://your-bucket/path/data/'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'
CSV IGNOREHEADER 1 DELIMITER ',' QUOTE '"' GZIP;

UNLOAD ('SELECT * FROM schema.table')
TO 's3://your-bucket/path/exports/out_'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'
FORMAT AS CSV HEADER;
Snowflake
-- Stage a file then load
COPY INTO schema.table
FROM @mystage/path/
FILE_FORMAT = (TYPE = CSV FIELD_DELIMITER = ',' FIELD_OPTIONALLY_ENCLOSED_BY='"' SKIP_HEADER=1);

-- Export
COPY INTO @mystage/exports/out_
FROM (SELECT * FROM schema.table)
FILE_FORMAT = (TYPE=CSV FIELD_DELIMITER=',')
HEADER = TRUE;
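If the source file starts on your machine, it has to reach the stage before COPY INTO can see it. A minimal sketch using SnowSQL (the stage name @mystage matches the examples above; the local path is illustrative):

-- Upload a local file to the stage; PUT runs from SnowSQL or another client, not the web worksheet
PUT file:///tmp/data.csv @mystage/path/ AUTO_COMPRESS=TRUE;

-- Confirm the file landed before running COPY INTO
LIST @mystage/path/;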

Worked examples

Example 1: PostgreSQL — Import a CSV locally with psql and export a subset
-- Table
CREATE TABLE public.customers (
  id INT,
  name TEXT,
  email TEXT,
  signup_date DATE
);

-- customers.csv (local)
-- id,name,email,signup_date
-- 1,Ada Lovelace,ada@example.com,2024-05-01
-- 2,Grace Hopper,grace@example.com,2024-05-03
-- 3,Alan Turing,alan@example.com,2024-06-10

-- Import (client-side)
\copy public.customers FROM 'customers.csv' WITH (FORMAT csv, HEADER true);

-- Quick check
SELECT COUNT(*) FROM public.customers; -- 3

-- Export only June signups (\copy must stay on one line)
\copy (SELECT id, name, email FROM public.customers WHERE signup_date >= '2024-06-01') TO 'june_customers.csv' WITH (FORMAT csv, HEADER true);
Example 2: MySQL — LOAD DATA with date parsing and export
CREATE TABLE customers (
  id INT,
  name VARCHAR(100),
  email VARCHAR(255),
  signup_date DATE
);

-- customers_semicolon.csv
-- id;name;email;signup
-- 1;"Ada Lovelace";ada@example.com;01/05/2024
-- 2;"Grace Hopper";grace@example.com;03/05/2024
-- 3;"Alan Turing";alan@example.com;10/06/2024

LOAD DATA LOCAL INFILE 'customers_semicolon.csv'
INTO TABLE customers
FIELDS TERMINATED BY ';' ENCLOSED BY '"' ESCAPED BY '\\'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(id, name, email, @signup)
SET signup_date = STR_TO_DATE(@signup, '%d/%m/%Y');

SELECT COUNT(*) FROM customers; -- 3

-- Export (server-side path must be writable by MySQL)
SELECT id, name, email, signup_date
INTO OUTFILE '/var/lib/mysql-files/customers_out.csv'
FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n'
FROM customers;
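A quick way to confirm that STR_TO_DATE parsed the dd/mm/yyyy values correctly is to check the resulting date range and count any rows that failed to parse:

SELECT MIN(signup_date) AS first_signup,
       MAX(signup_date) AS last_signup,
       SUM(signup_date IS NULL) AS unparsed_rows
FROM customers;
-- Expect 2024-05-01, 2024-06-10, and 0 unparsed rows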
Example 3: SQL Server — BULK INSERT from TSV
CREATE TABLE dbo.Sales (
  order_id INT,
  sku NVARCHAR(50),
  qty INT,
  price DECIMAL(10,2)
);

-- sales.tsv (tab separated, first row header)
-- order_id	sku	qty	price
-- 101	A-1	2	19.99
-- 102	B-2	1	5.00

BULK INSERT dbo.Sales
FROM 'C:\data\sales.tsv'
WITH (
  FIELDTERMINATOR='\t',
  ROWTERMINATOR='\n',
  FIRSTROW=2
);

-- Export via bcp (shell)
-- bcp "SELECT order_id, sku, qty, price FROM YourDb.dbo.Sales" queryout "C:\data\sales_out.csv" -c -t, -T -S YourServer
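For messy files, BULK INSERT can skip and log bad rows instead of failing the whole load; ERRORFILE and MAXERRORS are standard options (the error-file path below is illustrative):

BULK INSERT dbo.Sales
FROM 'C:\data\sales.tsv'
WITH (
  FIELDTERMINATOR='\t',
  ROWTERMINATOR='\n',
  FIRSTROW=2,
  MAXERRORS=10,                         -- tolerate up to 10 bad rows before aborting
  ERRORFILE='C:\data\sales_errors.log'  -- rejected rows and reasons are written here
);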
Example 4: Redshift — COPY from S3 and UNLOAD
CREATE TABLE staging.events (
  user_id VARCHAR(50),
  event_time TIMESTAMP,
  event_type VARCHAR(50)
);

COPY staging.events
FROM 's3://company-raw/events/2024/06/'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'
CSV IGNOREHEADER 1 TIMEFORMAT 'auto' GZIP;

UNLOAD ('SELECT event_type, COUNT(*) c FROM staging.events GROUP BY 1')
TO 's3://company-exports/event_counts_'  -- destination prefix is illustrative
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'
FORMAT AS CSV HEADER;
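If a Redshift COPY fails or loads fewer rows than expected, the system table STL_LOAD_ERRORS records the rejected rows and the reason for each:

SELECT starttime, filename, line_number, colname, err_reason
FROM stl_load_errors
ORDER BY starttime DESC
LIMIT 10;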

Data preparation checklist

  • Confirm delimiter (comma, semicolon, tab) and text qualifier (quotes)
  • Check header row presence
  • Identify encoding (UTF-8 recommended)
  • Define the NULL representation (empty string, \N, or the literal NULL); see the sketch after this list
  • Validate data types and date formats
  • Ensure file path permissions (server or local, depending on command)
  • For cloud, verify credentials/role and bucket paths
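Two of these items, NULL representation and encoding, are explicit options in PostgreSQL's COPY/\copy. A minimal sketch reusing the customers table from the examples (the filename, NULL token, and encoding are illustrative):

-- Treat the literal text NULL as SQL NULL and read the file as Latin-1
\copy public.customers FROM 'customers_latin1.csv' WITH (FORMAT csv, HEADER true, NULL 'NULL', ENCODING 'LATIN1');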

Step-by-step: reliable import

  1. Create a staging table: mirror columns as text if unsure; transform later.
    CREATE TABLE staging.raw_customers (
      id TEXT, name TEXT, email TEXT, signup TEXT
    );
    
  2. Load file: use COPY/LOAD with the right options (delimiter, header, quote).
  3. Validate: count rows, sample a few, check nulls and extremes.
    SELECT COUNT(*), COUNT(DISTINCT email) FROM staging.raw_customers;
    
  4. Transform into typed table: cast and clean in INSERT...SELECT.
    INSERT INTO dim.customers (id, name, email, signup_date)
    SELECT CAST(id AS INT), name, LOWER(email), CAST(signup AS DATE)
    FROM staging.raw_customers;
    
  5. Export if needed: write to CSV from a SELECT with explicit columns and ordering (see the sketch after this list).
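A minimal PostgreSQL sketch of step 5, assuming the dim.customers table created in step 4 and a local psql session:

-- Explicit column list and ordering make the export reproducible (\copy stays on one line)
\copy (SELECT id, name, email, signup_date FROM dim.customers ORDER BY id) TO 'customers_clean.csv' WITH (FORMAT csv, HEADER true);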

Common mistakes and how to self-check

  • Mixing server vs client paths: COPY uses server path; psql \copy uses your machine path. Self-check: Does the file exist on the database server or on your laptop?
  • Wrong delimiter/quotes: Commas inside values break rows. Self-check: Open the file and inspect a few lines; adjust DELIMITER and QUOTE/ENCLOSED BY.
  • Header row imported as data: Forgot HEADER/IGNORE 1. Self-check: Look for a row with column names.
  • Windows vs Unix line endings: Use correct ROWTERMINATOR (\r\n on Windows). Self-check: File created on Windows? Choose \r\n.
  • NULL handling: Empty strings vs NULL get confused. Self-check: Use NULL 'NULL' or treat empty as NULL where supported.
  • Permissions: INTO OUTFILE fails without server dir permissions. Self-check: Use a server-writable directory and required privileges.

Security and performance tips

  • Prefer server-side bulk operations for large data; they are faster than client round-trips.
  • Restrict file system paths; avoid arbitrary server paths for security.
  • Validate and sanitize filenames; never concatenate unchecked user input into file paths.
  • Use staging tables and explicit casts to avoid silent truncation.
  • For very large loads, drop and rebuild indexes where safe, and load in batches (see the sketch after this list)
  • Compress files (e.g., GZIP) when supported; it reduces network IO.
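A hedged PostgreSQL sketch of the index tip, assuming a non-critical index named idx_customers_email (hypothetical) that can be rebuilt after the load:

-- Rebuilding the index once after the load is usually cheaper than maintaining it row by row
DROP INDEX IF EXISTS idx_customers_email;
\copy public.customers FROM 'big_customers.csv' WITH (FORMAT csv, HEADER true);
CREATE INDEX idx_customers_email ON public.customers (email);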

Exercises

These mirror the tasks in the Exercises section below. You can complete them conceptually or run them if you have the respective database locally.

Exercise 1 (PostgreSQL): Import and export customers

Create table:

CREATE TABLE public.customers (
  id INT,
  name TEXT,
  email TEXT,
  signup_date DATE
);

File customers.csv:

id,name,email,signup_date
1,Ada Lovelace,ada@example.com,2024-05-01
2,Grace Hopper,grace@example.com,2024-05-03
3,Alan Turing,alan@example.com,2024-06-10
4,Barbara Liskov,barbara@example.com,2024-06-15
  1. Import with psql \copy using FORMAT csv and HEADER.
  2. Verify row count (expect 4).
  3. Export only June signups to june_customers.csv with a header.
  • Checklist: Used \copy not COPY for local file? Correct delimiter and header? Filtered by signup_date ≥ '2024-06-01'?
Exercise 2 (MySQL): Clean load with semicolons and date conversion

Create table:

CREATE TABLE customers2 (
  id INT,
  name VARCHAR(100),
  email VARCHAR(255),
  signup_date DATE
);

File customers_semicolon.csv:

id;name;email;signup
1;"Ada Lovelace";ada@example.com;01/05/2024
2;"Grace Hopper";grace@example.com;03/05/2024
3;"Alan Turing";alan@example.com;10/06/2024
  1. LOAD DATA LOCAL INFILE with FIELDS TERMINATED BY ';', ENCLOSED BY '"', IGNORE 1 LINES.
  2. Convert signup dd/mm/yyyy to DATE using STR_TO_DATE.
  3. Export to customers2_out.csv using SELECT ... INTO OUTFILE with a comma delimiter; if a header row is required, emulate it with UNION ALL (see the sketch after this exercise), otherwise export without one.
  • Checklist: Correct delimiter? Date parsed? Server-writable OUTFILE directory?
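INTO OUTFILE never writes a header row itself, so one common workaround is to UNION a literal header row onto the query. A sketch for customers2 (cast non-text columns so the union's column types line up; exact INTO placement can vary by MySQL version):

SELECT 'id', 'name', 'email', 'signup_date'
UNION ALL
SELECT CAST(id AS CHAR), name, email, CAST(signup_date AS CHAR)
FROM customers2
INTO OUTFILE '/var/lib/mysql-files/customers2_out.csv'
FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n';
-- Without an ORDER BY the header row is not formally guaranteed to come first, though in practice it does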

Practical projects

  • Vendor pipeline: Build a repeatable script to load weekly vendor CSVs into a staging schema, transform to typed tables, and export a weekly summary report.
  • Marketing attribution export: Export campaign performance by channel and date to a CSV consumed by a dashboard tool.
  • Cloud unload: In Redshift or Snowflake, UNLOAD a 3-month rolling dataset to cloud storage with a timestamped file name.

Who this is for

  • Data Analysts who need to move data between files and databases
  • Analytics Engineers building lightweight pipelines
  • Anyone supporting CSV/TSV data handoffs across teams

Prerequisites

  • Basic SQL (SELECT, INSERT, data types)
  • Comfort with command-line tools (psql, mysql, bcp) is helpful
  • Access to a local or sandbox database (optional for hands-on)

Learning path

  • Start: Learn delimiters, text qualifiers, and header handling
  • Practice: Do small imports with staging tables and validations
  • Advance: Handle large files, compression, and server-side exports
  • Capstone: Create an end-to-end import-transform-export workflow

Common mistakes recap

  • Using server-only commands with local paths (and vice versa)
  • Forgetting HEADER/IGNORE 1, importing column names as data
  • Mishandling Windows line endings
  • Not validating NULLs and empty strings

Mini challenge

You receive a 2GB CSV with semicolons, quoted text, and a header. Dates are dd-mm-yyyy. Describe the exact command (for your DB) and options you would use to load into a staging table, verify counts, and export only the last 30 days to a new CSV. Include delimiter, quote, escape, header, and date parsing choices.

Next steps

  • Automate imports on a schedule (cron, orchestrators) using your chosen DB tooling
  • Add data quality checks (row counts, null thresholds, duplicates)
  • Document your import/export conventions for your team

Quick Test

Take the quick test below to check your understanding.

Practice Exercises

2 exercises to complete

Instructions

Create the table, load data from a local CSV using psql, then export a filtered subset.

-- Table
CREATE TABLE public.customers (
  id INT,
  name TEXT,
  email TEXT,
  signup_date DATE
);

-- File: customers.csv
id,name,email,signup_date
1,Ada Lovelace,ada@example.com,2024-05-01
2,Grace Hopper,grace@example.com,2024-05-03
3,Alan Turing,alan@example.com,2024-06-10
4,Barbara Liskov,barbara@example.com,2024-06-15
  • Task A: Import customers.csv using \copy with FORMAT csv, HEADER true.
  • Task B: Validate with SELECT COUNT(*) (expect 4).
  • Task C: Export only June signups (signup_date ≥ '2024-06-01') to june_customers.csv with a header.
Expected Output
After import: COUNT(*) = 4. Exported file june_customers.csv contains 2 rows (ids 3 and 4) plus header.

Data Import and Export: COPY, LOAD, UNLOAD — Quick Test

Test your knowledge with 8 questions. Pass with 70% or higher.

