Prompting for Structured Outputs

Why this matters

As a Prompt Engineer, you often need the model to return machine-parseable outputs that feed into pipelines, dashboards, or downstream code. Typical tasks include:

Extracting entities from messy text into strict JSON for ETL.
Generating CSV rows for bulk imports (products, contacts, tickets).
Creating XML/YAML payloads for integrations or configuration.
Producing consistent schemas for evaluation datasets and test harnesses.
Enforcing domain-specific formats (e.g., ICD codes, ISO country codes).

Getting the format wrong causes parsing errors, broken automations, and wasted review time. This lesson shows how to reliably lock formats.

Who this is for

Prompt Engineers and Data/ML folks who hand off LLM outputs to code.
Analysts building structured datasets from unstructured sources.
Anyone orchestrating multi-step LLM workflows with strict formats.

Prerequisites

Basic familiarity with JSON/CSV/XML.
Comfort writing concise, explicit prompts.
Optional: experience parsing data with your favorite language.

Concept explained simply

Think of structured prompting as a contract:

Contract: You define a schema and exact format rules.
Serializer: The model fills the schema with content.
Validator: You (or your code) check the output strictly.

Mental model

Structure first, content second. Always tell the model what the container looks like before what to put inside. Use strong constraints and reminders.

Format-lock phrases you can reuse

Return ONLY valid JSON. No comments, no code fences, no extra text.
If unsure, use null. Use double quotes for all keys and strings.

Return ONLY CSV. First row is header. No extra lines. Use commas.

Return ONLY XML. Encode special characters (& < > ") correctly.

Patterns and templates

JSON template

System: You are a formatter that outputs only valid JSON.
User: Extract fields from the text below.
Rules:
- Output ONLY a single JSON object.
- Keys: ["title","tags","priority","due_date","description"]
- Types: title:string, tags:array of strings, priority: one of ["low","medium","high"],
         due_date: ISO date string (YYYY-MM-DD) or null, description:string
- Do not include explanations.

Text: "...paste text..."
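A reply to the JSON template can be checked in code before anything downstream consumes it. The following is a minimal Python sketch, not a definitive implementation: the function names validate_task and _is_iso_date are hypothetical, and the checks simply mirror the keys, types, and allowed values listed in the template above.

```python
import json
import re
from datetime import date

REQUIRED_KEYS = {"title", "tags", "priority", "due_date", "description"}
PRIORITIES = ("low", "medium", "high")


def _is_iso_date(value: object) -> bool:
    """True only for a real calendar date written as YYYY-MM-DD."""
    if not isinstance(value, str):
        return False
    if not re.fullmatch(r"[0-9]{4}-[0-9]{2}-[0-9]{2}", value):
        return False
    try:
        date.fromisoformat(value)
    except ValueError:
        return False
    return True


def validate_task(raw: str) -> list[str]:
    """Return a list of problems; an empty list means the reply fits the template."""
    try:
        obj = json.loads(raw)  # code fences, extra text, or trailing commas fail here
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(obj, dict):
        return ["top level is not a single JSON object"]

    problems = []
    if set(obj) != REQUIRED_KEYS:
        problems.append(f"unexpected key set: {sorted(obj)}")
    for key in ("title", "description"):
        if not isinstance(obj.get(key), str):
            problems.append(f"{key} must be a string")
    tags = obj.get("tags")
    if not (isinstance(tags, list) and all(isinstance(t, str) for t in tags)):
        problems.append("tags must be an array of strings")
    if obj.get("priority") not in PRIORITIES:
        problems.append("priority must be low, medium, or high")
    due = obj.get("due_date")
    if due is not None and not _is_iso_date(due):
        problems.append("due_date must be YYYY-MM-DD or null")
    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to see which rule in the prompt needs tightening.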

CSV template

System: You output only CSV.
User: Create product rows.
Rules:
- First row is the header: id,name,price_usd,category
- Exactly 3 rows follow.
- No extra lines, no quotes unless needed, comma separator.

Input: "..."
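The structural rules in the CSV template (header, row count, no blank lines) can be checked with Python's standard csv module. This is a sketch under assumptions: check_csv_shape is a hypothetical name, and the expected header is the one from the template above.

```python
import csv
import io

EXPECTED_HEADER = ["id", "name", "price_usd", "category"]


def check_csv_shape(raw: str, n_rows: int = 3) -> list[str]:
    """Return structural problems in a CSV reply; an empty list means the shape is right."""
    rows = list(csv.reader(io.StringIO(raw, newline="")))
    problems = []
    if any(not row for row in rows):  # csv.reader yields [] for a blank line
        problems.append("blank line found")
    rows = [row for row in rows if row]
    if not rows or rows[0] != EXPECTED_HEADER:
        problems.append("header row missing or wrong")
    if len(rows) - 1 != n_rows:
        problems.append(f"expected {n_rows} data rows, got {max(len(rows) - 1, 0)}")
    if any(len(row) != len(EXPECTED_HEADER) for row in rows):
        problems.append("row with wrong number of columns")
    return problems
```

A reply that uses semicolons fails the header check, because the whole header line is then read as a single column.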

XML template

System: Output only XML, UTF-8.
User: Produce <ticket> with child tags: title, severity, owner, steps (list of step), and labels (list of label).
Rules:
- Wrap in a single root <ticket>.
- Escape special characters.
- severity in {"low","medium","high"}.

Input: "..."
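For the XML template, a well-formedness check plus a few structural rules catches most bad replies. The sketch below uses xml.etree.ElementTree from the standard library; check_ticket is a hypothetical name, and the rules are taken from the template above.

```python
import xml.etree.ElementTree as ET

REQUIRED_CHILDREN = ("title", "severity", "owner", "steps", "labels")
SEVERITIES = ("low", "medium", "high")


def check_ticket(raw: str) -> list[str]:
    """Return problems found in a <ticket> reply; an empty list means it passes."""
    try:
        root = ET.fromstring(raw.strip())  # fails on extra text or mismatched tags
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    problems = []
    if root.tag != "ticket":
        problems.append("root element must be <ticket>")
    for name in REQUIRED_CHILDREN:
        if len(root.findall(name)) != 1:
            problems.append(f"expected exactly one <{name}>")
    if (root.findtext("severity") or "").strip() not in SEVERITIES:
        problems.append("severity must be low, medium, or high")
    for parent, child in (("steps", "step"), ("labels", "label")):
        node = root.find(parent)
        if node is not None and any(item.tag != child for item in node):
            problems.append(f"<{parent}> may only contain <{child}>")
    return problems
```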

Worked examples

Example 1 — JSON extraction from messy text

Goal: Extract shipping details into strict JSON.

Prompt

Return ONLY valid JSON. No code fences or comments.
Keys and types:
- order_id: string
- items: array of { sku:string, qty:integer }
- ship_to: { name:string, city:string, country_iso2:string }
- express: boolean
If data is missing, use null (use an empty array for items).

Text:
"Order #A-1049. Need 2x SKU-XL-BLK and one SKU-CAP. Ship: Sam Lee, Berlin, DE. Rush shipping please!"

Possible output

{
  "order_id": "A-1049",
  "items": [
    {"sku": "SKU-XL-BLK", "qty": 2},
    {"sku": "SKU-CAP", "qty": 1}
  ],
  "ship_to": {"name": "Sam Lee", "city": "Berlin", "country_iso2": "DE"},
  "express": true
}

Example 2 — CSV generation with header

Goal: Generate import-ready product CSV.

Prompt

Return ONLY CSV. First row is header: id,name,price_usd,category
Exactly 3 rows follow. id is an integer; price_usd is a number with two decimals.
No extra lines.

Products to list: minimalist chair, standing desk, task lamp
Categories: furniture, furniture, lighting
Suggested prices: 89.00, 359.00, 49.00

Possible output

id,name,price_usd,category
1,Minimalist Chair,89.00,furniture
2,Standing Desk,359.00,furniture
3,Task Lamp,49.00,lighting
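Beyond the overall shape, the per-column rules in this prompt (integer id, two-decimal price) can be checked cell by cell. A minimal sketch, assuming the header shown above; bad_cells and the two patterns are hypothetical names.

```python
import csv
import io
import re

ID_PATTERN = re.compile(r"[0-9]+")
PRICE_PATTERN = re.compile(r"[0-9]+\.[0-9]{2}")


def bad_cells(raw: str) -> list[str]:
    """List cells that break the id and price_usd rules from the prompt."""
    problems = []
    reader = csv.DictReader(io.StringIO(raw, newline=""))
    for row in reader:
        if not ID_PATTERN.fullmatch(row.get("id") or ""):
            problems.append(f"line {reader.line_num}: id is not an integer")
        if not PRICE_PATTERN.fullmatch(row.get("price_usd") or ""):
            problems.append(f"line {reader.line_num}: price_usd needs two decimals")
    return problems


reply = """id,name,price_usd,category
1,Minimalist Chair,89.00,furniture
2,Standing Desk,359.00,furniture
3,Task Lamp,49.00,lighting
"""
print(bad_cells(reply))  # prints [] for the output shown above
```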

Example 3 — XML configuration

Goal: Create an XML ticket for a support system.

Prompt

Return ONLY XML. Root <ticket>.
Required children: <title> <severity> <owner> <steps> <labels>.
- severity: one of low, medium, high
- steps: multiple <step> nodes
- labels: multiple <label> nodes
No comments, no prolog, no extra text.

Issue: App crashes after clicking "Export" on large files.
Owner: d.chen
Labels: export, crash, priority
Severity: high

Possible output

<ticket>
  <title>Crash when exporting large files</title>
  <severity>high</severity>
  <owner>d.chen</owner>
  <steps>
    <step>Open app</step>
    <step>Load large file (&gt;500MB)</step>
    <step>Click Export</step>
  </steps>
  <labels>
    <label>export</label>
    <label>crash</label>
    <label>priority</label>
  </labels>
</ticket>
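Step text such as "Load large file (>500MB)" contains characters that the format-lock phrases ask the model to encode. In XML text, & and < must always be escaped, and > is commonly escaped as well. The sketch below shows the round trip with the standard library; is_well_formed is a hypothetical helper, and the sample step text is made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw_step = "Load large file (>500MB) & click Export"
safe_step = escape(raw_step)  # escapes &, < and >
element = ET.fromstring(f"<step>{safe_step}</step>")


def is_well_formed(xml_text: str) -> bool:
    """True if the text parses as a single well-formed XML element."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


print(safe_step)                                   # Load large file (&gt;500MB) &amp; click Export
print(element.text == raw_step)                    # True: the parser restores the original text
print(is_well_formed("<step>Save & exit</step>"))  # False: a bare & is not allowed
```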

How to write robust prompts

State the format first: “Return ONLY valid JSON/CSV/XML.”
Specify keys/columns and types.
Constrain values: enumerations, ISO codes, regex hints.
Define missing-data behavior: null, empty array, or empty string.
For CSV: header row, separator, quoting rules, exact row count.
For XML: root element name, element order, escaping rules.
Ban extra text: no explanations, no code fences, no comments.
Add a last-line reminder: “If unsure, output null fields, not explanations.”
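The checklist above can be turned into a small prompt builder so that every prompt starts with the format lock and ends with the reminder. This is a sketch only: build_json_prompt and its parameters are hypothetical, and the wording of the rules is copied from this lesson.

```python
import json


def build_json_prompt(fields: dict[str, str], text: str) -> str:
    """Assemble a format-locked JSON extraction prompt from field rules and input text."""
    lines = [
        "Return ONLY valid JSON. No comments, no code fences, no extra text.",
        "Keys and types:",
    ]
    lines += [f"- {name}: {rule}" for name, rule in fields.items()]
    lines += [
        "If data is missing, use null.",
        "",
        "Text:",
        json.dumps(text, ensure_ascii=False),  # quote the input so it is not read as rules
        "",
        "If unsure, output null fields, not explanations.",
    ]
    return "\n".join(lines)


prompt = build_json_prompt(
    {"order_id": "string", "express": "boolean"},
    "Order #A-1049. Rush shipping please!",
)
```

Keeping the format lock and the reminder in one function means a wording change is made once rather than in every prompt.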

Common mistakes and self-check

Including code fences or explanations around the data. Fix: explicitly say “no code fences, no extra text.”
Smart quotes or trailing commas in JSON. Fix: ask for straight double quotes only, and ban trailing commas and comments.
Wrong separators in CSV (e.g., semicolons). Fix: name the separator.
Missing header row in CSV. Fix: explicitly require it.
Invalid enums (e.g., severity: urgent). Fix: list allowed values.
XML special characters not escaped. Fix: mention escaping explicitly.
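When a JSON reply fails to parse, it helps to know which of the mistakes above caused it, since each one has a different prompt fix. The following heuristic sketch labels a failed reply; diagnose_json_reply is a hypothetical name, and the checks are rough guesses rather than a complete classifier.

```python
import json
import re


def diagnose_json_reply(raw: str) -> str:
    """Label the most likely reason a reply is not valid JSON, or 'ok' if it parses."""
    text = raw.strip()
    try:
        json.loads(text)
        return "ok"
    except json.JSONDecodeError:
        pass
    if text.startswith("```"):
        return "code fence"
    if not text.startswith(("{", "[")):
        return "extra text before the data"
    if "\u201c" in text or "\u201d" in text:
        return "smart quotes"
    if re.search(r",\s*[}\]]", text):
        return "trailing comma"
    return "other parse error"
```

Parsing is attempted first so that a valid reply whose string values happen to contain curly quotes is not flagged.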

Self-check before you use the output

JSON: Can it parse with a strict JSON parser? Are all keys present?
CSV: Exactly one header row and N data rows? Right separator? No extra blank lines?
XML: Single root? Valid nesting? Special characters escaped?

Exercises

Exercise 1 — JSON incident report

Create a prompt that converts a freeform incident note into strict JSON with this schema:

{
  "id": string,
  "severity": one of ["low","medium","high"],
  "services": array of strings,
  "started_at": ISO-8601 datetime or null,
  "impact_summary": string
}

Input text to handle:

INC-9087 Major outage on checkout + payments since 09:14 UTC. Users report 5xx. Affected: checkout, payments. Severity: HIGH.

Write the full prompt that enforces the schema and bans extra text.
Then provide an example of a correct model output.

Sample solution

Prompt:
Return ONLY valid JSON. No code fences or explanations.
Schema and rules:
- Keys: id (string), severity ("low"|"medium"|"high"), services (array of strings),
  started_at (ISO-8601 string or null), impact_summary (string)
- Use double quotes for all keys and strings.
- If a value is uncertain, use null.

Text:
"INC-9087 Major outage on checkout + payments since 09:14 UTC. Users report 5xx. Affected: checkout, payments. Severity: HIGH."

Expected output example (started_at is null because the note gives a time but no date):
{
  "id": "INC-9087",
  "severity": "high",
  "services": ["checkout", "payments"],
  "started_at": null,
  "impact_summary": "Users receive 5xx errors on checkout and payments."
}
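The started_at rule is the easiest one to get wrong here, because the incident note contains only a time ("09:14 UTC") and a bare time is not an ISO-8601 datetime. A minimal sketch of a strict check follows; parse_started_at is a hypothetical name, and the date in the usage lines is invented purely for illustration.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_started_at(value: object) -> Optional[datetime]:
    """Return a datetime for an ISO-8601 string, None for null; raise ValueError otherwise."""
    if value is None:
        return None
    if not isinstance(value, str):
        raise ValueError("started_at must be a string or null")
    if value.endswith("Z"):
        # Older Python versions do not accept the Z suffix, so spell out the offset.
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


# Hypothetical date, used only to show a value that passes.
parsed = parse_started_at("2024-05-01T09:14:00Z")
print(parsed == datetime(2024, 5, 1, 9, 14, tzinfo=timezone.utc))  # True
```

A reply that copies "09:14 UTC" into the field raises ValueError, which is the signal to tighten the prompt or supply the date explicitly.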

Exercise 2 — CSV product feed

Write a prompt that makes the model output ONLY CSV for three rows of products with this exact header and rules:

Header: sku,name,price_usd,in_stock
Rules:
- Exactly 3 data rows.
- price_usd has two decimals.
- in_stock is true or false (lowercase).
- Use comma as separator.
- No extra lines or spaces.

Input products: snow boots 79.99 true; rain jacket 59.00 false; thermal socks 9.50 true

Sample solution

Prompt:
Return ONLY CSV. First row is header: sku,name,price_usd,in_stock
Exactly 3 rows follow. Use comma as separator. No extra lines or spaces.
- price_usd has two decimals.
- in_stock is true or false (lowercase).

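Data (name | sku | price_usd | in_stock; the SKU values are made up here because the exercise input does not include them):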
- snow boots | SKU-SB-001 | 79.99 | true
- rain jacket | SKU-RJ-002 | 59.00 | false
- thermal socks | SKU-TS-003 | 9.50 | true

Example output:
sku,name,price_usd,in_stock
SKU-SB-001,Snow Boots,79.99,true
SKU-RJ-002,Rain Jacket,59.00,false
SKU-TS-003,Thermal Socks,9.50,true

Exercise checklist

Format-lock phrase present at the top.
Keys/columns and types are explicit.
Enumerations or allowed values are listed.
Missing-data behavior is defined.
No code fences, comments, or extra explanations.

Mini challenge

Design a prompt that extracts a job posting into JSON Lines (one JSON object per line) with fields: title, company, location, salary_min, salary_max, currency, remote (boolean), skills (array). Require EXACTLY 2 variants: a strict extraction that copies values as written, and a normalized version (for example, a missing salary becomes null rather than a guess). Output must be two JSON objects separated by a single newline, with no extra text.

Tip

State: “Output ONLY two JSON objects, one per line.”
Define types and allowed currencies.
For missing salary, use nulls and keep currency consistent if known.
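A reply to the mini challenge can be checked line by line. The sketch below assumes the field list from the challenge; parse_two_variants is a hypothetical name, and tolerating a trailing newline is a design choice rather than part of the challenge.

```python
import json

FIELDS = {
    "title", "company", "location", "salary_min",
    "salary_max", "currency", "remote", "skills",
}


def parse_two_variants(raw: str) -> list[dict]:
    """Parse exactly two JSON objects, one per line; raise ValueError on anything else."""
    lines = raw.strip("\n").split("\n")
    if len(lines) != 2:
        raise ValueError(f"expected 2 lines, got {len(lines)}")
    variants = [json.loads(line) for line in lines]  # JSONDecodeError is a ValueError
    for variant in variants:
        if not isinstance(variant, dict) or set(variant) != FIELDS:
            raise ValueError("each line must be an object with exactly the required fields")
        if not isinstance(variant["remote"], bool):
            raise ValueError("remote must be a boolean")
        if not isinstance(variant["skills"], list):
            raise ValueError("skills must be an array")
    return variants
```

Pretty-printed JSON spans more than two lines, so it is rejected by the line count before any parsing happens.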

Learning path

Master format-lock prompts (JSON/CSV/XML) and strict rules.
Add value constraints and validation thinking (enums, ISO formats).
Combine with evaluation: test prompts on edge cases and malformed inputs.

Practical projects

Resume parser: Convert resumes into a hiring JSON schema and build a small validator.
Support triage: Turn chat transcripts into CSV tickets with severity and tags.
Catalog normalizer: Generate clean product feeds (CSV) from supplier PDFs.

Next steps

Introduce automatic validation in your pipeline and iterate prompts when parsing fails.
Add domain constraints (industry codes, region lists, internal IDs) to improve reliability.
Prepare a prompt library of reusable format-lock templates for your team.
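Automatic validation with a retry on failure can be sketched as a small loop. Everything here is an assumption for illustration: ask_model stands for whatever function sends a prompt to your model and returns its text reply, and ask_until_valid and json_problems are hypothetical names.

```python
import json
from typing import Callable


def json_problems(raw: str) -> list[str]:
    """Smallest possible validator: does the reply parse as JSON at all?"""
    try:
        json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    return []


def ask_until_valid(
    ask_model: Callable[[str], str],
    prompt: str,
    validate: Callable[[str], list[str]] = json_problems,
    max_attempts: int = 3,
) -> str:
    """Call the model until the validator reports no problems, feeding errors back."""
    attempt_prompt = prompt
    for _ in range(max_attempts):
        reply = ask_model(attempt_prompt)
        problems = validate(reply)
        if not problems:
            return reply
        reasons = "; ".join(problems)
        attempt_prompt = (
            f"{prompt}\n\nYour previous reply was rejected: {reasons}\n"
            "Return ONLY the corrected output."
        )
    raise ValueError(f"no valid output after {max_attempts} attempts")
```

Logging the rejected replies alongside the reasons gives you the edge cases to test the next prompt revision against.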
