Why this skill matters for Prompt Engineers
Prompt Engineering Foundations teach you how to turn messy goals into reliable model behavior. As a Prompt Engineer, you will design instructions, constraints, and context so language models produce useful, predictable outputs. Mastering these foundations unlocks tasks such as high-accuracy extraction, consistent formatting for pipelines, safe summarization, decision support, and robust evaluation setups.
What this skill does not cover
This page focuses on prompt patterns and reliability. It does not dive into tool calling, retrieval-augmented generation (RAG), fine-tuning, or safety red-teaming. Those come after you master the basics here.
Who this is for
- Aspiring Prompt Engineers starting with real-world tasks
- Data scientists/analysts who want consistent, parsable outputs
- Product managers and engineers integrating LLMs into features
Prerequisites
- Basic understanding of what large language models can do (text in/out)
- Comfort writing clear instructions and reading JSON
- Optional: familiarity with sampling settings (temperature, top_p)
Learning path (roadmap)
- Know the model: capabilities, limits, and non-determinism. Set realistic expectations.
- Instruction hierarchy: establish priority (goal, constraints, steps, style, failure behavior).
- Prompt structure: clear sections, separators, and unambiguous formatting.
- Zero-shot and few-shot: when to add examples, what makes a good example set.
- Role and context framing: specify audience, task role, and what to ignore.
- Output control: schemas, delimiters, style guides, and refusal/fallback rules.
- Determinism and variance: reduce output variance with settings and prompt patterns.
- Evaluate and iterate: create test cases, compare variants, and document decisions.
Milestone check: ready to ship a prompt?
- You can write a prompt with a clear instruction hierarchy and explicit output schema.
- You can add 2–4 diverse examples to stabilize behavior when needed.
- You can explain how to reduce variance and when to accept some randomness.
- You maintain a small test set and iterate based on failures.
Worked examples (3–6)
1) Structured extraction with strict JSON
Goal: Extract complaint type, urgency, and summary from a message. Return strict JSON or a controlled error.
Prompt:
Goal: Extract fields from a customer message.
Instructions (in order of priority):
1) Return ONLY valid JSON that conforms to this schema:
{
"complaint_type": "billing | technical | account | other",
"urgency": "low | medium | high",
"summary": "short string (max 25 words)"
}
2) If information is missing, infer conservatively; do NOT invent facts.
3) If you cannot produce valid JSON, return: {"error":"unresolvable"}
4) Style: no explanations, no extra keys.
Context:
Customer message: <msg>My card keeps getting charged twice. Please fix today!</msg>
Why this works
It sets a clear hierarchy, provides an explicit schema, defines a failure path, and blocks extra text.
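Downstream code can enforce the same contract before the output enters a pipeline. Below is a minimal validation sketch in Python; the function and variable names are illustrative, not part of any SDK, and it assumes the model's raw reply is already available as a string.

```python
import json

ALLOWED_TYPES = {"billing", "technical", "account", "other"}
ALLOWED_URGENCY = {"low", "medium", "high"}
ERROR = {"error": "unresolvable"}  # the same controlled error the prompt defines

def parse_complaint(raw_reply: str) -> dict:
    """Validate the model's reply against the extraction schema; fall back to the error object."""
    try:
        data = json.loads(raw_reply)
    except json.JSONDecodeError:
        return ERROR
    if not isinstance(data, dict) or set(data) != {"complaint_type", "urgency", "summary"}:
        return ERROR  # wrong shape, or extra/missing keys
    if data["complaint_type"] not in ALLOWED_TYPES or data["urgency"] not in ALLOWED_URGENCY:
        return ERROR
    if len(str(data["summary"]).split()) > 25:
        return ERROR  # summary exceeds the 25-word limit
    return data

# A valid reply passes through unchanged; anything else becomes the controlled error.
print(parse_complaint('{"complaint_type": "billing", "urgency": "high", "summary": "Card charged twice"}'))
```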
2) Few-shot classification to reduce ambiguity
Goal: Classify product feedback sentiment with calibrated categories.
Prompt:
Goal: Classify sentiment into {negative, mixed, neutral, positive}.
Rules:
- Choose the single best category.
- If sarcasm or unclear, prefer "mixed".
- Output JSON: {"sentiment":"..."}
Examples:
- Text: "Love the idea, but the app crashes" → {"sentiment":"mixed"}
- Text: "Works as expected" → {"sentiment":"neutral"}
- Text: "Amazing experience!!" → {"sentiment":"positive"}
Classify this:
Text: "Not great. It loads, eventually."
Return: JSON only.
Why this works
Examples shrink the decision space and resolve edge cases. JSON-only output ensures pipeline compatibility.
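Keeping the example set in code makes it easier to grow, audit, and reuse across prompt variants. Here is a minimal sketch of assembling the few-shot prompt from labeled pairs; the layout is one reasonable option, not a required format.

```python
FEW_SHOTS = [
    ("Love the idea, but the app crashes", "mixed"),
    ("Works as expected", "neutral"),
    ("Amazing experience!!", "positive"),
]

def build_sentiment_prompt(text: str) -> str:
    """Assemble the classification prompt: rules first, then few-shot examples, then the input."""
    lines = [
        "Goal: Classify sentiment into {negative, mixed, neutral, positive}.",
        "Rules:",
        "- Choose the single best category.",
        '- If sarcasm or unclear, prefer "mixed".',
        '- Output JSON: {"sentiment":"..."}',
        "Examples:",
    ]
    for example_text, label in FEW_SHOTS:
        lines.append(f'- Text: "{example_text}" -> {{"sentiment":"{label}"}}')
    lines += ["Classify this:", f'Text: "{text}"', "Return: JSON only."]
    return "\n".join(lines)

print(build_sentiment_prompt("Not great. It loads, eventually."))
```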
3) Role and context framing for better summaries
Role: You are a technical editor preparing an executive summary for non-technical stakeholders.
Goal: Summarize the text at a 9th-grade reading level.
Constraints:
- 5 bullet points
- Each bullet <= 16 words
- No internal jargon; define acronyms if used
Text:
<doc>...content here...</doc>
Output: bullets only. No preamble.
Why this works
Role clarifies tone, audience, and expertise. Constraints enforce length and readability.
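Count and length constraints like these are cheap to verify automatically. A small checker sketch, assuming bullets are marked with a leading dash; adjust the marker and limits to match your own prompt.

```python
def check_summary(output: str, max_bullets: int = 5, max_words: int = 16) -> list[str]:
    """Return a list of constraint violations; an empty list means the summary passes."""
    problems = []
    bullets = [line.strip() for line in output.splitlines() if line.strip()]
    if len(bullets) != max_bullets:
        problems.append(f"expected {max_bullets} bullets, got {len(bullets)}")
    for i, bullet in enumerate(bullets, start=1):
        if not bullet.startswith("-"):  # assumed bullet marker
            problems.append(f"bullet {i} does not start with '-'")
        if len(bullet.lstrip("- ").split()) > max_words:
            problems.append(f"bullet {i} exceeds {max_words} words")
    return problems
```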
4) Disambiguating with instruction hierarchy
Primary objective: Create a step-by-step procedure to install the CLI.
Hard constraints:
- Assume macOS 13+
- Use Homebrew
- Output as a numbered list of EXACTLY 6 steps
Failure behavior: If CLI is unavailable, output {"error":"unavailable"}
Ignore: marketing claims, unrelated tools.
User input: "How do I set it up?"Why this works
Separating objective, constraints, failure behavior, and ignore list reduces conflicts and hallucinations.
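Keeping each section of the hierarchy in its own slot makes priorities explicit and the template reusable across requests. A minimal rendering sketch; the section names mirror the prompt above, and nothing about this layout is mandatory.

```python
def render_prompt(objective: str, hard_constraints: list[str],
                  failure_behavior: str, ignore: list[str], user_input: str) -> str:
    """Render prompt sections in a fixed priority order so conflicts are easy to spot."""
    parts = [
        f"Primary objective: {objective}",
        "Hard constraints:",
        *[f"- {c}" for c in hard_constraints],
        f"Failure behavior: {failure_behavior}",
        "Ignore: " + ", ".join(ignore),
        f'User input: "{user_input}"',
    ]
    return "\n".join(parts)

print(render_prompt(
    objective="Create a step-by-step procedure to install the CLI.",
    hard_constraints=["Assume macOS 13+", "Use Homebrew",
                      "Output as a numbered list of EXACTLY 6 steps"],
    failure_behavior='If CLI is unavailable, output {"error":"unavailable"}',
    ignore=["marketing claims", "unrelated tools"],
    user_input="How do I set it up?",
))
```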
5) Reducing variance for status updates
When your platform allows model settings, pair a tight prompt with low-variance sampling.
Goal: Produce a terse daily status.
Constraints:
- 3 bullets, each <= 12 words
- Start each bullet with a verb
- If no updates, output: "No material changes."
Output only the bullets.
Operator settings suggestion: temperature ~ 0.0–0.2, top_p ~ 1.0, max_tokens capped.
Why this works
Clear structure plus low sampling randomness leads to consistent length and style.
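When you call the model through an API, the sampling settings sit right next to the prompt. A minimal sketch assuming the OpenAI Python SDK (v1+); the model name is a placeholder, and other providers expose similar parameters under different names.

```python
from openai import OpenAI  # assumes the OpenAI Python SDK (v1+) is installed

client = OpenAI()  # reads OPENAI_API_KEY from the environment

STATUS_PROMPT = (
    "Goal: Produce a terse daily status.\n"
    "Constraints:\n"
    "- 3 bullets, each <= 12 words\n"
    "- Start each bullet with a verb\n"
    '- If no updates, output: "No material changes."\n'
    "Output only the bullets."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name; use whatever your platform provides
    messages=[{"role": "user", "content": STATUS_PROMPT}],
    temperature=0.1,      # low randomness for consistent wording
    top_p=1.0,
    max_tokens=120,       # cap output length
)
print(response.choices[0].message.content)
```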
Drills and quick exercises
- Rewrite a vague instruction into a hierarchy: goal, constraints, style, failure behavior.
- Create a JSON schema for a task you run often; add one edge case rule.
- Write 3 diverse few-shot examples for a tricky classification (include borderline cases).
- Practice role framing for two audiences: executives vs. developers.
- Design a refusal path: what should the model output if constraints conflict?
- Run the same prompt 3 times; note variance and how constraints affect it.
Determinism and variance management
LLMs can produce different outputs for the same input. To reduce variance:
- Constrain outputs: fixed schemas, counts, and delimiters.
- Prefer zero-shot with tight constraints for simple tasks; add few-shots for ambiguous ones.
- When available, use lower temperature and stable decoding. Cap max tokens.
- Minimize randomness in wording: avoid open-ended adjectives; specify exact counts/lengths.
Variance checklist
- Is the output format machine-checkable?
- Do you define what to do when info is missing?
- Are examples diverse and unambiguous?
- Did you remove non-essential creativity?
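A quick way to see whether these controls are working is to run the same prompt several times and count distinct replies. A minimal sketch; the model call is passed in as a function you supply, since it depends on your client.

```python
from collections import Counter
from typing import Callable

def measure_variance(prompt: str, call_model: Callable[[str], str], runs: int = 5) -> Counter:
    """Run the same prompt several times and count distinct (whitespace-normalized) replies."""
    outputs = Counter()
    for _ in range(runs):
        reply = " ".join(call_model(prompt).split())  # normalize whitespace before comparing
        outputs[reply] += 1
    return outputs  # a single key with count == runs means the output is fully stable
```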
Common mistakes and debugging tips
- Mistake: Vague output spec (e.g., "summarize"). Fix: Specify audience, length, structure, and forbidden content.
- Mistake: Conflicting rules. Fix: Introduce priority order and a failure behavior.
- Mistake: Asking for detailed chain-of-thought. Fix: Request concise reasoning or final answers only.
- Mistake: Too many examples of one type. Fix: Add diverse, borderline examples.
- Mistake: No error path. Fix: Define an explicit error token or JSON for unresolvable cases.
- Mistake: Overfitting to one scenario. Fix: Maintain a small, varied test set and measure generalization.
Debugging flow
- Create a minimal reproducible prompt with 1 input.
- Change one thing at a time (constraint, example, role).
- Compare variants on 5–10 test cases (a small harness sketch follows this list).
- Keep a change log of what improved or regressed.
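The harness below makes "change one thing at a time" measurable. It is a sketch only: you supply your own model-call wrapper and a task-specific pass/fail check, and the variant names are illustrative.

```python
from typing import Callable

def compare_variants(
    variants: dict[str, str],              # variant name -> prompt template with an {input} placeholder
    test_inputs: list[str],
    call_model: Callable[[str], str],      # your own client wrapper
    check: Callable[[str], bool],          # task-specific pass/fail check on the reply
) -> dict[str, int]:
    """Run each prompt variant over the test inputs and count how many replies pass the check."""
    scores = {}
    for name, template in variants.items():
        scores[name] = sum(
            check(call_model(template.format(input=text))) for text in test_inputs
        )
    return scores  # e.g. {"baseline": 6, "with_failure_rule": 9} on 10 cases
```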
Mini project: AI Support Triage Assistant
Build a prompt that triages incoming support messages into categories with priority and next action.
- Define objectives: classify into {billing, technical, account, other}; set priority {low, medium, high}.
- Design instruction hierarchy: goal, hard constraints, style, failure behavior.
- Create output schema with strict JSON and a fallback error.
- Add 3–4 few-shot examples (easy, ambiguous, misleading, and edge case).
- Role framing: support analyst; audience: ticketing system (no extra text).
- Variance plan: constrain counts and wording; suggest low temperature if available.
- Test set: 10 messages; track accuracy and JSON validity.
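For the test-set step, a short scoring loop covers both metrics. A minimal sketch; the two test cases are illustrative placeholders for your own 10 labeled messages, and the model call and prompt builder are functions you supply.

```python
import json
from typing import Callable

TEST_SET = [  # illustrative labels; replace with your own 10 messages
    {"message": "I was charged twice", "category": "billing"},
    {"message": "App crashes when I upload a photo", "category": "technical"},
]

def score(call_model: Callable[[str], str], build_prompt: Callable[[str], str]) -> dict:
    """Track JSON validity and category accuracy across the test set."""
    valid = correct = 0
    for case in TEST_SET:
        reply = call_model(build_prompt(case["message"]))
        try:
            data = json.loads(reply)
        except json.JSONDecodeError:
            continue                      # invalid JSON counts against validity
        if not isinstance(data, dict):
            continue                      # wrong shape also counts against validity
        valid += 1
        if data.get("category") == case["category"]:
            correct += 1
    total = len(TEST_SET)
    return {"json_validity": valid / total, "accuracy": correct / total}
```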
Starter template
Role: You are a support triage analyst.
Goal: Categorize and prioritize a customer message and suggest the next action.
Output ONLY JSON with this schema:
{
"category": "billing | technical | account | other",
"priority": "low | medium | high",
"next_action": "short imperative sentence (max 12 words)"
}
Rules:
- If info is insufficient, choose the safest conservative option.
- If you cannot produce valid JSON, output {"error":"unresolvable"}.
- No extra keys or text.
Few-shot examples:
1) "I was charged twice" → {"category":"billing","priority":"high","next_action":"Escalate to billing to reverse duplicate charge"}
2) "Password reset link fails sometimes" → {"category":"account","priority":"medium","next_action":"Ask for account email and resend secure reset link"}
Classify this message:
<msg>...customer text...</msg>
Subskills
- Understanding Model Capabilities And Limits
Outcome: You can set realistic expectations, spot hallucination risks, and design conservative fallbacks.
Estimated time: 45–75 min
- Instruction Hierarchy And Constraint Design
Outcome: You can separate goals, constraints, style, and failure paths with clear priorities.
Estimated time: 60–90 min
- Prompt Structure And Formatting
Outcome: You can produce clean, sectioned prompts with delimiters and schemas.
Estimated time: 45–60 min
- Few Shot And Zero Shot Prompting
Outcome: You can decide when to use examples and design them for coverage and clarity.
Estimated time: 60–90 min
- Role And Context Framing
Outcome: You can set audience and tone to improve relevance and reduce fluff.
Estimated time: 45–75 min
- Output Control And Style Consistency
Outcome: You can enforce JSON, lists, counts, and consistent voice.
Estimated time: 60–90 min
- Determinism And Variance Management
Outcome: You can reduce randomness and stabilize results across runs.
Estimated time: 45–75 min
Next steps
- Expand your test set and start tracking simple metrics: format validity rate, accuracy on labeled tasks.
- Move to advanced skills: retrieval-augmented generation, function/tool calling, and safety constraints.
- When ready, take the skill exam below. Anyone can take it for free; logged-in learners save progress.
Skill exam
This exam checks your ability to design clear prompts, choose examples, enforce output formats, and manage variance. You can retake it anytime. Progress is saved for logged-in users only.