Why this matters
As a Prompt Engineer, you will support users who write in different languages and scripts. Real tasks include: localizing chat assistants, extracting entities from multilingual documents, building classification pipelines that accept code-switched text (mixed languages), and ensuring the model answers in a specific language or dialect. Getting this wrong leads to off-language replies, mistranslations, cultural tone issues, or broken pipelines.
Who this is for
- Prompt Engineers building global features (chat, search, analytics).
- Data/ML folks who need consistent outputs across languages.
- Product teams localizing assistants, FAQs, or helpdesks.
Prerequisites
- Basic prompt design (roles, instructions, examples).
- Awareness of tokenization differences (e.g., CJK scripts vs. Latin).
- Familiarity with zero-shot and few-shot prompting.
Concept explained simply
Handling multi-language prompts means making your instructions work regardless of the user’s language, and controlling the language of the model’s output. You will often need: language detection, consistent output formats, optional translation steps, and cultural tone guardrails.
Mental model
Think of a multilingual pipeline as three switches you control:
- Input Language: detect or receive a hint (e.g., user_language: "es").
- Processing Mode: decide whether to translate, and choose consistent labels and an output schema.
- Output Language: force one language or mirror the user’s.
Decide the switches, then write explicit instructions so the model follows them every time.
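A minimal Python sketch of the three switches as explicit configuration; LanguageConfig and build_language_instruction are hypothetical names for illustration, not part of any library.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class LanguageConfig:
    input_hint: Optional[str]    # e.g. "es" from a user_language field, or None
    translate_to: Optional[str]  # pivot language for processing, or None to skip
    output_mode: str             # "mirror" or "fixed:<language>"

def build_language_instruction(cfg: LanguageConfig) -> str:
    """Turn the three switches into explicit prompt instructions."""
    parts = []
    if cfg.input_hint:
        parts.append(f"The user's language is '{cfg.input_hint}'; treat the input as that language.")
    else:
        parts.append("First identify the language of the user's message.")
    if cfg.translate_to:
        parts.append(f"Internally translate the content to {cfg.translate_to} before processing.")
    if cfg.output_mode == "mirror":
        parts.append("Reply in the same language as the user's message.")
    else:
        parts.append(f"Always reply in {cfg.output_mode.split(':', 1)[1]}.")
    return " ".join(parts)

# Spanish input hint, English pivot for processing, mirrored output:
print(build_language_instruction(LanguageConfig("es", "English", "mirror")))
```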
Core patterns you will use
- Language control: “Always answer in target_language.” or “Reply in the same language as the user.”
- Stable labels: Use language-agnostic labels (e.g., POSITIVE/NEGATIVE) or provide a label map for display.
- Few-shot in multiple languages: Include examples in the languages you expect.
- Glossaries: Provide brand or domain terms with canonical translations.
- Script rules: Specify romanization, simplified/traditional, or diacritics handling.
- Code-switch handling: Ask the model to detect segments and normalize before processing.
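A small sketch that combines several of these patterns (language lock, stable labels, glossary) into one system prompt; build_system_prompt is a hypothetical helper, not a library function.

```python
def build_system_prompt(target_language, labels, glossary):
    """Assemble a system prompt from a language lock, stable labels, and a glossary."""
    glossary_lines = "\n".join(f"- {src} -> {dst}" for src, dst in glossary.items())
    return (
        f"Always answer in {target_language}.\n"
        f"Use only these labels, always in English: {', '.join(labels)}.\n"
        f"Glossary (use these canonical translations; never translate this list differently):\n"
        f"{glossary_lines}"
    )

print(build_system_prompt(
    "Brazilian Portuguese (pt-BR)",
    ["POSITIVE", "NEGATIVE", "NEUTRAL"],
    {"battery life": "duração da bateria", "RX-200": "RX-200"},
))
```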
Language detection vs. explicit hints
Detection works but can fail on short or ambiguous text. If you have a user_language field, pass it and instruct the model to respect it. If not, ask the model to identify the language first, then proceed.
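A sketch of that decision as message assembly, assuming an OpenAI-style role/content message list; language_messages is a hypothetical helper.

```python
from typing import Optional

def language_messages(user_text: str, user_language: Optional[str]) -> list:
    """Prefer an explicit language hint; fall back to model-side detection."""
    if user_language:
        system = (f"The user's language is '{user_language}'. "
                  "Respect it and reply in the same language.")
    else:
        system = ("Step 1: identify the language of the user's message; "
                  "short texts may be ambiguous, so say if you are unsure. "
                  "Step 2: reply in that language.")
    return [{"role": "system", "content": system},
            {"role": "user", "content": user_text}]
```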
Worked examples
Example 1 — Force output language (Portuguese-BR)
Instruction: You are a helpful assistant. Always answer in Brazilian Portuguese (pt-BR). If the user asks for a translation, provide it in pt-BR unless they explicitly request another language. If you lack information, say: "Não tenho informação suficiente."
User: Summarize: "The device pairs quickly and lasts 10 hours."
Good Output: "Resumo: O dispositivo conecta rapidamente e dura cerca de 10 horas."
Why it works: Clear language lock, fallback phrase, and consistent tone.
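One way to test a language lock like this one, assuming the third-party langdetect package (pip install langdetect) and a call_llm(system, user) wrapper you supply for your own client:

```python
from langdetect import detect  # third-party; detection can fail on very short replies

SYSTEM = ("You are a helpful assistant. Always answer in Brazilian Portuguese (pt-BR). "
          "If you lack information, say: 'Não tenho informação suficiente.'")

def check_language_lock(call_llm, user_text: str) -> bool:
    """Return True if the reply detects as Portuguese (langdetect uses ISO 639-1, so 'pt')."""
    reply = call_llm(SYSTEM, user_text)
    return detect(reply) == "pt"
```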
Example 2 — Code-switched input normalization
Task: Summarize mixed Spanish-English notes into English, preserving names and product codes.
Instruction: 1) Detect language segments. 2) Translate all content to English, but keep names and codes unchanged. 3) Produce a 2-sentence summary.
User: "Cliente dijo que el battery life is mediocre; prefiere model RX-200."
Good Output: "The customer said the battery life is mediocre and prefers model RX-200. Overall sentiment is slightly negative."
Example 3 — Bilingual few-shot with stable labels
Task: Sentiment classification for English and Spanish; labels must be in English.
Instruction: Classify the sentiment as one of {POSITIVE, NEGATIVE, NEUTRAL}. Output only JSON: {"label":"...","reason":"..."}. Keep labels in English even if input is Spanish.
Few-shot:
EN: "I love this camera" -> {"label":"POSITIVE","reason":"Expresses strong liking"}
ES: "No vale la pena" -> {"label":"NEGATIVE","reason":"Indicates it is not worth it"}
User: "Está bien, nada especial"
Good Output: {"label":"NEUTRAL","reason":"Moderate stance without strong polarity"}
Example 4 — Script and romanization control
Task: Extract a Japanese person name and give romaji.
Instruction: Extract the full name. Output JSON with fields: kanji, romaji (Hepburn). If unsure, return {"kanji":null,"romaji":null}.
User: "新製品の発表者は田中太郎でした。"
Good Output: {"kanji":"田中太郎","romaji":"Tanaka Tarō"}
Why it works: The prompt specifies the script and romanization system.
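A quick script check for outputs like this, using only the Python standard library; it approximates "is kanji" as "every character is a CJK unified ideograph", which is a simplification (real names can also include kana).

```python
import unicodedata

def looks_like_kanji(s):
    """Rough check that every character is a CJK ideograph."""
    return bool(s) and all(
        unicodedata.name(ch, "").startswith("CJK UNIFIED IDEOGRAPH") for ch in s
    )

print(looks_like_kanji("田中太郎"))  # True
print(looks_like_kanji("Tanaka"))   # False
```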
Quality checklist
- Output language explicitly specified or mirrored from user input.
- Labels and formats are stable across languages.
- Glossaries and locale rules are included when needed.
- Short inputs: include a fallback detection rule.
- Code-switched text: include a normalization step.
- Scripts/romanization specified where relevant.
- Provide a polite refusal or “not enough info” phrase in the chosen language.
Practice exercises
Complete these in your own environment or a notebook. The same exercises appear below with hints and solutions.
- Exercise 1: Bilingual sentiment classifier (English/Spanish) with English labels and JSON output.
- Exercise 2: Locale-aware FAQ assistant using a small bilingual glossary and controlled output language.
Self-check before submitting
- Did you state the output language rules explicitly?
- Are labels stable across languages?
- Is the JSON or schema valid for edge cases (empty input, mixed languages)?
- Did you include a fallback phrase for unknowns?
Common mistakes
- Underspecified language: The model chooses the reply language unpredictably. Fix by explicitly locking the output language or mirroring the user's.
- Mixed labels: Some languages produce translated labels. Fix by instructing “labels must be in English only.”
- Glossary drift: Brand terms vary by language. Fix with a glossary and “never translate this list.”
- Loss of entities during translation: Names and codes change. Fix by instructing “preserve proper nouns and codes.”
- Script confusion: The model uses a different script. Fix by specifying script/romanization system.
How to self-check
- Test with short, ambiguous inputs in multiple languages.
- Test code-switched sentences.
- Test with domain jargon and your glossary.
- Verify JSON schemas with a validator.
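For that last step, a sketch using the third-party jsonschema package (pip install jsonschema) against the sentiment schema from Example 3:

```python
from jsonschema import ValidationError, validate

SCHEMA = {
    "type": "object",
    "properties": {
        "label": {"enum": ["POSITIVE", "NEGATIVE", "NEUTRAL"]},
        "reason": {"type": "string"},
    },
    "required": ["label", "reason"],
    "additionalProperties": False,
}

def is_valid(output: dict) -> bool:
    try:
        validate(instance=output, schema=SCHEMA)
        return True
    except ValidationError:
        return False

print(is_valid({"label": "NEUTRAL", "reason": "Moderate stance"}))   # True
print(is_valid({"label": "neutro", "reason": "translated label"}))   # False
```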
Learning path
- Control output language and refusal phrasing.
- Add stable labels and schemas.
- Introduce bilingual few-shot examples.
- Add glossary and locale-specific tone rules.
- Handle code-switching and short text detection.
- Specify scripts/romanization when needed.
Practical projects
- Multilingual feedback triage: classify customer feedback in any language into {BUG, FEATURE, PRAISE, OTHER} with English labels and short reason.
- Locale-aware helpdesk: answer FAQs in the user’s language using a glossary and a refusal phrase if the answer isn’t known.
- Cross-lingual entity extraction: extract people/organizations from multilingual articles with canonical English labels, preserving original scripts.
Mini challenge
Create a prompt that accepts product reviews in any language, outputs: (1) language code, (2) sentiment label (English), (3) 1-sentence summary in the input language, (4) safety note if the text includes hate or harassment. Use JSON only.
Suggested solution
{"role":"system","content":"You classify and summarize product reviews. Steps: 1) Detect language. 2) Classify sentiment into {POSITIVE, NEGATIVE, NEUTRAL} in English. 3) Summarize in the same language as the input in one sentence. 4) If the text includes hate or harassment, set safety_note to 'Contains harmful language'; otherwise null. Output only JSON with keys: lang, label, summary, safety_note. If unsure, label NEUTRAL and safety_note null."}About your progress