Why this matters
Grounding and citations make LLM answers trustworthy, auditable, and safe. As an NLP Engineer, you will build systems that must reduce hallucinations, attribute facts to evidence, and enable users to verify claims.
- Customer support bots: answers must rely on the latest policy or product docs.
- Internal tools: decisions (pricing, compliance) require traceable sources.
- Data analysis assistants: numbers and definitions should reference exact tables or sections.
Concept explained simply
Grounding means the model is constrained to produce answers that come from specific, allowed sources (documents, databases, APIs) or explicitly admits when evidence is missing. Citations are the visible references that show where each claim came from.
A helpful mental model
Think of the LLM as a careful researcher:
- It first gathers notes from approved sources (retrieval).
- It writes a summary using only those notes (generation).
- It tags each claim with a citation to the exact note (provenance).
If a claim is not in the notes, the researcher must say, "No evidence found" and optionally ask for more sources.
Key building blocks
- Grounding contract: A clear policy defining allowed sources, date ranges, and what to do when evidence is missing.
- Source trust levels: e.g., Tier 1 (official docs), Tier 2 (internal wikis), Tier 3 (user content). Prefer higher tiers when conflicts arise.
- Granularity: Store and cite at a useful chunk size (paragraph, section, or table row) with stable IDs and timestamps.
- Packaging context: Provide concise, relevant snippets with titles, IDs, and dates to the model.
- Citation format: Inline markers like [S1], [S2], or (Doc:Section) that map to a list of sources with titles, IDs, and dates.
- Refusal and fallback: If evidence is insufficient, return a helpful refusal or escalate to a human, per policy.
Example of a simple grounding contract
{
  "allowed_sources": ["ProductManual v3", "PolicyPortal"],
  "recency": "use latest version only",
  "must_cite": true,
  "refusal": "If no evidence, say 'I cannot verify this from the approved sources.'"
}
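A sketch of how this contract could be enforced before generation, assuming each retrieved snippet carries a "source" field matching allowed_sources (a minimal illustration, not a prescribed interface):

def filter_to_contract(snippets, contract):
    """Keep only snippets from approved sources; refuse when none remain."""
    allowed = [s for s in snippets if s["source"] in contract["allowed_sources"]]
    if not allowed:
        return None, contract["refusal"]  # refusal behavior comes from the contract
    return allowed, None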
Worked examples
1) Product FAQ assistant
Task: "Does Model X support Wi‑Fi 6E?"
Retrieved: Manual v3, Section 2.3: "Model X supports Wi‑Fi 6 and 6E." (ID: MANV3-S2.3, 2025-02-10)
Grounded answer: "Yes, Model X supports Wi‑Fi 6E. [S1]"
Sources: [S1] Manual v3 §2.3 (ID: MANV3-S2.3, 2025-02-10)
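This answer-and-sources format can be produced mechanically from a marker lookup. A minimal sketch using the snippet above (the data layout and function name are illustrative):

import re

# Marker -> source metadata for the retrieved snippet.
sources = {
    "S1": {"title": "Manual v3", "section": "2.3", "id": "MANV3-S2.3", "date": "2025-02-10"},
}

def render_sources(answer, sources):
    """Append a source list covering exactly the markers the answer cites."""
    cited = re.findall(r"\[(S\d+)\]", answer)
    lines = [answer, "", "Sources:"]
    for marker in dict.fromkeys(cited):  # de-duplicate, keep first-seen order
        s = sources[marker]
        lines.append(f"[{marker}] {s['title']} §{s['section']} (ID: {s['id']}, {s['date']})")
    return "\n".join(lines)

print(render_sources("Yes, Model X supports Wi-Fi 6E. [S1]", sources))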
2) Healthcare policy bot
Task: "Is telehealth covered for physical therapy?"
Retrieved: PolicyPortal Plan A §9.1: telehealth PT covered when provider is in-network. (ID: PLA-9.1, 2026-01-01)
Grounded answer: "Telehealth physical therapy is covered when your provider is in‑network. [S1]"
Sources: [S1] PolicyPortal Plan A §9.1 (ID: PLA-9.1, 2026-01-01)
3) SQL assistant
Task: "How to compute month‑over‑month growth?"
Retrieved: AnalyticsPlaybook §MoM (ID: APB-MOM, 2024-07-12)
Grounded answer: "Use lag to compare current vs previous month: growth = (m - lag(m)) / lag(m). [S1]"
Sources: [S1] AnalyticsPlaybook §MoM (ID: APB-MOM, 2024-07-12)
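The cited formula maps directly onto a lag/window operation. A minimal pandas sketch, with column names assumed for illustration:

import pandas as pd

df = pd.DataFrame({"month": ["2024-05", "2024-06", "2024-07"], "m": [100, 120, 150]})
# shift(1) plays the role of SQL's lag(): growth = (m - lag(m)) / lag(m)
df["growth"] = (df["m"] - df["m"].shift(1)) / df["m"].shift(1)
print(df)  # the first row is NaN because there is no previous month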
How to implement (step-by-step)
1) Define the grounding contract: list allowed sources, freshness rules, and refusal behavior.
2) Prepare sources: chunk by section/paragraph/table; assign stable IDs; keep timestamps.
3) Index and retrieve: use lexical + semantic retrieval; filter by source tier and recency.
4) Package context for the LLM: provide top‑k snippets with titles, IDs, and dates; include the contract in system instructions.
5) Generate with citation scaffolding: prompt the model to attach inline markers [S#] after each factual claim (see the sketch after this list).
6) Post‑check: optionally run a verifier pass so that every claim has at least one supporting snippet; remove unsupported claims or refuse.
7) Log provenance: store answer text, snippet IDs, and model config for audit and debugging.
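Steps 4 and 5 in code form: a minimal sketch of packaging snippets and scaffolding citations, where the prompt wording, snippet fields, and contract shape are illustrative assumptions:

def build_prompt(question, snippets, contract):
    """Number snippets as [S1], [S2], ... and instruct the model to cite them."""
    context = "\n".join(
        f"[S{i}] {s['title']} §{s['section']} ({s['date']}): {s['text']}"
        for i, s in enumerate(snippets, start=1)
    )
    return (
        "Answer using ONLY the sources below. "
        "Attach a marker like [S1] after each factual claim. "
        f"If the sources are insufficient, reply: {contract['refusal']}\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )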
Exercises you can do now
Exercise 1: Draft a grounding contract and citation plan
Scenario: A policy QA assistant for an HR department must answer questions about paid leave. Approved sources: "HR Handbook v5" and "Leave Policy Portal" (latest only). If evidence is missing, it should refuse and suggest contacting HR.
- Create: allowed sources, retrieval rules, packaging schema (ID, title, date), citation format, refusal policy.
Example solution:
Allowed sources: ["HR Handbook v5", "Leave Policy Portal (latest)"]
Retrieval: filter to latest versions; prefer Portal over Handbook on conflicts
Packaging: [{id, title, section, date, snippet}]
Citation: Inline [S1], [S2] with a source list mapping markers to {title, section, id, date}
Refusal: "I cannot verify this from approved HR sources. Please contact HR."
Exercise 2: Diagnose and fix grounding gaps
Given snippets:
- [S1] HR Handbook v5 §4.2: "Parental leave is 12 weeks paid for full‑time employees." (ID: HR5-4.2, 2026-01-01)
- [S2] Leave Policy Portal §Parental: "Parental leave is 16 weeks paid for full‑time employees (effective 2026‑03‑01)." (ID: LPP-PAR, 2026-03-01)
Model answer: "Parental leave is 12 weeks for full‑time and part‑time employees."
- Mark which claims are unsupported or outdated.
- Write a corrected grounded answer with citations.
Example solution:
- Unsupported: the "part‑time" claim (no evidence in either snippet).
- Outdated: "12 weeks" (S2 supersedes S1).
- Corrected: "Parental leave is 16 weeks paid for full‑time employees. [S2]"
Checklist: a grounded answer should:
- Use only approved sources.
- Include a citation for each factual claim.
- Prefer newer or higher‑tier sources when conflicting (see the tie‑breaker sketch below).
- Refuse or escalate when evidence is missing.
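The tie‑breaker can be a one-liner once each snippet carries a source tier and an ISO date. A minimal sketch, assuming a tier_of mapping from source name to tier number:

def pick_winner(snippets, tier_of):
    """Prefer the most trusted tier (Tier 1 beats Tier 3), then the newest date."""
    # ISO dates (YYYY-MM-DD) compare correctly as strings; negate tier so lower wins.
    return max(snippets, key=lambda s: (-tier_of[s["source"]], s["date"]))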
Common mistakes and self-check
- Missing citations: Self-check: Can you map every claim to a snippet ID? If not, revise or refuse.
- Over‑broad snippets: Self-check: Does the snippet actually contain the claim, not just a related keyword?
- Stale sources: Self-check: Are you citing the latest version by date or version tag?
- Ambiguous markers: Self-check: Are markers stable and unique (e.g., [S1] -> exact ID and date)?
- Conflicting sources: Self-check: Do you have a tie‑breaker (tiering, recency)? Note it in the contract.
Practical projects
- Grounded FAQ Bot: Index a product manual, implement retrieval, and add inline [S#] citations. Log provenance per answer.
- Policy Compare Tool: When sources conflict, show a diff and choose the higher‑tier/most recent source automatically.
- Verifier Pass: Build a small rule‑based or model‑based checker that flags unsupported sentences.
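For the rule‑based variant, a first cut can flag any sentence that carries no valid [S#] marker. Note this checks citation coverage only, not semantic support; the function name and sentence splitting are illustrative:

import re

def flag_uncited(answer, num_snippets):
    """Return sentences that cite nothing or cite an unknown snippet ID."""
    known = {f"S{i}" for i in range(1, num_snippets + 1)}
    flagged = []
    # Split on sentence-ending punctuation, but keep trailing [S#] markers attached.
    for sentence in re.split(r"(?<=[.!?])\s+(?!\[)", answer.strip()):
        markers = set(re.findall(r"\[(S\d+)\]", sentence))
        if not markers or not markers <= known:
            flagged.append(sentence)
    return flagged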
Mini challenge
Write a one‑paragraph answer to: "Can contractors access the internal KPI dashboard?" Use the snippets:
- [S1] Access Policy §2.1: "Only full‑time employees have default access to KPI Dashboard." (ID: AP-2.1, 2025-12-01)
- [S2] Access Exceptions §A: "Managers can request temporary access for contractors for up to 14 days." (ID: AE-A, 2026-02-10)
Include citations and a refusal if any claim is not covered.
Reference answer:
"By default, only full‑time employees can access the KPI Dashboard [S1]. Contractors may get temporary access if a manager submits a request, for up to 14 days [S2]."
Who this is for and prerequisites
- Who: NLP Engineers, Applied ML Engineers, and technical product owners building LLM apps.
- Prerequisites: Basic prompt engineering, retrieval fundamentals (BM25/embeddings), and comfort with JSON/IDs.
Learning path
- Before: Retrieval basics, prompt scaffolding.
- This lesson: Grounding and citations concepts, policies, and workflows.
- Next: Verifier pipelines, evaluation metrics (claim‑level grounding precision and citation coverage), and monitoring.
Next steps
- Adopt a grounding contract template in your project.
- Add inline citations to an existing LLM feature and measure user trust changes.
- Implement a refusal path and log all refusals for analysis.