📡 You're offline — showing cached content
New version available!
Quick Access
Tutorials Generative AI Engineering Hallucination Prevention

Hallucination Prevention

5 min read Quiz at the end
Prevent hallucinations with RAG, uncertainty prompts, required citations, CoT verification, and low temperature.

Preventing LLM Hallucinations

Hallucinations are confident but incorrect LLM outputs — a fundamental challenge requiring multiple defences.

# Main causes
# 1. Knowledge gaps (model generates rather than admits ignorance)
# 2. Conflation of facts from different sources
# 3. High temperature settings
# 4. Leading questions that assume false premises

# Defence 1: RAG (ground answers in retrieved facts)
system = 'Answer using ONLY the provided context.'
         + ' Say you do not know if not in context.'

# Defence 2: Uncertainty prompts
prompt = '''Answer the question.
If you are not 100% confident, say: I am not certain -- please verify.
Never guess or make up facts or names.'''

# Defence 3: Require citations
prompt = 'After each fact, cite the source: [Source: doc_name]'

# Defence 4: Chain-of-thought verification
prompt = '''Answer the question.
Then verify: Does my answer match only the provided context?
Are there any claims without evidence? If so, remove them.'''

# Defence 5: Low temperature for factual tasks
temperature = 0.0  # deterministic, no creativity

# Defence 6: Post-processing cross-encoder check
from sentence_transformers import CrossEncoder
nli = CrossEncoder('cross-encoder/nli-deberta-v3-small')
score = nli.predict([[context, answer]])  # entailment score
Topic Quiz · 1 questions

Test your understanding before moving on

1. What does LoRA fine-tuning do differently from full fine-tuning?
💡 LoRA (Low-Rank Adaptation) inserts tiny trainable adapter matrices — far cheaper and faster than full fine-tuning.