The Analogy
An LLM is like a very confident storyteller who never says “I don’t know.” Ask about a real paper and it might cite the right authors with the wrong title, or invent a plausible-sounding paper that doesn’t exist. This happens because the model is trained to predict the most likely next token, not the most truthful next token. Plausible-sounding text is rewarded even if it’s factually wrong.
Key insight: Hallucination is not a bug that can be fully fixed — it’s a fundamental property of how LLMs work. The model generates text by sampling from probability distributions (Ch 9). It has no internal fact-checker, no database to verify against, and no concept of “truth” separate from “what text usually follows this context.” Mitigations: RAG (Ch 10), chain-of-thought verification, and training models to say “I don’t know” (alignment, Ch 8).
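The "sampling from probability distributions" point can be made concrete with a toy next-token sketch. This is a minimal illustration, not a real model: the token list and logit values are invented for the example, but the softmax-with-temperature math is the standard mechanism. Note that the wrong year "80" gets nearly as much probability as the right one, so sampling will sometimes emit it, and nothing in the mechanism checks the fact.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into a probability distribution over tokens."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical next-token candidates after "Einstein was born in 18":
# the model scores tokens by plausibility, not by truth.
vocab = ["79", "80", "82", "90"]
logits = [2.0, 1.8, 0.5, -1.0]  # invented values: "80" is almost as likely as "79"

probs = softmax(logits)
random.seed(0)
samples = random.choices(vocab, weights=probs, k=10)
print(probs)    # the correct "79" and the wrong "80" get similar mass
print(samples)  # sampling can return the wrong year as easily as the right one
```

Lowering the temperature sharpens the distribution toward "79" (which is why low-temperature decoding reduces, but cannot eliminate, this failure mode).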
Types of Hallucination
# Hallucination types:
# 1. Factual: wrong facts stated confidently
#    e.g. "Einstein was born in 1880" (actually 1879)
# 2. Fabrication: invented entities
#    e.g. "The Smith et al. (2023) paper showed..." (paper doesn't exist)
# 3. Inconsistency: contradicts itself
#    e.g. "X is true" then later "X is false"
# 4. Unfaithful: contradicts provided context
#    e.g. given a document, summarizes it incorrectly
# Mitigation strategies:
# - RAG: ground responses in real documents
# - CoT: verify reasoning step by step
# - Low temperature: reduce sampling randomness
# - RLHF: train the model to say "I don't know"
# - Citations: require source attribution
# - Human review: for high-stakes outputs