Ch 1 — What Is RAG — Under the Hood
The original paper, DPR, pipeline architecture, and the RAG taxonomy
A. The Original RAG Paper (Lewis et al. 2020): Where it all started
RAG Paper: Lewis et al., Meta AI, 2020
→ uses → DPR Retriever: Dense Passage Retrieval
→ feeds → BART Generator: seq2seq model that generates the answer
→ outputs → Answer: grounded in the retrieved passages
B. RAG-Sequence vs RAG-Token: Two marginalization strategies from the paper
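RAG-Sequence: one retrieved doc conditions the full generation; the sequence probability is marginalized over the top-k docs
vs
RAG-Token: a different doc can be used per token; each token's probability is marginalized over the top-k docs
→ evolved to → Modern RAG: in-context learning with retrieved docs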
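The difference between the two marginalization strategies is easiest to see with numbers. The sketch below is a toy illustration, not the paper's implementation: the function names and the probabilities are made up, and the per-token probabilities are treated as fixed values rather than outputs of a real generator.

```python
import math

def rag_sequence(doc_probs, token_probs):
    """p(y|x) = sum over docs z of p(z|x) * product over tokens of p(y_i | x, z, y_<i)."""
    return sum(
        p_z * math.prod(tokens)
        for p_z, tokens in zip(doc_probs, token_probs)
    )

def rag_token(doc_probs, token_probs):
    """p(y|x) = product over tokens of sum over docs z of p(z|x) * p(y_i | x, z, y_<i)."""
    n_tokens = len(token_probs[0])
    return math.prod(
        sum(p_z * tokens[i] for p_z, tokens in zip(doc_probs, token_probs))
        for i in range(n_tokens)
    )

# Hypothetical numbers: two retrieved docs, a two-token answer.
doc_probs = [0.5, 0.5]                  # p(z | x) for each doc
token_probs = [[0.9, 0.1], [0.1, 0.9]]  # per doc: p(y_i | x, z, y_<i) for each token

seq_score = rag_sequence(doc_probs, token_probs)  # one doc must explain both tokens
tok_score = rag_token(doc_probs, token_probs)     # each token may lean on a different doc
```

With these made-up numbers, RAG-Sequence scores the answer at 0.09 and RAG-Token at 0.25, because each doc supports only one of the two tokens well.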
C. Dense Passage Retrieval, DPR (Karpukhin et al. 2020): The retriever that made RAG possible
Query Encoder: BERT encodes the question → q vector
Passage Encoder: BERT encodes each passage → d vector
Dot Product: sim(q, d) = q · d
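The scoring step sim(q, d) = q · d can be sketched in a few lines. This is a minimal illustration under stated assumptions: the three-dimensional vectors are invented stand-ins for BERT encoder outputs, and `dot` and `top_k` are hypothetical helper names, not part of any DPR library.

```python
def dot(q, d):
    """sim(q, d) = q . d"""
    return sum(qi * di for qi, di in zip(q, d))

def top_k(q, passages, k=2):
    """Return the ids of the k passages with the highest dot product against q."""
    scored = sorted(passages.items(), key=lambda item: dot(q, item[1]), reverse=True)
    return [pid for pid, _ in scored[:k]]

# Made-up vectors standing in for encoder outputs.
q = [1.0, 0.0, 1.0]
passages = {
    "p1": [0.9, 0.1, 0.8],
    "p2": [0.0, 1.0, 0.0],
    "p3": [0.5, 0.5, 0.5],
}

ranked = top_k(q, passages, k=2)
```

Because the passage vectors do not depend on the query, they can be computed once and indexed ahead of time; only the question needs encoding at query time.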
D. Modern RAG Pipeline: How RAG works today (2024–2026)
Load & Chunk: split docs into passages
→ embed → Embed: text-embedding-3, BGE, E5, GTE
→ store → Vector Store: Pinecone, Qdrant, pgvector, Chroma
→ retrieve → LLM + Context: GPT-4o, Claude, Llama, Gemini
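The four stages can be walked through end to end with stand-ins for each component. The sketch below is an assumption-laden toy: `embed` is a bag-of-words counter rather than a real embedding model, `InMemoryStore` is a hypothetical replacement for a vector database, and the final step only assembles the prompt that would be sent to an LLM.

```python
import re
from collections import Counter

def chunk(text, max_words=8):
    """Load & Chunk: split a document into fixed-size word windows."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def embed(text):
    """Embed: toy stand-in for an embedding model (sparse bag-of-words counts)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def similarity(a, b):
    """Dot product between two sparse count vectors."""
    return sum(a[t] * b[t] for t in a if t in b)

class InMemoryStore:
    """Vector Store: hypothetical in-memory replacement for a vector database."""
    def __init__(self):
        self.rows = []

    def add(self, passage):
        self.rows.append((passage, embed(passage)))

    def search(self, query, k=1):
        q = embed(query)
        ranked = sorted(self.rows, key=lambda row: similarity(q, row[1]), reverse=True)
        return [passage for passage, _ in ranked[:k]]

def build_prompt(question, passages):
    """LLM + Context: assemble the prompt; the model call itself is omitted."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"

doc = ("DPR encodes questions and passages with BERT. "
       "BART generates the answer from retrieved passages.")
store = InMemoryStore()
for passage in chunk(doc, max_words=7):
    store.add(passage)

question = "Which model generates the answer?"
hits = store.search(question, k=1)
prompt = build_prompt(question, hits)
```

Swapping in a real embedding model or vector store changes only `embed` and `InMemoryStore`; the shape of the pipeline stays the same.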
E. RAG Taxonomy: Naive → Advanced → Modular (from the Gao et al. 2024 survey)
1. Naive RAG: Index → Retrieve → Generate
2. Advanced RAG: improves on Naive RAG with pre- and post-retrieval optimization
3. Modular RAG: extends Advanced RAG with composable modules, routing, and loops
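One way to picture the three tiers is as progressively richer function compositions. Everything in the sketch below is hypothetical and chosen only for illustration: the keyword-overlap retriever, the abbreviation-expanding `rewrite` step, the overlap filter, and the word-count router are stand-ins, not techniques prescribed by the survey, and `generate` is a placeholder for an LLM call.

```python
def overlap(query, passage):
    """Number of lowercase words shared by query and passage."""
    return len(set(query.lower().split()) & set(passage.lower().split()))

def retrieve(query, corpus, k=2):
    """Toy keyword-overlap retriever."""
    return sorted(corpus, key=lambda p: overlap(query, p), reverse=True)[:k]

def generate(query, passages):
    """Placeholder for an LLM call."""
    return f"[answer to {query!r} from {len(passages)} passage(s)]"

def naive_rag(query, corpus):
    # Naive: retrieve, then generate.
    return generate(query, retrieve(query, corpus))

def rewrite(query):
    # Hypothetical pre-retrieval step: expand an abbreviation.
    return query.replace("DPR", "dense passage retrieval DPR")

def filter_passages(query, passages):
    # Hypothetical post-retrieval step: drop passages with no overlap.
    return [p for p in passages if overlap(query, p) > 0]

def advanced_rag(query, corpus):
    # Advanced: optimize before and after retrieval.
    expanded = rewrite(query)
    passages = filter_passages(expanded, retrieve(expanded, corpus))
    return generate(query, passages)

def modular_rag(query, corpus):
    # Modular: a router decides which modules run at all.
    if len(query.split()) < 3:       # hypothetical rule: short inputs skip retrieval
        return generate(query, [])
    return advanced_rag(query, corpus)

corpus = [
    "dense passage retrieval uses two BERT encoders",
    "BART is a seq2seq generator",
    "pgvector stores embeddings in Postgres",
]
query = "how does DPR work"
```

For this query the naive pipeline passes two unfiltered passages to the generator, while the advanced pipeline's rewrite and filter steps leave only the passage about dense passage retrieval.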
F. How RAG Quality Is Measured: Key metrics and a preview of evaluation (detailed in Ch 11)
Faithfulness: is the answer supported by the retrieved docs?
Relevancy: does the answer address the query?
Context Recall: did retrieval find the right docs?
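Of the three metrics, context recall is the simplest to compute directly when relevant documents are labeled by ID. The sketch below assumes such labels exist and uses invented document IDs; it is a simplified ID-based variant, and evaluation frameworks may define the metric differently. Faithfulness and relevancy usually need a human or model judge, so they are not sketched here.

```python
def context_recall(retrieved_ids, relevant_ids):
    """Fraction of the labeled relevant docs that retrieval actually returned."""
    relevant = set(relevant_ids)
    if not relevant:
        # Undefined without labels; reported as 0.0 here by assumption.
        return 0.0
    return len(relevant & set(retrieved_ids)) / len(relevant)

# Made-up IDs: retrieval returned three docs, four were labeled relevant.
recall = context_recall(["d1", "d4", "d7"], ["d1", "d2", "d7", "d9"])
```

Here two of the four relevant docs were retrieved, giving a recall of 0.5.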