Ch 1 — What Is RAG — Under the Hood

The original paper, DPR, pipeline architecture, and the RAG taxonomy
A. The Original RAG Paper (Lewis et al. 2020): Where it all started

RAG Paper: Lewis et al., Meta AI, 2020
  --uses--> DPR Retriever: Dense Passage Retrieval
  --feeds--> BART Generator: seq2seq model that generates the answer
  --outputs--> Answer: grounded in retrieved passages
B. RAG-Sequence vs RAG-Token: Two marginalization strategies

RAG-Sequence: one retrieved doc conditions the full generation
  vs
RAG-Token: a different doc can condition each token
  --evolved to--> Modern RAG: in-context learning with retrieved docs
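To make the difference between the two marginalization strategies concrete, here is a minimal numeric sketch. RAG-Sequence sums over documents after multiplying the token probabilities for each document; RAG-Token sums over documents separately at every token and then multiplies. The function names and all probabilities below are made up for illustration and are not taken from the paper.

```python
import math

def rag_sequence(doc_probs, token_probs):
    """Sum over docs z of p(z|x) * product over tokens i of p(y_i | x, z, y_<i)."""
    return sum(
        p_z * math.prod(per_token)
        for p_z, per_token in zip(doc_probs, token_probs)
    )

def rag_token(doc_probs, token_probs):
    """Product over tokens i of the sum over docs z of p(z|x) * p(y_i | x, z, y_<i)."""
    n_tokens = len(token_probs[0])
    return math.prod(
        sum(p_z * per_token[i] for p_z, per_token in zip(doc_probs, token_probs))
        for i in range(n_tokens)
    )

# Toy numbers (assumed): two retrieved docs, a two-token answer.
# Doc 0 only supports the first token, doc 1 only supports the second.
doc_probs = [0.5, 0.5]                    # p(z | x)
token_probs = [[1.0, 0.0], [0.0, 1.0]]    # token_probs[z][i] = p(y_i | x, z, y_<i)

print(rag_sequence(doc_probs, token_probs))  # 0.0
print(rag_token(doc_probs, token_probs))     # 0.25
```

In this toy case no single document supports the whole answer, so RAG-Sequence assigns it zero probability, while RAG-Token can combine evidence from both documents.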
C. Dense Passage Retrieval (Karpukhin et al. 2020): The retriever that made RAG possible

Query Encoder: BERT encodes the question
  --q vector-->
Dot Product: sim(q, d) = q · d
  <--d vector--
Passage Encoder: BERT encodes each passage
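The scoring step sim(q, d) = q · d can be sketched in a few lines. In DPR the vectors come from the two BERT encoders; the hand-written three-dimensional vectors and the helper name rank_passages below are assumptions that stand in for real encoder output.

```python
def dot(q, d):
    """Dot product of two equal-length vectors: sim(q, d) = q · d."""
    return sum(qi * di for qi, di in zip(q, d))

def rank_passages(q_vec, passage_vecs, k=2):
    """Return the k passage ids with the highest dot-product score."""
    scored = [(pid, dot(q_vec, d_vec)) for pid, d_vec in passage_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy vectors (assumed); a real system would use encoder outputs instead.
q_vec = [1.0, 0.0, 1.0]
passage_vecs = {
    "p1": [2.0, 0.0, 1.0],
    "p2": [0.0, 3.0, 0.0],
    "p3": [1.0, 1.0, 1.0],
}

print(rank_passages(q_vec, passage_vecs))  # [('p1', 3.0), ('p3', 2.0)]
```

Because the passage vectors do not depend on the query, they can be computed once and reused for every question; only the query vector is computed at search time.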
D. Modern RAG Pipeline: How RAG works today (2024–2026)

Load & Chunk: split docs into passages
  --embed--> Embed: text-embedding-3, BGE, E5, GTE
  --store--> Vector Store: Pinecone, Qdrant, pgvector, Chroma
  --retrieve--> LLM + Context: GPT-4o, Claude, Llama, Gemini
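The four pipeline stages can be sketched end to end with toy stand-ins. Everything below is an assumption for illustration: embed is a bag-of-words counter rather than a real embedding model, InMemoryStore replaces a vector database, and the prompt is only built, not sent to an LLM.

```python
from collections import Counter

def chunk(text, max_words=6):
    """Load & Chunk: split text into passages of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def embed(text):
    """Embed: toy stand-in for an embedding model (bag-of-words counts)."""
    return Counter(w.strip(".,?!").lower() for w in text.split())

def similarity(a, b):
    """Dot product of two sparse count vectors."""
    return sum(count * b[word] for word, count in a.items())

class InMemoryStore:
    """Vector Store: toy stand-in that keeps (vector, passage) pairs in a list."""
    def __init__(self):
        self.items = []

    def add(self, passage):
        self.items.append((embed(passage), passage))

    def search(self, query, k=1):
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: similarity(q, item[0]), reverse=True)
        return [passage for _, passage in ranked[:k]]

def build_prompt(query, passages):
    """LLM + Context: place the retrieved passages in front of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"

text = "DPR encodes each question with BERT. BART generates the final answer text."
store = InMemoryStore()
for passage in chunk(text):
    store.add(passage)

query = "Which model generates the answer?"
prompt = build_prompt(query, store.search(query, k=1))
print(prompt)
```

Swapping in a real embedding model or vector database would change embed and InMemoryStore but leave the overall flow of chunk, embed, store, retrieve, and prompt the same.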
E. RAG Taxonomy: From Gao et al. 2024 survey

Naive RAG: Index → Retrieve → Generate
  --improves--> Advanced RAG: pre/post retrieval optimization
  --extends--> Modular RAG: composable modules, routing, loops
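One way to picture the three tiers is as increasingly flexible compositions of the same building blocks. The sketch below is only an illustration under assumed names: the corpus, the rewrite and rerank rules, and the routing rule are hypothetical, and generate is a placeholder rather than an LLM call. It shows routing but not loops.

```python
CORPUS = [
    "RAG combines retrieval with generation.",
    "BART is a seq2seq model.",
    "RAG was introduced by Lewis et al. in 2020.",
]

def tokens(text):
    """Lowercased word set with basic punctuation removed."""
    return set(text.lower().replace(".", "").replace("?", "").split())

def retrieve(state):
    q = tokens(state["query"])
    state["passages"] = [p for p in CORPUS if q & tokens(p)]
    return state

def generate(state):
    # Placeholder for an LLM call: only reports how much context was used.
    state["answer"] = f"answer built from {len(state.get('passages', []))} passage(s)"
    return state

def rewrite_query(state):
    # Pre-retrieval step (hypothetical rule): normalize a long form to "RAG".
    state["query"] = state["query"].replace("retrieval-augmented generation", "RAG")
    return state

def rerank(state):
    # Post-retrieval step: keep the passage with the most query-word overlap.
    q = tokens(state["query"])
    ranked = sorted(state["passages"], key=lambda p: len(q & tokens(p)), reverse=True)
    state["passages"] = ranked[:1]
    return state

def run(modules, state):
    """Apply each module in order; every module takes and returns the state dict."""
    for module in modules:
        state = module(state)
    return state

NAIVE = [retrieve, generate]
ADVANCED = [rewrite_query, retrieve, rerank, generate]

def modular(query):
    # Routing (hypothetical rule): greetings skip retrieval entirely.
    pipeline = [generate] if query.lower().startswith("hello") else ADVANCED
    return run(pipeline, {"query": query})

print(run(NAIVE, {"query": "Who introduced RAG?"})["answer"])
print(modular("Who introduced RAG?")["passages"])
print(modular("hello there")["answer"])
```

The naive pipeline passes both RAG passages to the generator, the advanced one narrows them to the best match, and the modular wrapper chooses which pipeline to run per query.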
F. How RAG Quality Is Measured: Preview of evaluation (detailed in Ch 11)

Faithfulness: is the answer supported by docs?
  +
Relevancy: does the answer address the query?
  +
Context Recall: did retrieval find the right docs?
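Of the three metrics, context recall is the easiest to sketch without a judge model. The version below assumes a labeled set of relevant document ids per query and measures the fraction of them that retrieval returned; evaluation tools may define the metric differently, and the function name and ids are hypothetical.

```python
def context_recall(retrieved_ids, relevant_ids):
    """Fraction of the relevant docs that appear among the retrieved docs."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("relevant_ids must not be empty")
    return len(relevant & set(retrieved_ids)) / len(relevant)

# Toy labels (assumed): 4 relevant docs, retrieval found 2 of them.
print(context_recall(["d1", "d4", "d7"], ["d1", "d2", "d7", "d9"]))  # 0.5
```

Faithfulness and relevancy compare generated text against documents or the query, so they typically need a model or human judgment rather than simple set arithmetic.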