Ch 1: What Is RAG & Why It Matters

Ch 1 — What Is RAG — Under the Hood

The original paper, DPR, pipeline architecture, and the RAG taxonomy


A. The Original RAG Paper (Lewis et al. 2020): Where it all started

RAG Paper: Meta AI, 2020, Lewis et al.
→ (uses) DPR Retriever: Dense Passage Retrieval
→ (feeds) BART Generator: seq2seq model, generates the answer
→ (outputs) Answer: grounded in retrieved passages

Two RAG variants from the paper

B. RAG-Sequence vs RAG-Token: Two marginalization strategies

RAG-Sequence: one doc per full generation
RAG-Token: a different doc per token
→ (evolved to) Modern RAG: in-context learning with retrieved docs
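
The two marginalization strategies can be compared on a toy example. The sketch below follows the formulation in Lewis et al. (RAG-Sequence sums over documents after multiplying the token probabilities; RAG-Token sums over documents at every token, then multiplies). The function names and all probabilities are made up for illustration, and the per-token probabilities are simply given rather than produced by a real generator.

```python
from math import prod

def rag_sequence_prob(doc_probs, token_probs):
    """p(y|x) = sum_z p(z|x) * prod_i p(y_i | x, z, y_<i)."""
    return sum(p_z * prod(p_tokens) for p_z, p_tokens in zip(doc_probs, token_probs))

def rag_token_prob(doc_probs, token_probs):
    """p(y|x) = prod_i sum_z p(z|x) * p(y_i | x, z, y_<i)."""
    num_tokens = len(token_probs[0])
    return prod(
        sum(p_z * p_tokens[i] for p_z, p_tokens in zip(doc_probs, token_probs))
        for i in range(num_tokens)
    )

# Toy numbers (invented): 2 retrieved docs, a 2-token target sequence.
doc_probs = [0.5, 0.5]                   # p(z|x) over the retrieved docs
token_probs = [[0.9, 0.1], [0.1, 0.9]]   # token_probs[z][i] = p(y_i | x, z, y_<i)

seq = rag_sequence_prob(doc_probs, token_probs)  # about 0.09
tok = rag_token_prob(doc_probs, token_probs)     # about 0.25
```

In this toy case each document supports only one of the two tokens, so RAG-Token, which may lean on a different document per token, assigns the sequence a higher probability than RAG-Sequence.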

Dense Passage Retrieval (DPR)

C. Dense Passage Retrieval (Karpukhin et al. 2020): The retriever that made RAG possible

Query Encoder: BERT encodes the question → q vector
Passage Encoder: BERT encodes each passage → d vector
Dot Product: sim(q, d) = q · d
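
The scoring step, sim(q, d) = q · d, can be sketched in a few lines. This is a minimal illustration, not DPR itself: the vectors are hand-written stand-ins for encoder outputs, the names `dot` and `top_k` are hypothetical, and the brute-force loop stands in for the nearest-neighbor index a real system would typically use.

```python
def dot(q, d):
    """Dot-product similarity between a query vector and a passage vector."""
    return sum(qi * di for qi, di in zip(q, d))

def top_k(query_vec, passage_vecs, k=2):
    """Score every passage against the query and return the k best (id, score) pairs."""
    scored = [(pid, dot(query_vec, vec)) for pid, vec in passage_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hand-written toy vectors (assumption: real ones come from the two BERT encoders).
query_vec = [1.0, 0.0, 1.0]
passage_vecs = {
    "p1": [1.0, 0.0, 1.0],
    "p2": [0.0, 1.0, 0.0],
    "p3": [0.5, 0.5, 0.0],
}

results = top_k(query_vec, passage_vecs, k=2)
print(results)  # [('p1', 2.0), ('p3', 0.5)]
```

Because passages are encoded independently of the query, their vectors can be computed once and stored; only the question needs encoding at query time.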

The modern RAG pipeline architecture

D. Modern RAG Pipeline: How RAG works today (2024–2026)

Load & Chunk: split docs into passages
→ (embed) Embed: text-embedding-3, BGE, E5, GTE
→ (store) Vector Store: Pinecone, Qdrant, pgvector, Chroma
→ (retrieve) LLM + Context: GPT-4o, Claude, Llama, Gemini
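
The four stages can be traced end to end with toy stand-ins. In the sketch below everything is an assumption made for illustration: `chunk` is a naive word-window splitter, `embed` is a bag-of-words counter instead of a real embedding model, `InMemoryStore` replaces a vector database, and `build_prompt` only assembles the text that would be sent to an LLM (no model is called).

```python
import math
import re
from collections import Counter

def chunk(text, max_words=20):
    """Load & Chunk: split a document into fixed-size word windows."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def embed(text):
    """Embed: toy bag-of-words vector (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    overlap = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return overlap / norm if norm else 0.0

class InMemoryStore:
    """Vector Store: toy in-memory list of (vector, passage) pairs."""
    def __init__(self):
        self.items = []

    def add(self, passage):
        self.items.append((embed(passage), passage))

    def search(self, query, k=2):
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [passage for _, passage in ranked[:k]]

def build_prompt(question, passages):
    """LLM + Context: place the retrieved passages in front of the question."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

docs = [
    "DPR encodes questions and passages with two BERT encoders.",
    "BART is a seq2seq model used as the generator in the original RAG paper.",
]
store = InMemoryStore()
for doc in docs:
    for passage in chunk(doc):
        store.add(passage)

question = "Which model is the generator in RAG?"
top = store.search(question, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

Swapping in a real embedding model or vector database would change `embed` and `InMemoryStore` but leave the overall flow the same.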

The RAG taxonomy: Naive → Advanced → Modular

E. RAG Taxonomy: From Gao et al. 2024 survey

Naive RAG: Index → Retrieve → Generate
→ (improved by) Advanced RAG: pre- and post-retrieval optimization
→ (extended by) Modular RAG: composable modules, routing, loops
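
One way to picture the three tiers is as function composition over a shared state. The sketch below is an illustrative assumption, not code from the survey: the stage names, the keyword-overlap retriever, the abbreviation-expanding rewrite, and the greeting-based router are all hypothetical, and `generate` is a placeholder rather than an LLM call.

```python
CORPUS = {
    "d1": "RAG combines a retriever with a generator.",
    "d2": "DPR uses a dot product between query and passage vectors.",
    "d3": "Bananas are rich in potassium.",
}

def tokens(text):
    return set(text.lower().replace(".", "").replace("?", "").split())

def rewrite_query(state):
    # Pre-retrieval optimization (hypothetical): expand a known abbreviation.
    state["query"] = state["query"].replace("DPR", "DPR dense passage retrieval")
    return state

def retrieve(state):
    # Toy retriever: keep every doc that shares at least one word with the query.
    q = tokens(state["query"])
    state["docs"] = [doc_id for doc_id, text in CORPUS.items() if q & tokens(text)]
    return state

def rerank(state):
    # Post-retrieval optimization (hypothetical): keep only the best-overlapping doc.
    q = tokens(state["query"])
    by_overlap = sorted(state["docs"], key=lambda d: len(q & tokens(CORPUS[d])), reverse=True)
    state["docs"] = by_overlap[:1]
    return state

def generate(state):
    # Placeholder for an LLM call: report which docs would go into the prompt.
    state["answer"] = f"answer grounded in {state['docs']}"
    return state

def run(stages, query):
    state = {"query": query, "docs": []}
    for stage in stages:
        state = stage(state)
    return state

NAIVE = [retrieve, generate]
ADVANCED = [rewrite_query, retrieve, rerank, generate]

def modular(query):
    # Routing (hypothetical rule): greetings skip retrieval, everything else goes advanced.
    stages = [generate] if query.lower().strip() in {"hi", "hello"} else ADVANCED
    return run(stages, query)

question = "How does DPR score a passage?"
naive_docs = run(NAIVE, question)["docs"]        # ['d1', 'd2']
advanced_docs = run(ADVANCED, question)["docs"]  # ['d2']
routed_docs = modular("hello")["docs"]           # []
```

The naive path returns a loosely matching document alongside the right one; the added pre- and post-retrieval stages narrow it down, and the router decides per query which stages run at all.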

Key metrics and evaluation

F. How RAG Quality Is Measured: Preview of evaluation (detailed in Ch 11)

Faithfulness: is the answer supported by docs?
Relevancy: does the answer address the query?
Context Recall: did retrieval find the right docs?
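
Two of these metrics reduce to simple ratios once labels exist. The sketch below is a simplification under stated assumptions: context recall is computed over document IDs against a hand-labeled set of relevant docs, and faithfulness assumes the answer has already been split into claims that someone (a human or an LLM judge) has marked as supported or not. The function names and example labels are hypothetical; relevancy usually needs a judge model or human rating and is not sketched here.

```python
def context_recall(retrieved_ids, relevant_ids):
    """Fraction of the relevant docs that retrieval actually returned."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    return len(relevant & set(retrieved_ids)) / len(relevant)

def faithfulness(claim_supported):
    """Fraction of answer claims marked as supported by the retrieved docs."""
    if not claim_supported:
        return 0.0
    return sum(claim_supported) / len(claim_supported)

# Invented labels for one query.
recall = context_recall(retrieved_ids=["d2", "d7", "d9"], relevant_ids=["d2", "d4"])
faith = faithfulness([True, True, False, True])

print(recall)  # 0.5
print(faith)   # 0.75
```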