Ch 6 — Retrieval Strategies — Under the Hood

BM25 internals, RRF math, cross-encoder architecture, and self-query parsing
A. BM25 Scoring Internals: The math behind keyword search
Step 1. The BM25 pipeline:
- Tokenize Query: split the query into terms.
- BM25 Formula: score each document with TF × IDF weighting.
- Ranked Results: sort documents by BM25 score.
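The scoring function behind this pipeline is the standard Okapi BM25 formula:

    score(q, d) = sum over t in q of  IDF(t) * f(t, d) * (k1 + 1) / (f(t, d) + k1 * (1 - b + b * |d| / avgdl))

where f(t, d) is the term frequency of t in d, |d| is the document length, avgdl is the average document length in the corpus, and k1 (typically 1.2 to 2.0) and b (typically 0.75) control term-frequency saturation and length normalization. A minimal self-contained sketch; the IDF variant and parameter defaults are the common Lucene-style choices, not anything mandated above:

```python
import math
from collections import Counter

def bm25_score(query_terms, doc_terms, corpus, k1=1.5, b=0.75):
    """Score one tokenized document against a tokenized query with BM25.

    corpus: list of tokenized documents, used for IDF and avgdl.
    """
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N
    tf = Counter(doc_terms)
    score = 0.0
    for term in query_terms:
        n_t = sum(1 for d in corpus if term in d)          # document frequency
        idf = math.log((N - n_t + 0.5) / (n_t + 0.5) + 1)  # Lucene-style IDF
        freq = tf[term]
        denom = freq + k1 * (1 - b + b * len(doc_terms) / avgdl)
        score += idf * freq * (k1 + 1) / denom
    return score

# Ranked results = documents sorted by this score, descending:
# ranked = sorted(corpus, key=lambda d: bm25_score(query, d, corpus), reverse=True)
```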
Step 2. Sparse Vectors: SPLADE learns term weights via masked language modeling (MLM), producing sparse embeddings for neural keyword search.
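At query time, SPLADE-style retrieval reduces to a sparse dot product between learned per-term weights for the query and for the document. A minimal sketch, assuming the weight dicts were already produced by the model (in SPLADE they come from the MLM head with ReLU and log saturation; the dicts here are stand-ins for that output):

```python
def sparse_dot(query_weights: dict, doc_weights: dict) -> float:
    """Relevance = dot product over the vocabulary dimensions both sides share."""
    smaller, larger = sorted((query_weights, doc_weights), key=len)
    return sum(w * larger.get(term, 0.0) for term, w in smaller.items())

# e.g. sparse_dot({"retrieval": 1.8, "search": 0.9},
#                 {"search": 1.2, "index": 0.7})  ->  1.08
```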
B. Dense Retrieval Internals: Bi-encoder architecture and Maximal Marginal Relevance
Step 3. The dense retrieval pipeline:
- Bi-Encoder: separate query and document encoders embed each side independently.
- ANN Search: approximate nearest-neighbor search (e.g., HNSW) returns the top-k candidates.
- MMR: Maximal Marginal Relevance reorders candidates to balance relevance against diversity.
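MMR greedily picks the next document d maximizing lam * sim(d, q) - (1 - lam) * max over already-selected s of sim(d, s). A NumPy sketch, assuming unit-normalized embeddings so dot product equals cosine similarity; the lam default is an assumption:

```python
import numpy as np

def mmr(query_vec, doc_vecs, k=5, lam=0.7):
    """Return indices of k docs chosen by Maximal Marginal Relevance."""
    sim_to_query = doc_vecs @ query_vec   # relevance term
    sim_between = doc_vecs @ doc_vecs.T   # redundancy term
    selected = [int(np.argmax(sim_to_query))]
    while len(selected) < min(k, len(doc_vecs)):
        remaining = [i for i in range(len(doc_vecs)) if i not in selected]
        scores = [
            lam * sim_to_query[i]
            - (1 - lam) * max(sim_between[i][j] for j in selected)
            for i in remaining
        ]
        selected.append(remaining[int(np.argmax(scores))])
    return selected
```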
Step 4. Similarity Thresholds: score_threshold filtering removes low-confidence results before they reach the LLM.
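A sketch of the filter itself, assuming normalized similarity scores where higher means more similar (some stores return distances instead, which flips the comparison); the 0.75 cutoff is an assumption to tune against your own embedding model and data:

```python
def filter_by_threshold(scored_docs, score_threshold=0.75):
    """Drop low-confidence hits before they are stuffed into the prompt.

    scored_docs: (doc, similarity) pairs with similarity in [0, 1].
    """
    return [(doc, s) for doc, s in scored_docs if s >= score_threshold]
```

LangChain exposes the same idea as the `similarity_score_threshold` search type on a vector-store retriever.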
C. Hybrid Search & Reciprocal Rank Fusion: Merging dense and sparse result lists
Step 5. Reciprocal Rank Fusion:
- Dense List: candidates ranked by cosine similarity.
- Sparse List: candidates ranked by BM25.
- Rank Fusion: each list contributes 1/(k + rank) per document, and the summed scores set the fused order.
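The fusion step is small enough to write out in full; k = 60 is the constant from the original RRF paper (Cormack et al., 2009):

```python
def rrf(rankings, k=60):
    """Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank of d)."""
    scores = {}
    for ranking in rankings:                      # each: doc ids, best first
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# fused = rrf([dense_ids, sparse_ids])
```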
Step 6. Weighted Fusion: a convex combination with a single alpha parameter (Weaviate) or explicit per-retriever weights (LangChain EnsembleRetriever).
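A sketch of the alpha-style combination, with min-max normalization assumed so cosine and BM25 scores live on a comparable scale. Here alpha = 1 is pure dense and alpha = 0 pure sparse, mirroring Weaviate's parameter; note that LangChain's EnsembleRetriever instead applies its weights to rank-fusion contributions rather than raw scores:

```python
def weighted_fusion(dense_scores, sparse_scores, alpha=0.5):
    """hybrid(d) = alpha * dense(d) + (1 - alpha) * sparse(d), per-list normalized."""
    def normalize(scores):
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0               # guard against identical scores
        return {d: (s - lo) / span for d, s in scores.items()}

    dense, sparse = normalize(dense_scores), normalize(sparse_scores)
    fused = {
        d: alpha * dense.get(d, 0.0) + (1 - alpha) * sparse.get(d, 0.0)
        for d in dense.keys() | sparse.keys()
    }
    return sorted(fused, key=fused.get, reverse=True)
```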
D. Cross-Encoder Reranking: Deep relevance scoring with full attention
Step 7. The cross-encoder pipeline:
- Concat [q; d]: query and candidate document are concatenated into a single transformer input.
- Full Attention: every query token attends to every document token (and vice versa).
- Relevance Score: the model emits a single float per query-document pair.
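A sketch using the sentence-transformers CrossEncoder API; the checkpoint named here is one widely used MS MARCO-trained reranker, not the only choice:

```python
from sentence_transformers import CrossEncoder

model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

def rerank(query, candidates, top_n=5):
    """One full forward pass per (query, doc) pair, then re-sort by score."""
    scores = model.predict([(query, doc) for doc in candidates])
    ranked = sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_n]
```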
Step 8. Bi-Encoder vs Cross-Encoder: a bi-encoder's per-query cost is constant in corpus size (one query encoding compared against precomputed document vectors), while a cross-encoder costs one full forward pass per query-document pair. Scoring a million documents that way is infeasible online, which is why cross-encoder reranking is always a second pass over a small candidate set (say, the top 50-100 from first-stage retrieval).
E. Self-Query Retrieval & Routing: LLM-powered filter extraction and index selection
Step 9. The self-query pipeline:
- LLM Parser: an LLM splits the user question into a semantic query plus filters.
- Filter Object: structured metadata conditions applied at search time.
- Index Router: the query is routed to the best-matching index.
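A minimal sketch of the parsing step. The llm_complete callable, the prompt, and the JSON schema are all hypothetical stand-ins for whatever LLM client and filter grammar you actually use; LangChain's SelfQueryRetriever packages the same idea with a per-store filter grammar:

```python
import json

PARSE_PROMPT = """Extract a search query and metadata filters from the question.
Respond with JSON only: {{"query": "...", "filters": {{"field": "value"}}}}.
Question: {question}"""

def self_query(question, llm_complete):
    """llm_complete: hypothetical callable, prompt string -> completion string."""
    raw = llm_complete(PARSE_PROMPT.format(question=question))
    parsed = json.loads(raw)
    return parsed["query"], parsed.get("filters", {})

# "sci-fi movies from after 2015 about AI" might parse to
#   ("movies about AI", {"genre": "sci-fi", "year": {"$gt": 2015}})
```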
F. Latency Budget & Retrieval Evaluation: Measuring and optimizing the retrieval pipeline
Step 10. Production concerns:
- Latency Budget: total retrieval time is embed + search + rerank, and each stage gets a slice of it.
- Recall@k / MRR: retrieval quality metrics computed over a labeled evaluation set.
- Production Config: a recommended end-to-end pipeline configuration.
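Both metrics are a few lines each; a sketch over lists of retrieved and relevant doc ids:

```python
def recall_at_k(retrieved, relevant, k=10):
    """Fraction of the relevant docs appearing in the top-k (relevant non-empty)."""
    return len(set(retrieved[:k]) & set(relevant)) / len(relevant)

def mrr(all_retrieved, all_relevant):
    """Mean Reciprocal Rank: average 1/rank of the first relevant hit per query."""
    total = 0.0
    for retrieved, relevant in zip(all_retrieved, all_relevant):
        hits = set(relevant)
        rank = next((i for i, d in enumerate(retrieved, 1) if d in hits), None)
        total += 1.0 / rank if rank else 0.0
    return total / len(all_retrieved)
```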