Ch 7 — Query Transformation — Under the Hood
Prompt templates, HyDE math, decomposition chains, and RAG-Fusion
A. History-Aware Query Rewriting: resolving conversational context
Chat History (prior turns) + current query → LLM Rewrite → Rewritten Query (standalone, context-resolved)
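The rewrite step can be sketched in a few lines. Here `llm` stands in for any chat-completion call, and the prompt wording is an illustrative assumption, not a specific library's template:

```python
# Sketch of history-aware query rewriting. The prompt text and the
# `llm` callable are illustrative assumptions, not a specific API.
REWRITE_PROMPT = (
    "Given the chat history and a follow-up question, rewrite the "
    "question so it is fully standalone.\n\n"
    "Chat history:\n{history}\n\n"
    "Follow-up question: {question}\n\nStandalone question:"
)

def rewrite_query(history, question, llm):
    """Resolve pronouns and ellipsis using prior turns."""
    prompt = REWRITE_PROMPT.format(history="\n".join(history), question=question)
    return llm(prompt).strip()

# Stub LLM so the sketch runs end-to-end:
fake_llm = lambda prompt: "What is the context window of GPT-4?"
history = ["User: Tell me about GPT-4.", "Assistant: GPT-4 is an LLM ..."]
standalone = rewrite_query(history, "What is its context window?", fake_llm)
```

The point is that retrieval sees "What is the context window of GPT-4?" rather than the unanswerable "What is its context window?".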
Multi-Query Generation: the LLM produces 3-5 variations of the query for broader recall.
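A minimal sketch of that generation step, with the prompt wording and `llm` callable as assumptions; keeping the original query in the result set is a common safeguard against a bad rewrite:

```python
# Multi-query generation: ask the LLM for n rephrasings, then search
# with all of them. Prompt wording is an illustrative assumption.
def generate_query_variants(question, llm, n=4):
    prompt = (
        f"Generate {n} different phrasings of this search query, "
        f"one per line:\n{question}"
    )
    variants = [ln.strip() for ln in llm(prompt).splitlines() if ln.strip()]
    return [question] + variants[:n]  # keep the original query too

# Stub LLM returning four rephrasings:
fake_llm = lambda p: (
    "How does HyDE work?\nExplain HyDE retrieval\n"
    "HyDE embedding method\nWhat is the HyDE technique?"
)
queries = generate_query_variants("What is HyDE?", fake_llm)
```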
B. HyDE, Hypothetical Document Embeddings (Gao et al., 2022): embedding hypothetical answers
User Query (natural language) → generate → Hypothetical Doc (LLM-generated) → embed → HyDE Vector (search with this instead of the query embedding)
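The two-step flow above fits in a single function. Both `llm` and `embed` are stand-ins here, and the toy word-count embedder exists only to keep the sketch runnable:

```python
# HyDE sketch: embed an LLM-written hypothetical answer instead of the
# raw query. `llm` and `embed` are illustrative stand-ins.
def hyde_vector(query, llm, embed):
    hypo_doc = llm(f"Write a short passage that answers: {query}")  # step 1: generate
    return embed(hypo_doc)                                          # step 2: embed the passage

# Stubs: a toy embedder (word counts over a tiny vocab) keeps this runnable.
VOCAB = ["paris", "capital", "france"]
toy_embed = lambda text: [text.lower().split().count(w) for w in VOCAB]
fake_llm = lambda p: "Paris is the capital of France"
vec = hyde_vector("capital of France?", fake_llm, toy_embed)
# vec == [1, 1, 1]
```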
HyDE math: in embedding space, E[embed(hypothetical doc)] sits closer to embed(real doc) than embed(query) does, because a hypothetical answer shares vocabulary and style with real answer documents even when its facts are wrong.
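Stated slightly more formally (notation mine): with embedding model $f$, query $q$, a relevant document $d^*$, and $k$ hypothetical answers $h_1, \dots, h_k$ sampled from the LLM given $q$, HyDE's bet is that

```latex
% f: embedder, q: query, d*: relevant document,
% h_i: hypothetical answers sampled from the LLM given q
\mathrm{sim}\Bigl(f(d^*),\ \tfrac{1}{k}\sum_{i=1}^{k} f(h_i)\Bigr)
\;>\;
\mathrm{sim}\bigl(f(d^*),\ f(q)\bigr)
```

Averaging over several sampled answers is what the expectation in the slide refers to: it damps the variance of any single hallucinated passage.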
C. Sub-Question Decomposition: breaking complex queries into retrievable parts
Complex Query (multi-part question) → decompose → Sub-Questions (2-4 simpler parts) → retrieve each → Merge Results (synthesize answer)
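The decompose/retrieve-each/synthesize chain can be sketched as below; the prompts and the `llm`/`retriever` callables are assumptions, not a particular framework's API:

```python
# Decomposition sketch: split, answer each part, synthesize.
def answer_complex(query, llm, retriever):
    raw = llm(f"Break this into 2-4 simpler sub-questions, one per line:\n{query}")
    subs = [s.strip() for s in raw.splitlines() if s.strip()]
    partials = []
    for sq in subs:
        docs = retriever(sq)  # retrieve separately per sub-question
        partials.append(llm(f"Answer '{sq}' using:\n" + "\n".join(docs)))
    return llm("Synthesize one answer from:\n" + "\n".join(partials))

# Stubs that route on the prompt prefix keep the sketch runnable:
def fake_llm(prompt):
    if prompt.startswith("Break"):
        return "What is A?\nWhat is B?"
    if prompt.startswith("Synthesize"):
        return "A and B combined."
    return "partial answer"

final = answer_complex("Compare A and B", fake_llm, lambda q: ["doc1", "doc2"])
```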
Routing: each sub-question can target a different index or retriever.
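One lightweight way to do that routing is to let the LLM pick an index per sub-question; the index names, prompt, and fallback below are all illustrative assumptions:

```python
# Routing sketch: the LLM picks which index serves each sub-question.
INDEXES = {"docs", "code", "web"}

def route(sub_question, llm):
    choice = llm(
        "Pick exactly one index for this question: docs, code, or web.\n"
        f"Question: {sub_question}"
    ).strip().lower()
    return choice if choice in INDEXES else "docs"  # safe default on bad output

picked = route("How do I call the embeddings API?", lambda p: "code")
```

Guarding against off-menu answers (the `else "docs"` branch) matters in practice, since LLM classifiers occasionally return free text instead of a label.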
D. Step-Back Prompting & RAG-Fusion: broadening context and fusing multi-query results
Step-Back (generate a broader question) → parallel Dual Retrieval (original + step-back) → fuse with RAG-Fusion (RRF across queries)
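Reciprocal Rank Fusion itself is small enough to show in full; `k = 60` is the commonly used constant from the original RRF formulation:

```python
# Reciprocal Rank Fusion: score(d) = sum over rankings of 1 / (k + rank(d)).
def reciprocal_rank_fusion(rankings, k=60):
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)

# Docs that rank near the top across several query variants win:
fused = reciprocal_rank_fusion([["a", "b"], ["b", "a"], ["b", "c"]])
# fused == ["b", "a", "c"]
```

Because only ranks are used, RRF fuses result lists from retrievers whose raw scores are not comparable (e.g. BM25 alongside dense search).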
Corrective RAG (CRAG): an LLM grades the retrieved docs and triggers a web search if relevance is low.
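A CRAG-style corrective loop reduces to a filter plus a fallback. `grade` and `web_search` below are illustrative stand-ins for an LLM grader and a search tool:

```python
# CRAG sketch: keep only docs the grader accepts; if nothing passes,
# fall back to web search. `grade` and `web_search` are stand-ins.
def corrective_retrieve(query, retriever, grade, web_search):
    docs = retriever(query)
    relevant = [d for d in docs if grade(query, d) == "relevant"]
    if not relevant:                    # low relevance -> trigger web search
        relevant = web_search(query)
    return relevant

# Stubs: a grader that rejects everything forces the fallback path.
hits = corrective_retrieve(
    "q",
    retriever=lambda q: ["doc1"],
    grade=lambda q, d: "irrelevant",
    web_search=lambda q: ["web result"],
)
```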
E. Iterative & Adaptive Retrieval: retrieve, evaluate, refine, retrieve again
Initial Retrieve (first pass) → evaluate → LLM Judge (is context sufficient?) → refine → Refine Query → retrieve again
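The loop above, with `judge` and `refine` as stand-ins for LLM calls and a round cap to bound latency and cost:

```python
# Iterative retrieval sketch: judge sufficiency, refine the query,
# retrieve again; max_rounds caps the number of LLM/retriever calls.
def iterative_retrieve(query, retriever, judge, refine, max_rounds=3):
    docs = retriever(query)
    for _ in range(max_rounds - 1):
        if judge(query, docs):          # sufficient context? stop early
            break
        query = refine(query, docs)     # rewrite the query given what's missing
        docs = retriever(query)
    return query, docs

# Stubs: the judge approves only after one refinement.
final_q, final_docs = iterative_retrieve(
    "vague q",
    retriever=lambda q: [f"docs for {q}"],
    judge=lambda q, d: q.startswith("refined"),
    refine=lambda q, d: "refined " + q,
)
```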
F. Production Pipeline & Cost Analysis: putting it all together with latency and cost budgets
Latency Budget (per-strategy cost) → Decision Tree (which strategy when) → Recommended Production Config
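To make the decision-tree idea concrete, here is one possible shape for such a router; every heuristic and threshold in it is an assumption for illustration, not a prescription from this chapter:

```python
# Illustrative strategy router. The heuristics (word count, "and",
# question-mark count) are assumptions chosen only to show the shape
# of a production decision tree.
def choose_strategy(query, has_history):
    steps = []
    if has_history:
        steps.append("history_rewrite")      # cheap: one extra LLM call
    if " and " in query or query.count("?") > 1:
        steps.append("decompose")            # multi-part question
    elif len(query.split()) < 6:
        steps.append("hyde")                 # short, underspecified query
    else:
        steps.append("multi_query_fusion")   # default: breadth + RRF
    return steps
```

In a real system each branch would also carry its latency budget (rewriting adds one LLM round trip; decomposition multiplies retrieval calls by the number of sub-questions), which is what makes the per-strategy cost column above actionable.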