Ch 7 — Query Transformation — Under the Hood
Prompt templates, HyDE math, decomposition chains, and RAG-Fusion
A. History-Aware Query Rewriting: resolving conversational context
Chat History (prior turns) + current query → LLM Rewrite → Rewritten Query (standalone, context-resolved)
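The rewrite step can be sketched in a few lines. Here `llm` stands in for any chat-completion call, and the prompt wording is an illustrative assumption, not a specific library's template:

```python
# Sketch of history-aware query rewriting. The prompt text and the
# `llm` callable are illustrative assumptions, not a specific API.
REWRITE_PROMPT = (
    "Given the chat history and a follow-up question, rewrite the "
    "question so it is fully standalone.\n\n"
    "Chat history:\n{history}\n\n"
    "Follow-up question: {question}\n\nStandalone question:"
)

def rewrite_query(history, question, llm):
    """Resolve pronouns and ellipsis using prior turns."""
    prompt = REWRITE_PROMPT.format(history="\n".join(history), question=question)
    return llm(prompt).strip()

# Stub LLM so the sketch runs end-to-end:
fake_llm = lambda prompt: "What is the context window of GPT-4?"
history = ["User: Tell me about GPT-4.", "Assistant: GPT-4 is an LLM ..."]
standalone = rewrite_query(history, "What is its context window?", fake_llm)
```

The point is that retrieval sees "What is the context window of GPT-4?" rather than the unanswerable "What is its context window?".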
Multi-Query Generation: the LLM produces 3-5 variations of the query for broader recall.
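A minimal sketch of that generation step, with the prompt wording and `llm` callable as assumptions; keeping the original query in the result set is a common safeguard against a bad rewrite:

```python
# Multi-query generation: ask the LLM for n rephrasings, then search
# with all of them. Prompt wording is an illustrative assumption.
def generate_query_variants(question, llm, n=4):
    prompt = (
        f"Generate {n} different phrasings of this search query, "
        f"one per line:\n{question}"
    )
    variants = [ln.strip() for ln in llm(prompt).splitlines() if ln.strip()]
    return [question] + variants[:n]  # keep the original query too

# Stub LLM returning four rephrasings:
fake_llm = lambda p: (
    "How does HyDE work?\nExplain HyDE retrieval\n"
    "HyDE embedding method\nWhat is the HyDE technique?"
)
queries = generate_query_variants("What is HyDE?", fake_llm)
```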
B. HyDE, Hypothetical Document Embeddings (Gao et al., 2022): embedding hypothetical answers
User Query (natural language) → generate → Hypothetical Doc (LLM-generated) → embed → HyDE Vector (search with this instead of the query embedding)
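The two-step flow above fits in a single function. Both `llm` and `embed` are stand-ins here, and the toy word-count embedder exists only to keep the sketch runnable:

```python
# HyDE sketch: embed an LLM-written hypothetical answer instead of the
# raw query. `llm` and `embed` are illustrative stand-ins.
def hyde_vector(query, llm, embed):
    hypo_doc = llm(f"Write a short passage that answers: {query}")  # step 1: generate
    return embed(hypo_doc)                                          # step 2: embed the passage

# Stubs: a toy embedder (word counts over a tiny vocab) keeps this runnable.
VOCAB = ["paris", "capital", "france"]
toy_embed = lambda text: [text.lower().split().count(w) for w in VOCAB]
fake_llm = lambda p: "Paris is the capital of France"
vec = hyde_vector("capital of France?", fake_llm, toy_embed)
# vec == [1, 1, 1]
```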
HyDE math: in embedding space, E[embed(hypothetical doc)] sits closer to embed(real doc) than embed(query) does, because a hypothetical answer shares vocabulary and style with real answer documents even when its facts are wrong.
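Stated slightly more formally (notation mine): with embedding model $f$, query $q$, a relevant document $d^*$, and $k$ hypothetical answers $h_1, \dots, h_k$ sampled from the LLM given $q$, HyDE's bet is that

```latex
% f: embedder, q: query, d*: relevant document,
% h_i: hypothetical answers sampled from the LLM given q
\mathrm{sim}\Bigl(f(d^*),\ \tfrac{1}{k}\sum_{i=1}^{k} f(h_i)\Bigr)
\;>\;
\mathrm{sim}\bigl(f(d^*),\ f(q)\bigr)
```

Averaging over several sampled answers is what the expectation in the slide refers to: it damps the variance of any single hallucinated passage.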
C. Sub-Question Decomposition: breaking complex queries into retrievable parts
Complex Query (multi-part question) → decompose → Sub-Questions (2-4 simpler parts) → retrieve each → Merge Results (synthesize answer)
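The decompose/retrieve-each/synthesize chain can be sketched as below; the prompts and the `llm`/`retriever` callables are assumptions, not a particular framework's API:

```python
# Decomposition sketch: split, answer each part, synthesize.
def answer_complex(query, llm, retriever):
    raw = llm(f"Break this into 2-4 simpler sub-questions, one per line:\n{query}")
    subs = [s.strip() for s in raw.splitlines() if s.strip()]
    partials = []
    for sq in subs:
        docs = retriever(sq)  # retrieve separately per sub-question
        partials.append(llm(f"Answer '{sq}' using:\n" + "\n".join(docs)))
    return llm("Synthesize one answer from:\n" + "\n".join(partials))

# Stubs that route on the prompt prefix keep the sketch runnable:
def fake_llm(prompt):
    if prompt.startswith("Break"):
        return "What is A?\nWhat is B?"
    if prompt.startswith("Synthesize"):
        return "A and B combined."
    return "partial answer"

final = answer_complex("Compare A and B", fake_llm, lambda q: ["doc1", "doc2"])
```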
Routing: each sub-question can target a different index or retriever.
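One lightweight way to do that routing is to let the LLM pick an index per sub-question; the index names, prompt, and fallback below are all illustrative assumptions:

```python
# Routing sketch: the LLM picks which index serves each sub-question.
INDEXES = {"docs", "code", "web"}

def route(sub_question, llm):
    choice = llm(
        "Pick exactly one index for this question: docs, code, or web.\n"
        f"Question: {sub_question}"
    ).strip().lower()
    return choice if choice in INDEXES else "docs"  # safe default on bad output

picked = route("How do I call the embeddings API?", lambda p: "code")
```

Guarding against off-menu answers (the `else "docs"` branch) matters in practice, since LLM classifiers occasionally return free text instead of a label.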
D. Step-Back Prompting & RAG-Fusion: broadening context and fusing multi-query results
Step-Back (generate a broader question) → parallel Dual Retrieval (original + step-back) → fuse with RAG-Fusion (RRF across queries)
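Reciprocal Rank Fusion itself is small enough to show in full; `k = 60` is the commonly used constant from the original RRF formulation:

```python
# Reciprocal Rank Fusion: score(d) = sum over rankings of 1 / (k + rank(d)).
def reciprocal_rank_fusion(rankings, k=60):
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)

# Docs that rank near the top across several query variants win:
fused = reciprocal_rank_fusion([["a", "b"], ["b", "a"], ["b", "c"]])
# fused == ["b", "a", "c"]
```

Because only ranks are used, RRF fuses result lists from retrievers whose raw scores are not comparable (e.g. BM25 alongside dense search).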
Corrective RAG (CRAG): an LLM grades the retrieved docs and triggers a web search if relevance is low.
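A CRAG-style corrective loop reduces to a filter plus a fallback. `grade` and `web_search` below are illustrative stand-ins for an LLM grader and a search tool:

```python
# CRAG sketch: keep only docs the grader accepts; if nothing passes,
# fall back to web search. `grade` and `web_search` are stand-ins.
def corrective_retrieve(query, retriever, grade, web_search):
    docs = retriever(query)
    relevant = [d for d in docs if grade(query, d) == "relevant"]
    if not relevant:                    # low relevance -> trigger web search
        relevant = web_search(query)
    return relevant

# Stubs: a grader that rejects everything forces the fallback path.
hits = corrective_retrieve(
    "q",
    retriever=lambda q: ["doc1"],
    grade=lambda q, d: "irrelevant",
    web_search=lambda q: ["web result"],
)
```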
E. Iterative & Adaptive Retrieval: retrieve, evaluate, refine, retrieve again
Initial Retrieve (first pass) → evaluate → LLM Judge (is context sufficient?) → refine → Refine Query → retrieve again
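The loop above, with `judge` and `refine` as stand-ins for LLM calls and a round cap to bound latency and cost:

```python
# Iterative retrieval sketch: judge sufficiency, refine the query,
# retrieve again; max_rounds caps the number of LLM/retriever calls.
def iterative_retrieve(query, retriever, judge, refine, max_rounds=3):
    docs = retriever(query)
    for _ in range(max_rounds - 1):
        if judge(query, docs):          # sufficient context? stop early
            break
        query = refine(query, docs)     # rewrite the query given what's missing
        docs = retriever(query)
    return query, docs

# Stubs: the judge approves only after one refinement.
final_q, final_docs = iterative_retrieve(
    "vague q",
    retriever=lambda q: [f"docs for {q}"],
    judge=lambda q, d: q.startswith("refined"),
    refine=lambda q, d: "refined " + q,
)
```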
F. Production Pipeline & Cost Analysis: putting it all together with latency and cost budgets
Latency Budget (per-strategy cost) → Decision Tree (which strategy when) → Recommended Production Config
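To make the decision-tree idea concrete, here is one possible shape for such a router; every heuristic and threshold in it is an assumption for illustration, not a prescription from this chapter:

```python
# Illustrative strategy router. The heuristics (word count, "and",
# question-mark count) are assumptions chosen only to show the shape
# of a production decision tree.
def choose_strategy(query, has_history):
    steps = []
    if has_history:
        steps.append("history_rewrite")      # cheap: one extra LLM call
    if " and " in query or query.count("?") > 1:
        steps.append("decompose")            # multi-part question
    elif len(query.split()) < 6:
        steps.append("hyde")                 # short, underspecified query
    else:
        steps.append("multi_query_fusion")   # default: breadth + RRF
    return steps
```

In a real system each branch would also carry its latency budget (rewriting adds one LLM round trip; decomposition multiplies retrieval calls by the number of sub-questions), which is what makes the per-strategy cost column above actionable.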