Ch 10 — Large Language Models
BPE internals, scaling law math, RLHF pipeline, LoRA, quantization, and inference optimization
Under the Hood
Zone A: Tokenization — BPE Algorithm (Steps 1–2)
1. BPE Training: Build vocabulary from corpus
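BPE training itself is short enough to sketch directly: count symbol pairs, merge the most frequent pair, repeat. This is a minimal illustrative implementation (function names are ours, not from any tokenizer library), using an end-of-word marker `</w>` as in the original BPE formulation.

```python
from collections import Counter

def bpe_train(corpus, num_merges):
    """Learn a list of BPE merges from a whitespace-split corpus."""
    # Each word is a tuple of symbols, terminated by an end-of-word marker.
    vocab = Counter()
    for word in corpus.split():
        vocab[tuple(word) + ("</w>",)] += 1

    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # most frequent pair
        merges.append(best)

        # Rewrite the vocabulary with the pair merged into one symbol.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            out, i = [], 0
            while i < len(symbols):
                if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                    out.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            new_vocab[tuple(out)] += freq
        vocab = new_vocab
    return merges
```

On the toy corpus `"low low low lower lowest"`, the first merges build up `lo` and then `low`, exactly because those pairs dominate the frequency counts.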
2. Encoding: Text → token IDs → embeddings
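Encoding replays the learned merges, in training order, over a new word and then maps the resulting symbols to integer IDs. The `token_to_id` table below is a hypothetical stand-in for a real vocabulary file:

```python
def bpe_encode(word, merges, token_to_id):
    """Apply learned BPE merges to one word and map symbols to token IDs."""
    symbols = list(word) + ["</w>"]
    for a, b in merges:  # merges must be applied in learned order
        out, i = [], 0
        while i < len(symbols):
            if i < len(symbols) - 1 and symbols[i] == a and symbols[i + 1] == b:
                out.append(a + b)
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        symbols = out
    return [token_to_id[s] for s in symbols]
```

The resulting IDs are what index into the embedding matrix; everything downstream of the tokenizer sees only these integers.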
↓ Tokens ready → pretraining
Zone B: Pretraining & Scaling Laws (Steps 3–5)
3. Loss Function: Cross-entropy over the vocabulary
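For a single position, the pretraining loss is just the negative log-probability the softmax assigns to the true next token. A numerically stable scalar version (subtracting the max logit before exponentiating) looks like this:

```python
import math

def next_token_loss(logits, target_id):
    """Cross-entropy of one next-token prediction, given raw logits."""
    m = max(logits)  # subtract max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    log_prob = logits[target_id] - log_sum_exp  # log softmax of the target
    return -log_prob
```

With uniform logits over a vocabulary of size V the loss is exactly log V, which is also the natural baseline to compare early training curves against.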
4. Scaling Laws: Power-law formulas
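The power-law form can be made concrete with the Chinchilla-style fit L(N, D) = E + A/N^α + B/D^β, where N is parameter count and D is training tokens. The constants below are the approximate published Chinchilla values (E≈1.69, A≈406.4, B≈410.7, α≈0.34, β≈0.28); treat them as illustrative, not authoritative:

```python
def chinchilla_loss(N, D, E=1.69, A=406.4, B=410.7, alpha=0.34, beta=0.28):
    """Approximate Chinchilla loss fit: parameters N, training tokens D."""
    return E + A / N**alpha + B / D**beta

def best_N_for_budget(C, candidate_Ns):
    """Pick the candidate model size minimizing loss at fixed compute C.

    Uses the standard approximation C ≈ 6 * N * D FLOPs, so a compute
    budget C spent on N parameters allows D = C / (6N) training tokens.
    """
    return min(candidate_Ns, key=lambda N: chinchilla_loss(N, C / (6 * N)))
```

Sweeping model sizes at a fixed budget recovers the familiar result that the optimal token-to-parameter ratio is on the order of 20:1.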
5. Training Infra: Distributed training, mixed precision
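One small but essential piece of mixed-precision training is dynamic loss scaling: multiply the loss by a large factor so fp16 gradients don't underflow, halve the factor on overflow, and grow it back after a run of clean steps. This is a hypothetical minimal sketch of that control logic, not the `torch.cuda.amp.GradScaler` implementation:

```python
class DynamicLossScaler:
    """Minimal dynamic loss-scaling policy for fp16 training (illustrative)."""

    def __init__(self, scale=2.0**15, growth_interval=2000):
        self.scale = scale                  # current loss multiplier
        self.growth_interval = growth_interval
        self.good_steps = 0                 # consecutive overflow-free steps

    def update(self, found_inf):
        """Call once per step with whether any gradient was inf/NaN."""
        if found_inf:
            self.scale /= 2.0               # back off; the step is skipped
            self.good_steps = 0
        else:
            self.good_steps += 1
            if self.good_steps >= self.growth_interval:
                self.scale *= 2.0           # cautiously probe a larger scale
                self.good_steps = 0
```

The gradients are divided by `scale` before the optimizer step, so a well-chosen scale is invisible to optimization while keeping small gradients representable.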
↓ Base model → alignment
Zone C: Alignment — SFT, RLHF, DPO (Steps 6–7)
6. SFT + Reward: Instruction tuning + reward model
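The reward model is trained on preference pairs with the Bradley–Terry objective: push the scalar reward of the chosen response above that of the rejected one via a logistic loss. In one line:

```python
import math

def reward_pair_loss(r_chosen, r_rejected):
    """Bradley-Terry pairwise loss: -log sigmoid(r_chosen - r_rejected)."""
    margin = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

When the two rewards are equal the loss is log 2, and it falls toward zero as the chosen response is scored increasingly higher than the rejected one.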
7. RLHF & DPO: PPO vs. direct preference optimization
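DPO sidesteps the PPO loop by turning the preference pair directly into a loss on the policy. The inputs are log-probabilities of the chosen (`_w`) and rejected (`_l`) responses under the policy (`pi_`) and the frozen reference model (`ref_`); `beta` is the KL-strength hyperparameter from the DPO paper:

```python
import math

def dpo_loss(pi_logp_w, pi_logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """DPO objective: -log sigmoid(beta * difference of log-ratio margins)."""
    # How much more the policy (vs. reference) favors chosen over rejected.
    margin = beta * ((pi_logp_w - ref_logp_w) - (pi_logp_l - ref_logp_l))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

At initialization the policy equals the reference, the margin is zero, and the loss starts at log 2; gradient descent then increases the policy's relative preference for the chosen response, with no reward model or sampling loop in sight.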
↓ Efficient adaptation
Zone D: LoRA, Quantization & Efficient Fine-Tuning (Steps 8–9)
8. LoRA: Low-rank adaptation
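The LoRA forward pass is y = xW + (α/r)·xAB, where only the low-rank factors A (d×r) and B (r×k) are trained and W stays frozen. A dependency-free sketch with plain nested lists (illustrative, not a library API):

```python
def matmul(X, Y):
    """Naive matrix multiply for small list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_forward(x, W, A, B, alpha, r):
    """y = x W + (alpha / r) * x A B; only A and B receive gradients."""
    base = matmul(x, W)               # frozen pretrained path
    delta = matmul(matmul(x, A), B)   # low-rank trainable path
    s = alpha / r                     # standard LoRA scaling factor
    return [[b + s * d for b, d in zip(brow, drow)]
            for brow, drow in zip(base, delta)]
```

Initializing B to zeros makes the adapted model exactly match the base model at step 0, which is why LoRA fine-tuning starts from the pretrained behavior rather than perturbing it.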
9. Quantization: INT8, INT4, GPTQ, AWQ
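The simplest scheme, symmetric per-tensor INT8, already shows the core idea that GPTQ and AWQ refine: store a float scale plus small integers, and reconstruct w ≈ scale·q. A minimal round-to-nearest sketch:

```python
def quantize_int8(weights):
    """Symmetric per-tensor INT8: w ≈ scale * q with q in [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Reconstruct approximate float weights from integers and the scale."""
    return [scale * v for v in q]
```

Round-to-nearest bounds the per-weight error by scale/2; methods like GPTQ and AWQ improve on this by choosing rounding and scales with the layer's actual activations in mind rather than treating weights in isolation.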
↓ Inference optimization
Zone E: Inference — Sampling & Serving (Step 10)
10. Serving Stack: Batching, speculative decoding
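One step of speculative decoding can be sketched with two callables: a cheap draft model proposes k tokens, the target model verifies them and keeps the longest agreed prefix plus one token of its own. This greedy-match version is a simplification; the full algorithm accepts draft tokens stochastically based on probability ratios, and in a real server the k verifications happen in a single batched forward pass:

```python
def speculative_decode_step(draft_next, target_next, prefix, k=4):
    """One greedy speculative-decoding step (illustrative simplification).

    draft_next / target_next: callables mapping a token list to the
    model's next token. Returns the tokens accepted this step.
    """
    # Draft phase: the cheap model proposes k tokens autoregressively.
    proposed, ctx = [], list(prefix)
    for _ in range(k):
        t = draft_next(ctx)
        proposed.append(t)
        ctx.append(t)

    # Verify phase: keep the longest prefix the target agrees with.
    accepted, ctx = [], list(prefix)
    for t in proposed:
        if target_next(ctx) != t:
            break
        accepted.append(t)
        ctx.append(t)

    # The target always contributes one token past the accepted prefix,
    # so even total disagreement still makes progress.
    accepted.append(target_next(ctx))
    return accepted
```

When the draft agrees with the target, each step emits up to k+1 tokens for roughly the cost of one target pass, which is where the serving-side speedup comes from.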