Ch 10 — Large Language Models
BPE internals, scaling law math, RLHF pipeline, LoRA, quantization, and inference optimization
Under the Hood
Zone A: Tokenization — BPE Algorithm (Steps 1–2)
1. BPE Training: Build vocabulary from corpus
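BPE training itself is short enough to sketch directly: count symbol pairs, merge the most frequent pair, repeat. This is a minimal illustrative implementation (function names are ours, not from any tokenizer library), using an end-of-word marker `</w>` as in the original BPE formulation.

```python
from collections import Counter

def bpe_train(corpus, num_merges):
    """Learn a list of BPE merges from a whitespace-split corpus."""
    # Each word is a tuple of symbols, terminated by an end-of-word marker.
    vocab = Counter()
    for word in corpus.split():
        vocab[tuple(word) + ("</w>",)] += 1

    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # most frequent pair
        merges.append(best)

        # Rewrite the vocabulary with the pair merged into one symbol.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            out, i = [], 0
            while i < len(symbols):
                if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                    out.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            new_vocab[tuple(out)] += freq
        vocab = new_vocab
    return merges
```

On the toy corpus `"low low low lower lowest"`, the first merges build up `lo` and then `low`, exactly because those pairs dominate the frequency counts.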
2. Encoding: Text → token IDs → embeddings
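Encoding replays the learned merges, in training order, over a new word and then maps the resulting symbols to integer IDs. The `token_to_id` table below is a hypothetical stand-in for a real vocabulary file:

```python
def bpe_encode(word, merges, token_to_id):
    """Apply learned BPE merges to one word and map symbols to token IDs."""
    symbols = list(word) + ["</w>"]
    for a, b in merges:  # merges must be applied in learned order
        out, i = [], 0
        while i < len(symbols):
            if i < len(symbols) - 1 and symbols[i] == a and symbols[i + 1] == b:
                out.append(a + b)
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        symbols = out
    return [token_to_id[s] for s in symbols]
```

The resulting IDs are what index into the embedding matrix; everything downstream of the tokenizer sees only these integers.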
↓ Tokens ready → pretraining
Zone B: Pretraining & Scaling Laws (Steps 3–5)
3. Loss Function: Cross-entropy over the vocabulary
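For a single position, the pretraining loss is just the negative log-probability the softmax assigns to the true next token. A numerically stable scalar version (subtracting the max logit before exponentiating) looks like this:

```python
import math

def next_token_loss(logits, target_id):
    """Cross-entropy of one next-token prediction, given raw logits."""
    m = max(logits)  # subtract max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    log_prob = logits[target_id] - log_sum_exp  # log softmax of the target
    return -log_prob
```

With uniform logits over a vocabulary of size V the loss is exactly log V, which is also the natural baseline to compare early training curves against.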
4. Scaling Laws: Power-law formulas
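The power-law form can be made concrete with the Chinchilla-style fit L(N, D) = E + A/N^α + B/D^β, where N is parameter count and D is training tokens. The constants below are the approximate published Chinchilla values (E≈1.69, A≈406.4, B≈410.7, α≈0.34, β≈0.28); treat them as illustrative, not authoritative:

```python
def chinchilla_loss(N, D, E=1.69, A=406.4, B=410.7, alpha=0.34, beta=0.28):
    """Approximate Chinchilla loss fit: parameters N, training tokens D."""
    return E + A / N**alpha + B / D**beta

def best_N_for_budget(C, candidate_Ns):
    """Pick the candidate model size minimizing loss at fixed compute C.

    Uses the standard approximation C ≈ 6 * N * D FLOPs, so a compute
    budget C spent on N parameters allows D = C / (6N) training tokens.
    """
    return min(candidate_Ns, key=lambda N: chinchilla_loss(N, C / (6 * N)))
```

Sweeping model sizes at a fixed budget recovers the familiar result that the optimal token-to-parameter ratio is on the order of 20:1.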
5. Training Infra: Distributed training, mixed precision
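One small but essential piece of mixed-precision training is dynamic loss scaling: multiply the loss by a large factor so fp16 gradients don't underflow, halve the factor on overflow, and grow it back after a run of clean steps. This is a hypothetical minimal sketch of that control logic, not the `torch.cuda.amp.GradScaler` implementation:

```python
class DynamicLossScaler:
    """Minimal dynamic loss-scaling policy for fp16 training (illustrative)."""

    def __init__(self, scale=2.0**15, growth_interval=2000):
        self.scale = scale                  # current loss multiplier
        self.growth_interval = growth_interval
        self.good_steps = 0                 # consecutive overflow-free steps

    def update(self, found_inf):
        """Call once per step with whether any gradient was inf/NaN."""
        if found_inf:
            self.scale /= 2.0               # back off; the step is skipped
            self.good_steps = 0
        else:
            self.good_steps += 1
            if self.good_steps >= self.growth_interval:
                self.scale *= 2.0           # cautiously probe a larger scale
                self.good_steps = 0
```

The gradients are divided by `scale` before the optimizer step, so a well-chosen scale is invisible to optimization while keeping small gradients representable.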
↓ Base model → alignment
Zone C: Alignment — SFT, RLHF, DPO (Steps 6–7)
6. SFT + Reward: Instruction tuning + reward model
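The reward model is trained on preference pairs with the Bradley–Terry objective: push the scalar reward of the chosen response above that of the rejected one via a logistic loss. In one line:

```python
import math

def reward_pair_loss(r_chosen, r_rejected):
    """Bradley-Terry pairwise loss: -log sigmoid(r_chosen - r_rejected)."""
    margin = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

When the two rewards are equal the loss is log 2, and it falls toward zero as the chosen response is scored increasingly higher than the rejected one.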
7. RLHF & DPO: PPO vs. direct preference optimization
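DPO sidesteps the PPO loop by turning the preference pair directly into a loss on the policy. The inputs are log-probabilities of the chosen (`_w`) and rejected (`_l`) responses under the policy (`pi_`) and the frozen reference model (`ref_`); `beta` is the KL-strength hyperparameter from the DPO paper:

```python
import math

def dpo_loss(pi_logp_w, pi_logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """DPO objective: -log sigmoid(beta * difference of log-ratio margins)."""
    # How much more the policy (vs. reference) favors chosen over rejected.
    margin = beta * ((pi_logp_w - ref_logp_w) - (pi_logp_l - ref_logp_l))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

At initialization the policy equals the reference, the margin is zero, and the loss starts at log 2; gradient descent then increases the policy's relative preference for the chosen response, with no reward model or sampling loop in sight.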
↓ Efficient adaptation
Zone D: LoRA, Quantization & Efficient Fine-Tuning (Steps 8–9)
8. LoRA: Low-rank adaptation
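The LoRA forward pass is y = xW + (α/r)·xAB, where only the low-rank factors A (d×r) and B (r×k) are trained and W stays frozen. A dependency-free sketch with plain nested lists (illustrative, not a library API):

```python
def matmul(X, Y):
    """Naive matrix multiply for small list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_forward(x, W, A, B, alpha, r):
    """y = x W + (alpha / r) * x A B; only A and B receive gradients."""
    base = matmul(x, W)               # frozen pretrained path
    delta = matmul(matmul(x, A), B)   # low-rank trainable path
    s = alpha / r                     # standard LoRA scaling factor
    return [[b + s * d for b, d in zip(brow, drow)]
            for brow, drow in zip(base, delta)]
```

Initializing B to zeros makes the adapted model exactly match the base model at step 0, which is why LoRA fine-tuning starts from the pretrained behavior rather than perturbing it.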
9. Quantization: INT8, INT4, GPTQ, AWQ
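The simplest scheme, symmetric per-tensor INT8, already shows the core idea that GPTQ and AWQ refine: store a float scale plus small integers, and reconstruct w ≈ scale·q. A minimal round-to-nearest sketch:

```python
def quantize_int8(weights):
    """Symmetric per-tensor INT8: w ≈ scale * q with q in [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Reconstruct approximate float weights from integers and the scale."""
    return [scale * v for v in q]
```

Round-to-nearest bounds the per-weight error by scale/2; methods like GPTQ and AWQ improve on this by choosing rounding and scales with the layer's actual activations in mind rather than treating weights in isolation.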
↓ Inference optimization
Zone E: Inference — Sampling & Serving (Step 10)
10. Serving Stack: Batching, speculative decoding
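One step of speculative decoding can be sketched with two callables: a cheap draft model proposes k tokens, the target model verifies them and keeps the longest agreed prefix plus one token of its own. This greedy-match version is a simplification; the full algorithm accepts draft tokens stochastically based on probability ratios, and in a real server the k verifications happen in a single batched forward pass:

```python
def speculative_decode_step(draft_next, target_next, prefix, k=4):
    """One greedy speculative-decoding step (illustrative simplification).

    draft_next / target_next: callables mapping a token list to the
    model's next token. Returns the tokens accepted this step.
    """
    # Draft phase: the cheap model proposes k tokens autoregressively.
    proposed, ctx = [], list(prefix)
    for _ in range(k):
        t = draft_next(ctx)
        proposed.append(t)
        ctx.append(t)

    # Verify phase: keep the longest prefix the target agrees with.
    accepted, ctx = [], list(prefix)
    for t in proposed:
        if target_next(ctx) != t:
            break
        accepted.append(t)
        ctx.append(t)

    # The target always contributes one token past the accepted prefix,
    # so even total disagreement still makes progress.
    accepted.append(target_next(ctx))
    return accepted
```

When the draft agrees with the target, each step emits up to k+1 tokens for roughly the cost of one target pass, which is where the serving-side speedup comes from.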