Ch 1 — What Is Fine-Tuning — Under the Hood

Transfer learning, the training loop, loss functions, and learning rate schedules
A. Transfer Learning Foundation: why fine-tuning works at all

   Pre-Trained (general knowledge) → Fine-Tune (specialize weights) → Specialized (your task model)
   Note: the loss function is cross-entropy over next-token prediction, masked to response tokens only.
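The masked cross-entropy described here can be sketched in a few lines of plain Python. The vocabulary, logits, and labels below are toy values for illustration only; in practice this is what PyTorch's `cross_entropy` does when prompt positions carry the ignore label `-100`.

```python
import math

IGNORE_INDEX = -100  # positions with this label are excluded from the loss

def masked_cross_entropy(logits, labels):
    """Mean negative log-likelihood over positions whose label is not IGNORE_INDEX."""
    total, count = 0.0, 0
    for row, label in zip(logits, labels):
        if label == IGNORE_INDEX:
            continue  # prompt token: contributes nothing to the loss
        log_z = math.log(sum(math.exp(x) for x in row))  # log of softmax denominator
        total += log_z - row[label]                      # -log softmax(row)[label]
        count += 1
    return total / count

# Two prompt positions (masked) and two response positions (scored), toy 2-token vocab
logits = [[2.0, 0.0], [0.0, 2.0], [3.0, 0.0], [0.0, 3.0]]
labels = [IGNORE_INDEX, IGNORE_INDEX, 0, 1]
loss = masked_cross_entropy(logits, labels)
```

Because the first two positions are masked, changing their logits leaves the loss unchanged; only the response tokens are scored.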
B. The Training Loop: forward pass, loss, backward pass, optimizer step

   Batch (tokenized examples) → Model (predict next token) → Backprop (compute gradients) → Optimizer (AdamW step)
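The forward / loss / backward / optimizer cycle can be sketched end to end on a toy one-parameter problem. This is not a real model: the "forward pass" is a quadratic loss and the gradient is analytic, but the AdamW update follows the standard formulation (moment estimates, bias correction, decoupled weight decay); all hyperparameter values are illustrative.

```python
import math

def adamw_step(w, g, m, v, t, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, wd=0.01):
    """One AdamW update for a single scalar parameter."""
    m = beta1 * m + (1 - beta1) * g          # first-moment (momentum) estimate
    v = beta2 * v + (1 - beta2) * g * g      # second-moment estimate
    m_hat = m / (1 - beta1 ** t)             # bias correction
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * (m_hat / (math.sqrt(v_hat) + eps) + wd * w)  # decoupled weight decay
    return w, m, v

w, m, v = 0.0, 0.0, 0.0
for t in range(1, 201):
    loss = (w - 3.0) ** 2        # forward pass: toy stand-in for the model's loss
    grad = 2.0 * (w - 3.0)       # backward pass: analytic gradient of the loss
    w, m, v = adamw_step(w, grad, m, v, t)  # optimizer step
```

In a real fine-tuning run the same cycle repeats per batch, with autograd computing the gradients for every weight.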
   Note: training memory = weights (fp16) + optimizer states (2× fp32 per parameter) + gradients (fp16) + activations.
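Using that breakdown (2 bytes/param for fp16 weights, 8 bytes/param for two fp32 Adam moments, 2 bytes/param for fp16 gradients), the static footprint is easy to estimate. Activations are omitted because they scale with batch size and sequence length; the 7B parameter count is an illustrative example, not from the text.

```python
def static_training_memory_gb(n_params):
    """Weights (fp16) + two fp32 optimizer moments + gradients (fp16), in GB."""
    bytes_per_param = 2 + 2 * 4 + 2  # = 12 bytes per parameter
    return n_params * bytes_per_param / 1e9

print(static_training_memory_gb(7e9))  # e.g. a 7B-parameter model → 84.0
```

This is why full fine-tuning of even a 7B model exceeds a single consumer GPU, before activations are counted.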
C. Learning Rate & Hyperparameters: the knobs that control training quality

   LR Schedule (warmup + cosine decay) → Batch Size (effective batch) → Epochs (1-3 typical)
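The warmup-plus-cosine schedule named above can be written in a few lines: the learning rate climbs linearly to its peak, then follows a half-cosine down to zero. The peak value of 2e-4 is an illustrative assumption, not from the text.

```python
import math

def lr_at(step, total_steps, warmup_steps, peak_lr=2e-4):
    """Linear warmup to peak_lr, then cosine decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps              # linear warmup
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))  # cosine decay
```

Warmup avoids large, noisy updates while the optimizer's moment estimates are still cold; the decay lets training settle into a minimum.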
   Note: catastrophic forgetting: why it happens, and how LoRA, a low learning rate, and data mixing prevent it.
D. Supervised Fine-Tuning (SFT) Internals: how the HuggingFace TRL SFTTrainer works

   Dataset (instruction pairs) → Chat Template (Jinja2 format) → Tokens (input IDs + labels)
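In the real pipeline, the tokenizer's bundled Jinja2 chat template renders the messages list into the exact string the model was trained on. As a stand-in, here is a toy ChatML-style renderer; the marker tokens mimic one common convention, and real templates vary by model.

```python
def render_chatml(messages):
    """Toy stand-in for a tokenizer's Jinja2 chat template (ChatML-style markers)."""
    text = ""
    for m in messages:
        text += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    return text + "<|im_start|>assistant\n"  # open the assistant turn for generation

prompt = render_chatml([
    {"role": "user", "content": "Summarize LoRA in one line."},
])
```

Using the wrong template at inference time is a classic source of degraded fine-tuned-model quality, since the token sequence no longer matches training.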
   Note: label masking: compute the loss only on assistant response tokens, not on prompt tokens.
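Concretely, label masking means the labels array mirrors the input IDs except that prompt positions are set to `-100`, the value PyTorch's cross-entropy skips. The token IDs below are hypothetical.

```python
IGNORE_INDEX = -100  # PyTorch's cross_entropy ignores targets equal to this value

def build_labels(prompt_ids, response_ids):
    """Concatenate prompt and response; mask prompt positions out of the loss."""
    input_ids = prompt_ids + response_ids
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)
    return input_ids, labels

# Hypothetical token IDs for illustration
input_ids, labels = build_labels([5, 12, 7], [42, 9, 2])
```

The model still attends to the prompt tokens; they simply contribute no gradient, so training optimizes only the assistant's responses.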
E. OpenAI Fine-Tuning API: the managed alternative to self-hosted training

   Upload JSONL (messages format) → Train (OpenAI servers) → Deploy (ft:gpt-4o-mini:...)
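The upload step expects a JSONL file: one JSON object per line, each containing a chat `messages` array. The conversation content below is an invented example; only the file shape matters.

```python
import json

examples = [
    {"messages": [
        {"role": "system", "content": "You are a concise support bot."},
        {"role": "user", "content": "How do I reset my password?"},
        {"role": "assistant", "content": "Open Settings, choose Security, then Reset password."},
    ]},
]

with open("train.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")  # one training example per line
```

Each line is a complete conversation; the assistant turns are what the managed trainer learns to reproduce.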
F. Evaluation After Fine-Tuning: how to know if fine-tuning actually helped

   Eval Set (held-out test data) → Base vs Fine-Tuned (side-by-side comparison) → Ship or Iterate (deploy if better)
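The side-by-side comparison reduces to scoring both models' outputs on the same held-out set and shipping only if the fine-tuned model wins. A minimal sketch with a toy exact-match metric; the predictions and references are invented, and real evaluations typically use richer metrics or LLM-as-judge scoring.

```python
def exact_match(preds, refs):
    """Fraction of predictions matching the reference exactly (toy metric)."""
    return sum(p.strip() == r.strip() for p, r in zip(preds, refs)) / len(refs)

# Hypothetical outputs from each model on the same held-out eval set
refs       = ["4", "Paris", "blue"]
base_preds = ["5", "Paris", "red"]
ft_preds   = ["4", "Paris", "blue"]

base_acc = exact_match(base_preds, refs)  # 1/3
ft_acc   = exact_match(ft_preds, refs)    # 3/3
ship = ft_acc > base_acc                  # deploy only if fine-tuning helped
```

Holding the eval set out of training is the critical detail: scoring on training examples would make any fine-tuned model look better than it is.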