Ch 1 — What Is Fine-Tuning — Under the Hood

Transfer learning, the training loop, loss functions, and learning rate schedules
A. Transfer Learning Foundation: why fine-tuning works at all

   Pre-Trained (general knowledge) → Fine-Tune (specialize weights) → Specialized (your task model)
   Note: the loss function is cross-entropy over next-token prediction, masked to response tokens only.
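The masked cross-entropy described here can be sketched in a few lines of plain Python. The vocabulary, logits, and labels below are toy values for illustration only; in practice this is what PyTorch's `cross_entropy` does when prompt positions carry the ignore label `-100`.

```python
import math

IGNORE_INDEX = -100  # positions with this label are excluded from the loss

def masked_cross_entropy(logits, labels):
    """Mean negative log-likelihood over positions whose label is not IGNORE_INDEX."""
    total, count = 0.0, 0
    for row, label in zip(logits, labels):
        if label == IGNORE_INDEX:
            continue  # prompt token: contributes nothing to the loss
        log_z = math.log(sum(math.exp(x) for x in row))  # log of softmax denominator
        total += log_z - row[label]                      # -log softmax(row)[label]
        count += 1
    return total / count

# Two prompt positions (masked) and two response positions (scored), toy 2-token vocab
logits = [[2.0, 0.0], [0.0, 2.0], [3.0, 0.0], [0.0, 3.0]]
labels = [IGNORE_INDEX, IGNORE_INDEX, 0, 1]
loss = masked_cross_entropy(logits, labels)
```

Because the first two positions are masked, changing their logits leaves the loss unchanged; only the response tokens are scored.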
B. The Training Loop: forward pass, loss, backward pass, optimizer step

   Batch (tokenized examples) → Model (predict next token) → Backprop (compute gradients) → Optimizer (AdamW step)
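The forward / loss / backward / optimizer cycle can be sketched end to end on a toy one-parameter problem. This is not a real model: the "forward pass" is a quadratic loss and the gradient is analytic, but the AdamW update follows the standard formulation (moment estimates, bias correction, decoupled weight decay); all hyperparameter values are illustrative.

```python
import math

def adamw_step(w, g, m, v, t, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, wd=0.01):
    """One AdamW update for a single scalar parameter."""
    m = beta1 * m + (1 - beta1) * g          # first-moment (momentum) estimate
    v = beta2 * v + (1 - beta2) * g * g      # second-moment estimate
    m_hat = m / (1 - beta1 ** t)             # bias correction
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * (m_hat / (math.sqrt(v_hat) + eps) + wd * w)  # decoupled weight decay
    return w, m, v

w, m, v = 0.0, 0.0, 0.0
for t in range(1, 201):
    loss = (w - 3.0) ** 2        # forward pass: toy stand-in for the model's loss
    grad = 2.0 * (w - 3.0)       # backward pass: analytic gradient of the loss
    w, m, v = adamw_step(w, grad, m, v, t)  # optimizer step
```

In a real fine-tuning run the same cycle repeats per batch, with autograd computing the gradients for every weight.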
   Note: training memory = weights (fp16) + optimizer states (2× fp32 per parameter) + gradients (fp16) + activations.
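Using that breakdown (2 bytes/param for fp16 weights, 8 bytes/param for two fp32 Adam moments, 2 bytes/param for fp16 gradients), the static footprint is easy to estimate. Activations are omitted because they scale with batch size and sequence length; the 7B parameter count is an illustrative example, not from the text.

```python
def static_training_memory_gb(n_params):
    """Weights (fp16) + two fp32 optimizer moments + gradients (fp16), in GB."""
    bytes_per_param = 2 + 2 * 4 + 2  # = 12 bytes per parameter
    return n_params * bytes_per_param / 1e9

print(static_training_memory_gb(7e9))  # e.g. a 7B-parameter model → 84.0
```

This is why full fine-tuning of even a 7B model exceeds a single consumer GPU, before activations are counted.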
C. Learning Rate & Hyperparameters: the knobs that control training quality

   LR Schedule (warmup + cosine decay) → Batch Size (effective batch) → Epochs (1-3 typical)
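The warmup-plus-cosine schedule named above can be written in a few lines: the learning rate climbs linearly to its peak, then follows a half-cosine down to zero. The peak value of 2e-4 is an illustrative assumption, not from the text.

```python
import math

def lr_at(step, total_steps, warmup_steps, peak_lr=2e-4):
    """Linear warmup to peak_lr, then cosine decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps              # linear warmup
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))  # cosine decay
```

Warmup avoids large, noisy updates while the optimizer's moment estimates are still cold; the decay lets training settle into a minimum.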
   Note: catastrophic forgetting: why it happens, and how LoRA, a low learning rate, and data mixing prevent it.
D. Supervised Fine-Tuning (SFT) Internals: how the HuggingFace TRL SFTTrainer works

   Dataset (instruction pairs) → Chat Template (Jinja2 format) → Tokens (input IDs + labels)
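In the real pipeline, the tokenizer's bundled Jinja2 chat template renders the messages list into the exact string the model was trained on. As a stand-in, here is a toy ChatML-style renderer; the marker tokens mimic one common convention, and real templates vary by model.

```python
def render_chatml(messages):
    """Toy stand-in for a tokenizer's Jinja2 chat template (ChatML-style markers)."""
    text = ""
    for m in messages:
        text += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    return text + "<|im_start|>assistant\n"  # open the assistant turn for generation

prompt = render_chatml([
    {"role": "user", "content": "Summarize LoRA in one line."},
])
```

Using the wrong template at inference time is a classic source of degraded fine-tuned-model quality, since the token sequence no longer matches training.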
   Note: label masking: compute the loss only on assistant response tokens, not on prompt tokens.
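Concretely, label masking means the labels array mirrors the input IDs except that prompt positions are set to `-100`, the value PyTorch's cross-entropy skips. The token IDs below are hypothetical.

```python
IGNORE_INDEX = -100  # PyTorch's cross_entropy ignores targets equal to this value

def build_labels(prompt_ids, response_ids):
    """Concatenate prompt and response; mask prompt positions out of the loss."""
    input_ids = prompt_ids + response_ids
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)
    return input_ids, labels

# Hypothetical token IDs for illustration
input_ids, labels = build_labels([5, 12, 7], [42, 9, 2])
```

The model still attends to the prompt tokens; they simply contribute no gradient, so training optimizes only the assistant's responses.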
E. OpenAI Fine-Tuning API: the managed alternative to self-hosted training

   Upload JSONL (messages format) → Train (OpenAI servers) → Deploy (ft:gpt-4o-mini:...)
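The upload step expects a JSONL file: one JSON object per line, each containing a chat `messages` array. The conversation content below is an invented example; only the file shape matters.

```python
import json

examples = [
    {"messages": [
        {"role": "system", "content": "You are a concise support bot."},
        {"role": "user", "content": "How do I reset my password?"},
        {"role": "assistant", "content": "Open Settings, choose Security, then Reset password."},
    ]},
]

with open("train.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")  # one training example per line
```

Each line is a complete conversation; the assistant turns are what the managed trainer learns to reproduce.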
F. Evaluation After Fine-Tuning: how to know if fine-tuning actually helped

   Eval Set (held-out test data) → Base vs Fine-Tuned (side-by-side comparison) → Ship or Iterate (deploy if better)
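The side-by-side comparison reduces to scoring both models' outputs on the same held-out set and shipping only if the fine-tuned model wins. A minimal sketch with a toy exact-match metric; the predictions and references are invented, and real evaluations typically use richer metrics or LLM-as-judge scoring.

```python
def exact_match(preds, refs):
    """Fraction of predictions matching the reference exactly (toy metric)."""
    return sum(p.strip() == r.strip() for p, r in zip(preds, refs)) / len(refs)

# Hypothetical outputs from each model on the same held-out eval set
refs       = ["4", "Paris", "blue"]
base_preds = ["5", "Paris", "red"]
ft_preds   = ["4", "Paris", "blue"]

base_acc = exact_match(base_preds, refs)  # 1/3
ft_acc   = exact_match(ft_preds, refs)    # 3/3
ship = ft_acc > base_acc                  # deploy only if fine-tuning helped
```

Holding the eval set out of training is the critical detail: scoring on training examples would make any fine-tuned model look better than it is.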