FGSM on Repeat
PGD is essentially FGSM applied iteratively. Instead of one big step, it takes many small gradient steps, projecting back onto the allowed perturbation ball (ε-ball) after each step. This iterative approach finds much stronger adversarial examples than FGSM's single step. Madry et al. framed adversarial robustness as a robust optimization problem: min_θ max_{δ∈Δ} L(θ, x + δ, y).
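The inner maximization is what the iterative procedure approximates. Writing S for the ε-ball and Π for projection onto it, each PGD step can be expressed as:

```latex
x^{t+1} = \Pi_{x + S}\!\left( x^{t} + \alpha \,\mathrm{sign}\!\big(\nabla_{x} L(\theta, x^{t}, y)\big) \right)
```

Each iterate takes a signed gradient step of size α and is then projected back so the total perturbation never exceeds ε.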
The Standard Benchmark
PGD became the gold standard for evaluating adversarial defenses. If a defense can’t withstand PGD, it’s not considered robust. The paper also showed that adversarial training with PGD examples produces models with “significantly improved resistance to a wide range of adversarial attacks.” Code and pre-trained models were released publicly.
# PGD: iterative FGSM with projection (assumes model, criterion,
# x, y, epsilon, alpha, num_steps are defined)
import torch

x_adv = x + torch.empty_like(x).uniform_(-epsilon, epsilon)  # random init in ε-ball
x_adv = torch.clamp(x_adv, 0, 1)
for _ in range(num_steps):
    x_adv.requires_grad_(True)
    loss = criterion(model(x_adv), y)
    grad = torch.autograd.grad(loss, x_adv)[0]
    # Small signed gradient step
    x_adv = x_adv.detach() + alpha * grad.sign()
    # Project back onto the ε-ball around x, then onto the valid pixel range
    x_adv = torch.clamp(x_adv, x - epsilon, x + epsilon)
    x_adv = torch.clamp(x_adv, 0, 1)
# Typical L∞ settings: 40 steps, α = ε/4, ε = 8/255
White-box requirement: Both FGSM and PGD require access to the model’s gradients (white-box). But adversarial examples often transfer — an attack crafted on model A frequently fools model B, enabling black-box attacks.
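A transfer attack can be sketched in a few lines. Below is a minimal, self-contained illustration: a one-step FGSM example is crafted against a surrogate model and then evaluated on a separate target model whose gradients are never accessed. The tiny `nn.Sequential` models and tensor shapes here are hypothetical placeholders, not from the paper.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical surrogate and target models (illustrative only)
surrogate = nn.Sequential(nn.Flatten(), nn.Linear(16, 2))
target = nn.Sequential(nn.Flatten(), nn.Linear(16, 2))

x = torch.rand(1, 1, 4, 4)  # toy "image" in [0, 1]
y = torch.tensor([0])
epsilon = 0.1

# Craft an FGSM example on the surrogate (white-box w.r.t. surrogate only)
x.requires_grad_(True)
loss = nn.functional.cross_entropy(surrogate(x), y)
loss.backward()
x_adv = torch.clamp(x + epsilon * x.grad.sign(), 0, 1).detach()

# Evaluate on the target model, which we never queried for gradients
with torch.no_grad():
    pred = target(x_adv).argmax(dim=1)
```

Whether the prediction actually flips depends on how similar the two models' decision boundaries are; in practice, transfer rates are high enough to make black-box attacks a realistic threat.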