Ch 10 — Privacy, Data Leakage & Model Extraction — Under the Hood
Training data extraction, membership inference, DP-SGD, Presidio, machine unlearning
A. Training Data Extraction (OWASP LLM02:2025; Carlini et al. 2023 divergence attack)
- Memorized data: PII, code, and secrets retained verbatim in model weights
- Divergence attack: repeat a single token until the model diverges from the repetition and begins emitting training data
- Extracted data: verbatim training samples recovered from the deployed model
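The mechanics of the divergence attack can be sketched in a few lines. This is an illustrative sketch, not the published attack code: the function names (`make_divergence_prompt`, `has_diverged`) and the tail-based divergence heuristic are my own assumptions about how one might probe a model for this behavior.

```python
def make_divergence_prompt(token: str, repeats: int = 50) -> str:
    """Build the repeated-token prompt used in the divergence attack
    (e.g. 'poem poem poem ...'). Token choice and repeat count are
    attack knobs; these defaults are illustrative."""
    return " ".join([token] * repeats)

def has_diverged(token: str, continuation: str, tail: int = 20) -> bool:
    """Heuristic check (an assumption, not the paper's metric): the model
    has 'diverged' once the tail of its continuation stops echoing the
    repeated token -- at which point it may be emitting memorized text."""
    tail_tokens = continuation.split()[-tail:]
    return any(t != token for t in tail_tokens)

prompt = make_divergence_prompt("poem", repeats=8)
# A continuation that keeps repeating the token has not diverged:
assert not has_diverged("poem", "poem poem poem poem")
# A continuation that drifts into other text has:
assert has_diverged("poem", "poem poem My address is 12 Elm St")
```

In the published attack, the interesting output is whatever follows the divergence point, which is then checked against known training corpora for verbatim matches.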
B. Membership Inference & Model Extraction (Shokri et al. 2017; Carlini et al., ICML 2024)
- Membership inference: determine whether a given record was in the training set
- Model extraction: reconstruct (steal) a model through repeated API queries
- Samsung incident (Apr 2023): proprietary source code leaked to ChatGPT
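The simplest form of membership inference is a loss-threshold attack: records the model fits unusually well (low loss) are flagged as likely training members. The sketch below is a minimal illustration; the calibration scheme and the toy loss values are assumptions for demonstration, not values from any paper.

```python
def loss_threshold_attack(loss: float, threshold: float) -> bool:
    """Loss-based membership inference: a record whose model loss falls
    below the threshold is flagged as a likely training-set member,
    since models tend to fit their training data more tightly."""
    return loss < threshold

# Toy calibration (an assumption for illustration): set the threshold
# from the mean loss over records known NOT to be in the training set.
nonmember_losses = [2.1, 1.9, 2.4, 2.0]
threshold = sum(nonmember_losses) / len(nonmember_losses)

assert loss_threshold_attack(0.3, threshold)      # suspiciously low loss
assert not loss_threshold_attack(2.2, threshold)  # typical non-member loss
```

Stronger attacks in the literature train shadow models to calibrate a per-example threshold, but the member/non-member loss gap they exploit is the same signal shown here.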
C. Differential Privacy & PII Detection (DP-SGD; Microsoft Presidio; anonymization)
- DP-SGD: per-example gradient clipping plus calibrated noise yields a formal privacy guarantee
- Presidio: Microsoft's open-source toolkit for PII detection and redaction
- Anonymization: replace detected PII with synthetic placeholder tokens
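The core DP-SGD aggregation step can be sketched without any ML framework. This is a minimal pure-Python illustration under assumed defaults (clip norm 1.0, noise multiplier 1.1); production implementations live in libraries such as Opacus and TensorFlow Privacy.

```python
import math
import random

def dp_sgd_step(per_example_grads, clip_norm=1.0, noise_multiplier=1.1, seed=0):
    """One DP-SGD gradient aggregation step: clip each example's gradient
    to L2 norm <= clip_norm, sum, add Gaussian noise scaled to the clip
    norm, then average. Clipping bounds any single example's influence;
    the calibrated noise converts that bound into a formal DP guarantee."""
    rng = random.Random(seed)
    dim = len(per_example_grads[0])
    summed = [0.0] * dim
    for g in per_example_grads:
        norm = math.sqrt(sum(x * x for x in g))
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for i in range(dim):
            summed[i] += g[i] * scale
    sigma = noise_multiplier * clip_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    n = len(per_example_grads)
    return [x / n for x in noisy]

# The first gradient has norm 5, so it is scaled down to norm 1 before noise.
update = dp_sgd_step([[3.0, 4.0], [0.1, -0.2]])
assert len(update) == 2
```

The privacy cost of many such steps is then tracked with a moments/RDP accountant, which maps the noise multiplier and number of steps to an overall (epsilon, delta) budget.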
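The detect-and-replace pattern behind PII redaction is easy to illustrate. The sketch below is a deliberately simplified regex stand-in, not Presidio's API: real Presidio combines NER models, pattern recognizers, and context scoring, while here the two patterns and the placeholder format are illustrative assumptions.

```python
import re

# Simplified stand-in for a PII detector/redactor in the spirit of
# Presidio: regex recognizers find spans, which are replaced with typed
# placeholder tokens. The patterns below are illustrative, not exhaustive.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace each detected PII span with a typed placeholder token."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    return text

assert redact("Mail jane.doe@example.com, SSN 123-45-6789") == \
    "Mail <EMAIL>, SSN <US_SSN>"
```

Applied to prompts before they reach the model (and to logs before storage), this kind of redaction keeps raw PII out of both training data and transcripts.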
D. Machine Unlearning & Regulatory Compliance (GDPR right to erasure; EU AI Act; gradient subtraction)
- Unlearning: remove the influence of specific records from an already-trained model
- GDPR / EU AI Act: the right to erasure extends to trained models; EU AI Act penalties reach up to 7% of global annual turnover
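The gradient-subtraction idea can be sketched as a single first-order update. This is a minimal illustration under a plain-SGD assumption (training stepped weights by `-lr * grad`, so unlearning adds that step back); it is an approximation, not a guarantee, and exact unlearning generally requires retraining.

```python
def unlearn_step(weights, forget_grad, lr=0.1):
    """First-order approximate unlearning by gradient subtraction: undo
    the training step the forget-set records contributed by moving the
    weights back along their gradient (w += lr * g reverses w -= lr * g).
    This only approximately removes the records' influence."""
    return [w + lr * g for w, g in zip(weights, forget_grad)]

w = [0.5, -0.2]
g_forget = [0.3, 0.1]   # gradient of the loss on the records to forget
w_new = unlearn_step(w, g_forget)
assert abs(w_new[0] - 0.53) < 1e-9 and abs(w_new[1] + 0.19) < 1e-9
```

In practice, approximate unlearning is usually paired with an audit (e.g. a membership-inference test on the forgotten records) to check that their influence is no longer detectable.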
E. Privacy Defense Architecture (end-to-end privacy pipeline)
- Output monitoring: detect memorized content surfacing in model responses
- Defense stack: the full pipeline, combining training-time DP, PII redaction, unlearning, and runtime monitoring