Ch 11 — Red Teaming & AI Pentesting — Under the Hood

Garak, PyRIT, DeepTeam, NIST AI RMF, DEF CON 31, building a red team program
Under the Hood
-
Click play or press Space to begin. Click any node for deep-dive details...
Step- / 10
AAI Red Teaming vs Traditional Pen TestingNon-deterministic targets, probabilistic attacks
1
compare_arrows
Key DifferencesDeterministic vs
probabilistic targets
target
Scope DefinitionModel, tools, RAG
agents, MCP
2
event
DEF CON 31AI Village Aug 2023
2,200+ participants
3
arrow_downward Automated tools: Garak LLM vulnerability scanner
BGarak: LLM Vulnerability ScannerNVIDIA — automated probe generation & detection
radar
Garak ProbesPrompt injection
jailbreak, toxicity
4
analytics
Garak DetectorsClassify model
responses as pass/fail
summarize
Garak ReportVulnerability scores
per category
5
arrow_downward Multi-turn: PyRIT orchestration framework
CPyRIT: Multi-Turn Red TeamingMicrosoft — Crescendo, TAP, Skeleton Key orchestration
chat
OrchestratorsCrescendo, TAP
Skeleton Key
6
smart_toy
Attacker LLMRed team model
generates attacks
gavel
ScorerLLM-as-Judge
evaluates success
7
arrow_downward DeepTeam & NIST AI RMF framework alignment
DDeepTeam & NIST AI RMFFramework-aligned scanning, ARIA program
checklist
DeepTeamConfident AI
40+ vulnerability types
8
account_balance
NIST AI RMFGovern, Map
Measure, Manage
science
ARIA ProgramNIST Fall 2024
public challenge
9
arrow_downward Building a red team program
EBuilding an AI Red Team ProgramTeam structure, cadence, CI/CD integration
groups
Team StructureSkills, roles
engagement model
10
layers
Red Team StackFull program
architecture