shield

AI Security — Deep Dive

From threat landscape to production hardening. Each chapter: visual journey overview + under-the-hood deep dive.
Co-Created by Kiran Shirol and Claude
Core Topics OWASP Top 10 Guardrails Red Team LLM Security
home Learning Portal play_arrow Start Learning summarize Key Insights dictionary Glossary 14 chapters · Each with High Level + Under the Hood
Offense

Attack Surface & Threat Landscape

The attacks every AI practitioner must understand — from prompt injection to adversarial ML.
1
public
The AI Security Landscape
OWASP Top 10 for LLMs, MITRE ATLAS, the AI Incident Database, and the CIA triad applied to AI.
2
dangerous
Prompt Injection — The #1 Threat
Direct and indirect injection, the confused deputy problem, and real-world incidents.
3
lock_open
Jailbreaking & Guardrail Bypass
Crescendo attacks, many-shot jailbreaking, DAN role-play, and encoded payloads.
4
science
Data Poisoning & Training-Time Attacks
Sleeper agents, PickleRAT supply chain attacks, safetensors, and model signing.
5
bug_report
Adversarial Machine Learning
FGSM, PGD, C&W attacks, evasion of safety classifiers, and transferability.
Defense

Guardrails, RAG, Agents & MCP Security

Securing the components of modern AI systems — from input filtering to tool-calling sandboxes.
6
filter_alt
Input Guardrails & Output Filtering
NeMo Guardrails, LLM Guard, canary tokens, PII detection, and layered defense.
7
search
Securing RAG Pipelines
CPA-RAG attacks, embedded threats, jamming attacks, RAGPart/RAGMask defenses.
8
smart_toy
Securing Agents & Tool Calling
AgentXploit, STAC tool chaining, WASM sandboxing, and capability-based access control.
9
hub
Securing MCP & External Integrations
MCPoison, tool poisoning, rug pull attacks, RSA manifest signing, and runtime guardrails.
Governance

Privacy, Red Teaming & Compliance

Data leakage, adversarial testing methodologies, and the regulatory landscape.
10
privacy_tip
Privacy, Data Leakage & Model Extraction
Membership inference, differential privacy, PII leakage, model stealing, GDPR/CCPA.
11
target
Red Teaming AI Systems
Garak, PyRIT, PromptFoo, MITRE ATLAS methodology, prompt fuzzing, and bug bounties.
12
gavel
AI Governance, Compliance & Risk
EU AI Act, NIST AI RMF, ISO 42001, model cards, incident response, and responsible disclosure.
Hardening

Secure Architecture & Production Hardening

Putting it all together — zero-trust patterns, defense in depth, and the full security stack.
13
architecture
Secure AI Architecture Patterns
Zero-trust for AI, layer separation, API gateways, rate limiting, and secrets management.
14
security
Production Hardening & Defense in Depth
The full security stack, continuous red teaming, incident response, and future outlook.