Unique Multimodal Risks
Multimodal AI creates risks that text-only systems largely do not:
• Deepfakes: Realistic fake images, video, and audio of real people
• Non-consensual imagery: Generating intimate images of real people
• Misinformation at scale: Fake photos/videos of events that never happened
• Identity theft: Cloning someone’s voice and face for fraud
• Surveillance: AI-powered facial recognition and tracking
• Bias amplification: Visual stereotypes embedded in training data
Scale of the Problem
• 96% of deepfakes are non-consensual intimate imagery (2023 study)
• 500% increase in deepfake fraud attempts since 2023
• $25B estimated losses from AI-generated fraud by 2027
• Election interference: AI-generated images of candidates in fabricated scenarios
• Voice cloning scams: 3 seconds of audio is enough to clone a voice convincingly
Key insight: the same technology that enables creative expression, accessibility, and productivity also enables deception, harassment, and fraud. No technical solution preserves all of the benefits while eliminating all of the harms.