13
“Attention Is All You Need” (2017) — the single architecture that powers the entire GenAI revolution.
- Self-attention lets the model weigh the importance of every word relative to every other word
- BERT (encoder, understanding) vs. GPT (decoder, generation) — same architecture, different training
- Foundation model economics: train once ($100M+), deploy millions of times
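The self-attention mechanism above can be sketched in a few lines of NumPy. This is a minimal single-head illustration with random weights (no masking, no multi-head split, no learned parameters), meant only to show how every token's output becomes a weighted mix of every other token's values:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    # Project each token embedding into query, key, and value vectors
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    # Every token scores every other token (scaled dot-product)
    scores = Q @ K.T / np.sqrt(d_k)
    # Softmax over each row turns scores into attention weights
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Output: each token is a weighted mix of all tokens' values
    return weights @ V

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))          # 4 tokens, 8-dim embeddings
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (4, 8)
```

Stacking many such heads, plus feed-forward layers and residual connections, is essentially the whole Transformer.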
14
Three-stage training pipeline: pre-train on the internet, fine-tune for helpfulness, align with human values.
- Pipeline: pre-training → SFT → RLHF. Each stage shapes different capabilities
- Hallucination rate: 3% at best. The “Reasoning Paradox” — models that reason better hallucinate more confidently
- Landscape: GPT, Claude, Gemini, LLaMA, DeepSeek, xAI — no single winner
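The three-stage pipeline can be pictured as staged transformations of one model, each stage consuming a different kind of data. This is a toy schematic (illustrative names, no real training), showing only the ordering and what each stage contributes:

```python
# Toy sketch, not a real trainer: each stage takes a model state and a
# dataset and returns the updated model. Stage names follow the pipeline.
def pretrain(model, web_corpus):
    model["skills"].append("next-token prediction")   # raw capability
    return model

def sft(model, instruction_pairs):
    model["skills"].append("instruction following")   # helpfulness
    return model

def rlhf(model, preference_rankings):
    model["skills"].append("preference alignment")    # human values
    return model

model = {"skills": []}
for stage, data in [(pretrain, "web text"),
                    (sft, "curated Q&A pairs"),
                    (rlhf, "human preference rankings")]:
    model = stage(model, data)
print(model["skills"])
```

The ordering matters: SFT and RLHF refine behavior but add little raw capability, which is why base-model quality dominates.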
15
Five levels of customization — from API calls to training from scratch. Most enterprises need levels 1–3.
- LoRA/PEFT fine-tuning: $500–$2K. Enterprise fine-tuning: $12K–$180K
- Knowledge distillation reduces inference costs by 25×
- Self-hosting break-even: ~10M tokens/day. Below that, API is cheaper
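The break-even logic is simple arithmetic: self-hosting is a fixed monthly cost, API usage scales linearly with tokens. The prices below are illustrative assumptions (not vendor quotes) chosen so the crossover lands near the ~10M tokens/day figure above:

```python
def breakeven_tokens_per_day(api_usd_per_m_tokens, selfhost_usd_per_month):
    """Daily token volume at which self-hosting matches API spend."""
    daily_fixed = selfhost_usd_per_month / 30          # amortized per day
    return daily_fixed / api_usd_per_m_tokens * 1_000_000

# Assumed figures for illustration only: $10 per 1M tokens via API,
# $3,000/month for a self-hosted deployment.
tokens = breakeven_tokens_per_day(api_usd_per_m_tokens=10.0,
                                  selfhost_usd_per_month=3_000)
print(f"{tokens:,.0f} tokens/day")  # 10,000,000 tokens/day
```

Below the crossover the API's pay-per-use wins; above it, the fixed cost amortizes and self-hosting pulls ahead.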
16
Not a trick — a skill. Good prompting delivers 89% task time savings. Bad prompting loses 40% of potential value.
- Core techniques: zero-shot, few-shot, Chain-of-Thought, system prompts, ReAct
- Seven prompt components: role, context, task, format, constraints, examples, tone
- Context engineering — the emerging discipline of managing what information reaches the model
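The seven prompt components above can be assembled mechanically. A minimal sketch (the section labels and sample text are illustrative, not a prescribed template):

```python
def build_prompt(role, context, task, fmt, constraints, examples, tone):
    """Join the seven components into one labeled prompt string."""
    sections = [
        f"Role: {role}",
        f"Context: {context}",
        f"Task: {task}",
        f"Format: {fmt}",
        f"Constraints: {constraints}",
        f"Examples: {examples}",
        f"Tone: {tone}",
    ]
    return "\n\n".join(sections)

prompt = build_prompt(
    role="You are a financial analyst.",
    context="Q3 earnings call transcript attached.",
    task="Summarize the three biggest risks mentioned.",
    fmt="Numbered list, one sentence each.",
    constraints="Quote only the transcript; no outside data.",
    examples="1. Rising input costs squeezed margins.",
    tone="Neutral and concise.",
)
print(prompt.splitlines()[0])  # Role: You are a financial analyst.
```

Not every prompt needs all seven, but making each component explicit is what separates deliberate prompting from guesswork.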
17
AI that sees, hears, reads, and generates across all modalities — text, image, audio, video.
- $4.5B market in 2025, growing at a 35%+ CAGR toward $11–23B by 2030
- Image generation: Midjourney alone produces 12M images/day; users report up to 90% reductions in content-creation time
- Deepfake risks and copyright uncertainty are the key governance challenges
Act IV Bottom Line: Generative AI is the most visible AI revolution, but also the most overhyped. Hallucinations are real, costs are significant, and the technology changes quarterly. Success requires clear use cases, robust evaluation, and a multi-model strategy.