Hard Constraints
Every AI PRD needs a section on absolute constraints — behaviors that are never acceptable regardless of other performance metrics:
Content safety:
• “The AI must not generate hate speech, violent content, or sexually explicit material.”
• “The AI must not reveal personally identifiable information from training data.”
Scope boundaries:
• “The AI must refuse requests outside its domain. A customer service bot must not provide medical, legal, or financial advice.”
• “The AI must not impersonate a human. It must identify itself as AI when asked.”
Factual constraints:
• “The AI must not fabricate citations, statistics, or quotes.”
• “When uncertain, the AI must say ‘I don’t know’ rather than guess.”
Guardrail Metrics
Beyond hard constraints, define guardrail metrics — metrics that must not degrade even as you optimize primary metrics:
Primary metric: Task completion rate (optimize this)
Guardrail metrics:
• Safety violation rate must stay <0.1%
• Hallucination rate must stay <5%
• Average response time must stay <2s
• Cost per query must stay <$0.05
Guardrails prevent the team from over-optimizing one dimension at the expense of others. A model that achieves 95% task completion but hallucinates 20% of the time has failed its guardrails.
The red team section: Include a “red team” requirement in the PRD: before launch, the product must be tested by adversarial users who deliberately try to make it fail. Prompt injection, jailbreaking, edge cases, offensive inputs. If you don’t test for abuse, users will find the vulnerabilities for you — publicly.