Ch 19 — AI Product Ethics & Safety

Bias, fairness, transparency, and the regulatory landscape every AI PM must navigate.
High Level
Regulation → Bias → Transparency → Privacy → Safety → Responsible AI Framework
The Regulatory Landscape
The EU AI Act is law. Others are following. Compliance is no longer optional.
The EU AI Act
The world’s first comprehensive AI law entered into force in August 2024. It applies to any company whose AI touches EU citizens, regardless of where the company is headquartered.

Phased enforcement:
Feb 2025: Prohibited practices banned (social scoring, real-time biometric surveillance in public spaces, manipulation of vulnerable groups)
Aug 2025: General-purpose AI obligations active (transparency, documentation for foundation models)
Aug 2026: High-risk AI obligations fully enforceable (risk management systems, data governance, human oversight, technical documentation)

Penalties: Up to €35 million or 7% of global annual turnover, whichever is higher. Documented compliance effort is a formal mitigating factor — meaning the effort to comply matters even if you fall short.
Risk Categories
The EU AI Act classifies AI systems by risk level:

Unacceptable risk (banned):
Social scoring, manipulative AI, real-time biometric surveillance (with narrow exceptions).

High risk (heavy regulation):
AI in hiring, credit scoring, healthcare, law enforcement, education, critical infrastructure. Requires risk management systems, data governance, human oversight, and technical documentation.

Limited risk (transparency obligations):
Chatbots, deepfakes, emotion recognition. Must disclose that users are interacting with AI.

Minimal risk (no specific obligations):
Spam filters, AI in video games, most recommendation systems. Still subject to general consumer protection law.
PM action: Classify your AI product’s risk level under the EU AI Act. If high-risk, start compliance work now — the August 2026 deadline requires risk management systems, technical documentation, and post-market monitoring to be in place, tested, and defensible. If limited risk, ensure your product clearly discloses AI involvement.
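The triage described above can be sketched as a simple lookup. This is an illustration of the four-tier structure only — the use-case lists below are examples drawn from this chapter, not an authoritative legal mapping, and real classification needs legal review.

```python
# Illustrative EU AI Act risk triage. Use-case lists are examples from
# this chapter, NOT an exhaustive or authoritative legal mapping.
PROHIBITED = {"social scoring", "real-time biometric surveillance"}
HIGH_RISK = {"hiring", "credit scoring", "healthcare", "law enforcement",
             "education", "critical infrastructure"}
LIMITED_RISK = {"chatbot", "deepfake", "emotion recognition"}

def classify_risk(use_case: str) -> str:
    """Return the EU AI Act risk tier for a use case (sketch only)."""
    if use_case in PROHIBITED:
        return "unacceptable"
    if use_case in HIGH_RISK:
        return "high"
    if use_case in LIMITED_RISK:
        return "limited"
    return "minimal"  # e.g. spam filters, game AI, most recommenders
```

The point of encoding even a rough version: it forces the team to write down, per feature, which tier they believe applies — and that record is itself part of a documented compliance effort.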
Bias & Fairness
AI doesn’t create bias — it amplifies the bias already in the data
Types of Bias
Data quality bias:
Training data contains errors, inconsistencies, or gaps that skew the model’s understanding.

Sampling bias:
Training data doesn’t represent the full population. A hiring model trained mostly on resumes from one demographic will disadvantage others.

Historical bias:
Training data reflects past discrimination. A lending model trained on historical loan decisions will perpetuate past biases against minority groups.

Algorithmic bias:
The model architecture or optimization objective systematically favors certain outcomes. Optimizing for engagement can amplify sensational or divisive content.

Interpretation bias:
Humans interpreting AI outputs apply their own biases. A risk score of “7/10” may be treated differently depending on the subject’s demographics.
Mitigating Bias
1. Audit the training data.
Analyze demographic representation. Identify gaps and over-representations. Document known biases in the data.

2. Test across subgroups.
Evaluate model performance separately for different demographic groups, use cases, and input types. Equal overall accuracy can mask disparate performance across groups.

3. Use fairness metrics.
Demographic parity (equal positive prediction rates), equalized odds (equal true positive and false positive rates), and individual fairness (similar individuals get similar outcomes).

4. Implement bias monitoring.
Bias isn’t a one-time fix. Monitor fairness metrics continuously in production. User populations and input distributions change over time.

5. Create a diverse review team.
People with different backgrounds catch different biases. A homogeneous team will have blind spots.
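The fairness metrics in step 3 are all computable from per-group counts of predictions and outcomes. A minimal sketch, assuming binary labels and a group attribute per record (names here are illustrative):

```python
from collections import defaultdict

def fairness_report(records):
    """Per-group rates for demographic parity and equalized odds.

    records: iterable of (group, y_true, y_pred) with 0/1 labels.
    Returns {group: {"positive_rate", "tpr", "fpr"}}.
    """
    stats = defaultdict(lambda: {"n": 0, "pred_pos": 0,
                                 "tp": 0, "pos": 0, "fp": 0, "neg": 0})
    for group, y_true, y_pred in records:
        s = stats[group]
        s["n"] += 1
        s["pred_pos"] += y_pred
        if y_true == 1:
            s["pos"] += 1
            s["tp"] += y_pred   # true positive if also predicted 1
        else:
            s["neg"] += 1
            s["fp"] += y_pred   # false positive if predicted 1
    return {
        group: {
            "positive_rate": s["pred_pos"] / s["n"],          # demographic parity
            "tpr": s["tp"] / s["pos"] if s["pos"] else None,  # equalized odds
            "fpr": s["fp"] / s["neg"] if s["neg"] else None,  # equalized odds
        }
        for group, s in stats.items()
    }
```

Comparing these rates across groups — not just overall accuracy — is exactly the subgroup testing step 2 calls for.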
The PM’s responsibility: The PM decides which fairness metrics to track, what thresholds are acceptable, and how to handle trade-offs (e.g., improving fairness for one group may reduce accuracy for another). These are product decisions, not engineering decisions. Own them explicitly.
Transparency & Explainability
Users have a right to know they’re interacting with AI — and to understand why it made a decision
Disclosure Requirements
AI disclosure:
Users must know when they’re interacting with an AI system. This is a legal requirement under the EU AI Act for chatbots and a best practice everywhere. “You’re chatting with our AI assistant” is sufficient.

Deepfake disclosure:
AI-generated or AI-manipulated content (images, audio, video) must be labeled as such.

Decision disclosure:
For high-risk decisions (hiring, lending, insurance), users have the right to know that AI was involved and to understand the key factors in the decision.

Data disclosure:
Users should know what data the AI uses to make decisions. “This recommendation is based on your purchase history and browsing behavior.”
Explainability in Practice
For users:
Explain AI decisions in plain language. “We recommended this product because customers with similar preferences also purchased it.” Not technical details — the reasoning in terms the user understands.

For regulators:
Technical documentation of the model’s architecture, training data, evaluation results, known limitations, and risk mitigations. This is the “model card” concept.

For internal teams:
Detailed tracing of individual decisions: what data was used, what model version, what prompt, what output. Needed for debugging, auditing, and incident investigation.

The explainability trade-off:
More complex models (deep learning, large LLMs) are harder to explain than simpler models (decision trees, linear regression). The PM must balance model performance against explainability requirements. For high-risk domains, explainability may be a hard requirement that constrains model choice.
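The regulator-facing model card mentioned above can be kept as a structured record rather than a free-form document, so it stays current alongside the code. The field names and example values below are illustrative, not a formal standard:

```python
from dataclasses import dataclass, field

@dataclass
class ModelCard:
    """Minimal model card: the regulator-facing fields named above.
    Field names are illustrative, not a formal standard."""
    model_name: str
    model_version: str
    architecture: str
    training_data: str
    evaluation_results: dict = field(default_factory=dict)
    known_limitations: list = field(default_factory=list)
    risk_mitigations: list = field(default_factory=list)

# Hypothetical example values, for illustration only.
card = ModelCard(
    model_name="support-triage",
    model_version="2026-01",
    architecture="fine-tuned LLM",
    training_data="anonymized support tickets, 2023-2025",
    known_limitations=["degrades on non-English tickets"],
    risk_mitigations=["human review above severity 3"],
)
```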
The transparency principle: When in doubt, disclose more, not less. Users who feel deceived by hidden AI involvement lose trust permanently. Users who are told upfront that AI is involved and given tools to understand and override it develop calibrated trust. Transparency is a competitive advantage, not a liability.
Privacy & Data Protection
AI is hungry for data. Privacy law constrains what data you can feed it.
Key Privacy Concerns
Training data privacy:
Was the model trained on personal data? Did the data subjects consent? Can the model be prompted to reproduce personal information from its training data? These questions have legal implications under GDPR and similar laws.

Input data privacy:
Users may include personal information, trade secrets, or confidential data in their prompts. Where does this data go? Is it stored? Is it used for training? Is it sent to a third-party API?

Output data risks:
The AI might inadvertently include personal information in its outputs — especially if the knowledge base contains PII or if the model memorized training data.

Data minimization:
GDPR requires collecting only the data necessary for the stated purpose. AI’s appetite for data conflicts with this principle. The PM must define what data is truly needed vs. what would be “nice to have.”
Privacy Safeguards
1. Data processing agreements.
If using third-party AI providers, ensure your DPA covers how user data is handled, stored, and (not) used for training.

2. PII detection and redaction.
Automatically detect and redact personal information from AI inputs and outputs. This prevents accidental data leakage.

3. Data retention policies.
Define how long AI interaction data is stored. Implement automated deletion. Allow users to request deletion of their data.

4. Anonymization.
When using interaction data for improvement, anonymize it first. Remove identifiers, aggregate patterns, and use differential privacy techniques where possible.

5. User controls.
Give users control over their data: opt-out of data collection, delete conversation history, download their data. These controls build trust and satisfy regulatory requirements.
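Safeguard 2 can be prototyped with pattern matching. A production system would use a dedicated PII-detection service; the two regexes below are a sketch that only catches obvious cases:

```python
import re

# Illustrative PII patterns — these only catch obvious emails and phone
# numbers; real deployments need a proper PII-detection service.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def redact_pii(text: str) -> str:
    """Replace detected PII spans with a typed placeholder."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

Run this on both inputs (before they reach the model or a third-party API) and outputs (before they reach the user or the logs).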
The provider question: When using OpenAI, Anthropic, or Google APIs, know exactly where user data goes. Does the provider use your data for training? (Most enterprise plans: no.) Where is data stored geographically? (Matters for GDPR.) What happens if the provider is breached? Document these answers in your privacy impact assessment.
Safety Engineering
Preventing harm: from content safety to autonomous action guardrails
Content Safety
Harmful content prevention:
The AI must not generate content that is violent, sexually explicit, hateful, or promotes illegal activities. Implement content filters on both inputs and outputs.

Misinformation prevention:
The AI must not present fabricated information as fact. Grounding in verified sources (RAG), hallucination detection, and source citation are the primary defenses.

Manipulation prevention:
The AI must not be used to manipulate vulnerable users, create deceptive content, or impersonate real people without disclosure.

Self-harm and crisis content:
If users express suicidal ideation or self-harm intent, the AI must respond appropriately: provide crisis resources, avoid harmful advice, and escalate to human support.
Action Safety (Agentic AI)
For AI systems that take actions (not just generate text), additional safety layers are critical:

Scope limitation:
Define exactly what actions the AI can take. Whitelist permitted actions rather than blacklisting prohibited ones.

Confirmation gates:
Require human confirmation before irreversible actions (sending emails, making purchases, modifying data).

Rate limiting:
Limit the number and frequency of actions the AI can take. Prevent runaway loops where the AI takes hundreds of actions in seconds.

Rollback capability:
Every action the AI takes should be reversible. If the AI makes a mistake, the user (or the system) can undo it.

Audit trail:
Log every action with full context: what was requested, what was done, what data was accessed, what the outcome was.
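The first four layers above compose naturally into a single gate that every agent action passes through. A minimal sketch — action names, limits, and verdict strings are all assumptions for illustration:

```python
import time

ALLOWED_ACTIONS = {"send_email", "update_ticket"}   # whitelist, not blacklist
NEEDS_CONFIRMATION = {"send_email"}                 # irreversible -> human gate
MAX_ACTIONS_PER_MINUTE = 10                         # rate limit (illustrative)

audit_log = []       # every attempt is recorded, allowed or not
_action_times = []   # timestamps of allowed actions, for rate limiting

def gate_action(action: str, confirmed: bool = False) -> str:
    """Return 'allowed', 'needs_confirmation', or 'blocked'."""
    now = time.time()
    _action_times[:] = [t for t in _action_times if now - t < 60]
    if action not in ALLOWED_ACTIONS:
        verdict = "blocked"                    # scope limitation
    elif len(_action_times) >= MAX_ACTIONS_PER_MINUTE:
        verdict = "blocked"                    # runaway-loop protection
    elif action in NEEDS_CONFIRMATION and not confirmed:
        verdict = "needs_confirmation"         # confirmation gate
    else:
        verdict = "allowed"
        _action_times.append(now)
    audit_log.append({"time": now, "action": action, "verdict": verdict})
    return verdict
```

Note that the audit entry is written regardless of the verdict — blocked attempts are often the most interesting lines in an incident investigation.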
The safety hierarchy: 1. Prevent harm (content filters, scope limits). 2. Detect harm (monitoring, user reports). 3. Mitigate harm (kill switches, rollback). 4. Learn from harm (incident analysis, improved guardrails). All four layers must be in place. No single layer is sufficient.
Human Oversight
AI should augment human decision-making, not replace human accountability
The Oversight Principle
Human agency and oversight is the first of the EU’s seven requirements for trustworthy AI, and the EU AI Act makes oversight a binding obligation for high-risk systems (Article 14): AI systems must empower rather than subordinate humans, with meaningful control retained by human decision-makers.

This means:
• Humans must be able to understand what the AI is doing
• Humans must be able to override AI decisions
• Humans must be able to intervene when the AI behaves unexpectedly
• Humans must remain accountable for outcomes, even when AI is involved

The key word is “meaningful.” A human rubber-stamping AI decisions without review is not meaningful oversight. The human must have the information, tools, and authority to actually evaluate and override.
Oversight in Practice
Human-in-the-loop:
Human reviews and approves every AI output before it reaches the end user. Highest safety, lowest efficiency. Use for high-stakes decisions (medical diagnosis, legal advice, financial approvals).

Human-on-the-loop:
AI acts autonomously, but a human monitors in real time and can intervene. Moderate safety, moderate efficiency. Use for customer support, content moderation, routine decisions.

Human-over-the-loop:
AI acts autonomously. Humans set policies, review aggregate performance, and adjust parameters. Lowest direct oversight, highest efficiency. Use for low-stakes, high-volume tasks (spam filtering, content recommendations).

The PM decides which oversight model applies to each AI capability based on the risk level, regulatory requirements, and user expectations.
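That per-capability decision can be recorded as an explicit mapping rather than left implicit in team lore. The risk tiers and mode names below are illustrative defaults a team would tune to its own regulatory context:

```python
# Illustrative default: which oversight model applies at which risk level.
# A real team would set this per capability, per regulation.
def oversight_model(risk: str) -> str:
    return {
        "high": "human-in-the-loop",     # review every output pre-delivery
        "medium": "human-on-the-loop",   # monitor live, intervene as needed
        "low": "human-over-the-loop",    # set policy, review aggregates
    }[risk]
```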
The accountability rule: When something goes wrong, a human must be accountable. “The AI did it” is never an acceptable answer to regulators, users, or the press. Define clear ownership: who is responsible for the AI’s behavior? Who reviews its outputs? Who responds to incidents? Document this chain of accountability.
Sustainability & Societal Impact
The broader responsibilities of building AI products
Environmental Impact
Energy consumption:
Training large AI models consumes significant energy. Published estimates put the original GPT-3 training run at roughly 1,300 MWh — about what 120 US households use in a year — and frontier-scale runs today are substantially larger. Inference (running the model) also has a meaningful carbon footprint at scale.

PM considerations:
• Choose the smallest model that meets quality requirements (smaller = less energy)
• Optimize token usage (fewer tokens = less compute = less energy)
• Use caching to avoid redundant computation
• Consider the environmental cost when choosing between on-premise and cloud deployment
• Track and report AI energy consumption as part of ESG reporting
Workforce Impact
Job displacement vs. augmentation:
AI products that automate tasks affect the people who currently perform those tasks. The PM should consider:

• Is the AI replacing jobs or augmenting them? Frame the product accordingly.
• What happens to the people whose work is automated? Is there a transition plan?
• Are we creating new roles (AI trainers, evaluators, oversight) as we automate old ones?
• How do we communicate workforce changes transparently?

Skill development:
AI products should help users develop skills, not create dependency. A writing assistant that makes users better writers is more ethical than one that makes users unable to write without it.
The broader view: Ethics isn’t just about compliance. It’s about building AI products that create genuine value for users and society. The PM who considers environmental impact, workforce effects, and societal consequences builds products that are sustainable in every sense — commercially, environmentally, and socially.
The Responsible AI Framework
A practical checklist for building AI products you can be proud of
Before Building
□ Risk classification complete
Know your product’s risk level under the EU AI Act and other applicable regulations.

□ Ethical review conducted
Cross-functional team has reviewed the product for potential harms, biases, and unintended consequences.

□ Data privacy assessment done
Data sources, processing, storage, and third-party sharing are documented and compliant.

□ Fairness criteria defined
Which fairness metrics will you track? What thresholds are acceptable? How will you handle trade-offs?
During Development
□ Bias testing across subgroups
Model performance evaluated separately for different demographic groups and use cases.

□ Safety guardrails implemented
Content filters, scope limits, confirmation gates, and rollback capability in place.

□ Human oversight model defined
In-the-loop, on-the-loop, or over-the-loop — with clear accountability chain.
At Launch & Beyond
□ AI disclosure in place
Users know they’re interacting with AI. Decision factors are explainable.

□ User controls available
Opt-out, data deletion, conversation history management, human escalation.

□ Continuous bias monitoring
Fairness metrics tracked in production. Alerts on disparate performance.

□ Incident response for ethical issues
Process for handling bias reports, harmful content incidents, and privacy breaches.

□ Regular ethical review
Quarterly review of AI behavior, fairness metrics, user complaints, and regulatory updates.

□ Technical documentation maintained
Model cards, risk assessments, and compliance documentation kept current.
The bottom line: Ethics and safety are not constraints on innovation — they’re prerequisites for sustainable innovation. AI products that harm users, discriminate, or violate privacy face regulatory penalties, user backlash, and reputational damage that far exceed the cost of doing it right. The PM who builds responsible AI from the start builds products that last. The PM who treats ethics as an afterthought builds products that become liabilities.