Ch 13 — UX Patterns for AI Products

Designing for uncertainty, building calibrated trust, and handling the 23% that goes wrong.
High Level: Trust → Confidence → Interaction → Errors → Handoff → Patterns
Designing for Calibrated Trust
The goal: users who trust AI where it’s strong and override it where it’s not
The Trust Spectrum
AI products face a unique UX challenge: the output is probabilistic. Sometimes it’s brilliant. Sometimes it’s confidently wrong. The UX must help users navigate this uncertainty.

Two failure modes of trust:

Zero trust (complete avoidance): Users don’t believe the AI, manually verify everything, and eventually stop using the product. The AI adds friction instead of removing it.

Blind trust (uncritical acceptance): Users accept every AI output without review. When the AI is wrong (and it will be), the consequences are undetected until they cause real damage.

The goal is calibrated trust: users develop an accurate mental model of what the AI is good at and where it struggles. They rely on it for routine tasks and apply extra scrutiny for edge cases.
Building Calibrated Trust
1. Set expectations early.
During onboarding, clearly communicate what the AI can and cannot do. “I can help with billing questions and order tracking. For account security issues, I’ll connect you with a specialist.”

2. Show your work.
Explain how the AI arrived at its answer. Source citations, reasoning steps, and confidence indicators help users evaluate outputs.

3. Make errors visible, not hidden.
When the AI is uncertain, say so. “I’m not confident about this answer” is better than a wrong answer delivered with false confidence.

4. Enable easy correction.
Make it trivial for users to flag wrong outputs, edit AI suggestions, and provide feedback. Every correction is training data for improvement.

5. Progressive disclosure.
Start with simple, high-confidence features. As users build trust, introduce more complex capabilities. Don’t overwhelm new users with the full AI feature set on day one.
The trust equation: Trust = (Reliability × Transparency) / (Risk of Failure). Increase reliability through better models. Increase transparency through UX. Decrease risk through guardrails and human oversight. The PM controls all three levers.
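The trust equation can be made concrete with a small sketch. This is illustrative only — the inputs and numbers are hypothetical, and the output is a relative score, not a calibrated metric:

```python
def trust_score(reliability: float, transparency: float, failure_risk: float) -> float:
    """Illustrative trust equation: Trust = (Reliability x Transparency) / Risk.

    Inputs are assumed normalized to (0, 1]; all values here are hypothetical.
    """
    if failure_risk <= 0:
        raise ValueError("failure_risk must be positive")
    return (reliability * transparency) / failure_risk

# Each lever moves the score: better models raise reliability,
# better UX raises transparency, guardrails lower failure risk.
baseline = trust_score(0.80, 0.50, 0.40)   # ~1.0
with_ux  = trust_score(0.80, 0.90, 0.40)   # ~1.8 (transparency lever)
guarded  = trust_score(0.80, 0.90, 0.20)   # ~3.6 (risk lever)
```

The point of the sketch: transparency and guardrails are UX and process work, so the PM can raise trust without waiting for a better model.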
The Confidence UI Pattern
Displaying uncertainty without undermining user confidence in the product
Three Questions Users Need Answered
Every AI output should help users quickly answer three questions:

1. How sure is the AI?
Is this a high-confidence answer based on strong evidence, or a best guess? Users need this signal to decide how much scrutiny to apply.

2. What is this based on?
What data, documents, or reasoning led to this output? Source attribution lets users verify the answer independently.

3. What should I do if it’s wrong?
Is there a way to correct it, get a different answer, or escalate to a human? Users need an escape hatch.
Confidence Indicators
Visual confidence signals:
Color coding: Green for high confidence, yellow for moderate, red for low
Language hedging: “Based on your order history, your package should arrive Tuesday” vs. “I’m not certain, but it may arrive Tuesday”
Source count: “Based on 3 matching documents” vs. “I couldn’t find a direct match”
Verification prompts: “Please verify this information before proceeding” for lower-confidence outputs
What NOT to Do
Don’t show raw percentages.
“87.3% confidence” means nothing to most users. Is 87% good? Bad? It depends on the task. Translate confidence into actionable language: “High confidence” or “You may want to double-check this.”

Don’t show confidence on everything.
If every answer has a confidence badge, users develop badge blindness. Reserve confidence indicators for cases where they change user behavior — when the AI is less certain than usual or when the stakes are high.

Don’t undermine the product.
Excessive uncertainty language (“I might be wrong about this, but...” on every response) erodes trust in the entire product. Calibrate the uncertainty language to actual error rates.
The design principle: Confidence UI should change user behavior proportionally to actual risk. High-confidence, low-stakes outputs need no indicator. Low-confidence, high-stakes outputs need prominent warnings. The PM defines the confidence thresholds and the corresponding UX treatment for each.
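The design principle above can be sketched as a decision rule. The threshold, labels, and treatments here are hypothetical placeholders — real products tune them against measured error rates, not round numbers:

```python
def confidence_treatment(confidence: float, high_stakes: bool,
                         threshold: float = 0.75) -> str:
    """Map model confidence plus stakes to a UX treatment.

    Threshold and treatment names are illustrative assumptions.
    """
    confident = confidence >= threshold
    if confident and not high_stakes:
        return "no_indicator"         # routine case: no badge, avoid badge blindness
    if confident and high_stakes:
        return "show_sources"         # let the user verify independently
    if not confident and not high_stakes:
        return "hedged_language"      # "You may want to double-check this."
    return "prominent_warning"        # verification prompt + easy escalation
```

Note that even a high-confidence answer gets sources when the stakes are high — confidence alone never decides the treatment.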
AI Interaction Patterns
Six patterns for the three phases of AI interaction: before, during, and after
Pre-Action Patterns
1. Intent Preview
Before the AI acts, show the user what it’s about to do and ask for confirmation. “I’ll send a refund of $49.99 to your original payment method. Proceed?”

Use for: Any action with real-world consequences (sending emails, making purchases, modifying data). Never auto-execute irreversible actions.

2. Autonomy Dial
Let users control how much autonomy the AI has. A spectrum from “suggest only” (AI proposes, human decides) to “act autonomously” (AI decides and acts, human reviews after).

Use for: Agentic AI products where the AI can take multiple actions. Let users start with low autonomy and increase it as they build trust. Power users may want full autonomy; new users want approval gates.
In-Action Patterns
3. Explainable Rationale
Show the AI’s reasoning as it works. “I found 3 relevant policies. The most recent one (updated March 2026) says...” This builds trust and helps users catch errors in the reasoning, not just the output.

4. Confidence Signal
Real-time indicators of certainty during the interaction. Streaming responses with source citations appearing as the AI generates. Progress indicators that show the AI is searching, analyzing, or generating.
Post-Action Patterns
5. Action Audit & Undo
After the AI acts, provide a clear log of what was done and a way to reverse it. “I updated 3 records. View changes. Undo all.” Every AI action should remain reversible for a reasonable window.

6. Escalation Pathway
A clear, always-available path from AI to human help. Not buried in a menu — prominently accessible. “Talk to a person” should never be more than one click away.
The pattern selection rule: Higher stakes = more patterns. A low-stakes content suggestion needs only a confidence signal. A high-stakes financial action needs intent preview + explainable rationale + action audit + escalation pathway. Match the UX overhead to the risk level.
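The pattern selection rule lends itself to a simple lookup. This mapping is a sketch — the risk tiers and the exact pattern bundles are assumptions a team would define for its own product:

```python
# Illustrative mapping from risk level to the interaction patterns applied.
PATTERNS_BY_RISK = {
    "low":    ["confidence_signal"],
    "medium": ["confidence_signal", "explainable_rationale", "action_audit_undo"],
    "high":   ["intent_preview", "explainable_rationale",
               "action_audit_undo", "escalation_pathway"],
}

def required_patterns(risk: str) -> list[str]:
    """Higher stakes -> more patterns; unknown risk defaults to the full set."""
    return PATTERNS_BY_RISK.get(risk, PATTERNS_BY_RISK["high"])
```

Defaulting unknown risk to the full pattern set is a deliberate fail-safe: unclassified actions get maximum UX overhead until someone triages them.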
Designing for AI Failures
Three types of failures, three design responses
Type 1: Predictable Errors
What: The system recognizes its own uncertainty. The retrieval confidence is low, the input is ambiguous, or the query is outside the AI’s scope.

Design response:
• Show uncertainty explicitly: “I’m not confident about this answer.”
• Ask for clarification: “Did you mean X or Y?”
• Offer alternatives: “I couldn’t find an exact match, but here are related topics.”
• Suggest human help: “For this type of question, I’d recommend speaking with a specialist.”

These are the easiest to handle because the system knows it’s uncertain.
Type 2: Edge Case Failures
What: Unexpected scenarios the system wasn’t designed for. A user asks in a language the AI doesn’t support. An input format the system can’t parse. A query that requires real-time data the system doesn’t have.

Design response:
• Graceful degradation: Fall back to a simpler capability rather than failing completely
• Clear error messaging: Explain what went wrong in user-friendly language
• Human handoff: Automatically route to a human when the AI can’t handle the request
• Feedback capture: Log the failure for the team to analyze and fix
Type 3: Silent Failures (The Dangerous Ones)
What: The AI is confident but wrong. Hallucinations. Plausible-sounding but fabricated information. The system shows no uncertainty because it doesn’t know it’s wrong.

This is the hardest failure to design for because the system itself can’t detect it in real time.

Design response:
Source citations: Always show where the answer came from so users can verify
Feedback loops: Thumbs up/down, “flag as incorrect” buttons on every output
Verification prompts: For high-stakes outputs, add “Please verify this before acting on it”
Human-in-the-loop: For critical decisions, require human approval before the AI output is finalized
Confidence calibration: Continuously improve the model’s ability to know when it doesn’t know
The 23% problem: Research shows roughly 23% of AI interactions produce unsatisfactory outputs. Your UX must handle this gracefully. The difference between a good AI product and a bad one isn’t the 77% that works — it’s how the product handles the 23% that doesn’t.
Human Handoff Design
The most important UX pattern in AI products — knowing when to stop being AI
When to Hand Off
The AI should escalate to a human when:

Confidence-based triggers:
• Retrieval confidence below threshold (no good documents found)
• Multiple conflicting answers with no clear winner
• User has asked the same question multiple times (indicating dissatisfaction)

Scope-based triggers:
• Request is outside the AI’s defined scope (legal advice, medical diagnosis)
• User explicitly asks for a human
• Sensitive topics (complaints, cancellations, safety concerns)

Behavior-based triggers:
• User sentiment turns negative (frustration, anger detected)
• Conversation exceeds a length threshold without resolution
• User provides feedback that the AI is wrong multiple times
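The three trigger families combine naturally into one escalation check. This is a minimal sketch: the field names, thresholds, and signal sources (e.g., how sentiment is detected) are all assumptions to be tuned against handoff data:

```python
from dataclasses import dataclass

@dataclass
class ConversationState:
    """Signals the handoff logic watches; defaults represent a healthy conversation."""
    retrieval_confidence: float = 1.0
    repeated_question: bool = False
    out_of_scope: bool = False
    user_requested_human: bool = False
    negative_sentiment: bool = False
    turns_without_resolution: int = 0
    wrong_answer_flags: int = 0

def should_hand_off(s: ConversationState,
                    confidence_floor: float = 0.5,
                    max_turns: int = 8) -> bool:
    """Escalate if any confidence-, scope-, or behavior-based trigger fires.

    Thresholds here are placeholders, not recommendations.
    """
    confidence_triggers = (s.retrieval_confidence < confidence_floor
                           or s.repeated_question)
    scope_triggers = s.out_of_scope or s.user_requested_human
    behavior_triggers = (s.negative_sentiment
                         or s.turns_without_resolution > max_turns
                         or s.wrong_answer_flags >= 2)
    return confidence_triggers or scope_triggers or behavior_triggers
```

An explicit user request for a human always wins — it is the one trigger that should never be threshold-gated.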
How to Hand Off
1. Preserve context.
When the human agent takes over, they should see the full conversation history, the AI’s attempted answers, and any user feedback. Nothing is more frustrating than repeating yourself to a human after the AI failed.

2. Explain the handoff.
“I want to make sure you get the best help. I’m connecting you with a specialist who can assist with this.” Frame it as a service upgrade, not an AI failure.

3. Set expectations.
“A specialist will be with you in approximately 2 minutes.” Don’t leave users in limbo.

4. Make it seamless.
The transition should feel like a natural part of the experience, not a jarring context switch. Same interface, same conversation thread, clear indicator that a human is now responding.

5. Learn from handoffs.
Every handoff is a signal that the AI couldn’t handle something. Track handoff reasons, analyze patterns, and use them to improve the AI over time.
The handoff metric: Track your “containment rate” (% of interactions resolved without human handoff) and your “handoff satisfaction” (user satisfaction with the handoff experience). A low containment rate means the AI scope is too broad. Low handoff satisfaction means the transition is broken. Both are PM-owned metrics.
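Both handoff metrics are simple to compute; the sketch below assumes handoff satisfaction is captured on a 1–5 scale, which is an illustrative choice:

```python
def containment_rate(total_interactions: int, handoffs: int) -> float:
    """Percentage of interactions resolved without a human handoff."""
    if total_interactions == 0:
        return 0.0
    return 100.0 * (total_interactions - handoffs) / total_interactions

def handoff_satisfaction(ratings: list[int]) -> float:
    """Mean user rating of the handoff experience (assumed 1-5 scale)."""
    return sum(ratings) / len(ratings) if ratings else 0.0
```

Watch the two together: a containment push that tanks handoff satisfaction usually means the AI is clinging to conversations it should release.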
Loading States & Latency
AI is slow. Here’s how to make it feel fast.
The Latency Problem
LLM responses take 2–15 seconds depending on model, prompt length, and output length. Traditional web apps respond in under 200ms. Users notice the difference.

Without proper loading UX, users think the product is broken, click away, or submit duplicate requests. Latency is the #1 UX complaint for AI products.
Streaming Responses
The most impactful pattern: stream the response token by token as the LLM generates it. Users see the answer forming in real time, which:

• Reduces perceived wait time dramatically (users start reading immediately)
• Gives a sense of progress (something is happening)
• Allows early abandonment (user can see the answer is off-track and interrupt)

Streaming is now table stakes for any chat-based AI product. If your AI product shows a spinner for 8 seconds and then dumps a wall of text, you’re losing users.
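The streaming pattern reduces to "append tokens as they arrive instead of waiting for the whole answer." The sketch below uses a toy word-level generator as a stand-in for a real streaming API; actual LLM SDKs expose similar iterators over server-sent events, but the names here are invented:

```python
import time
from typing import Iterator

def stream_tokens(answer: str, delay_s: float = 0.0) -> Iterator[str]:
    """Toy stand-in for an LLM streaming API: yield the answer word by word."""
    for token in answer.split(" "):
        time.sleep(delay_s)          # simulates per-token generation latency
        yield token + " "

def render_streaming(tokens: Iterator[str]) -> str:
    """Append each token as it arrives rather than waiting for the full response."""
    shown = ""
    for token in tokens:
        shown += token               # in a real UI: update the message element here
    return shown
```

The UX win is entirely in `render_streaming`'s loop: the user starts reading at the first token, and can interrupt mid-stream if the answer is off-track.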
Beyond Streaming
Skeleton screens: Show the structure of the response before content loads. For structured outputs (tables, cards, lists), display the layout immediately and fill in content as it arrives.

Progress narration: For multi-step AI tasks, narrate the progress: “Searching knowledge base... Found 5 relevant articles... Generating answer...” This builds trust by showing the AI’s process.

Optimistic UI: For predictable actions, show the expected result immediately and update if the AI’s actual output differs. Works well for auto-complete and suggestions.

Background processing: For non-urgent AI tasks (document analysis, batch processing), process in the background and notify when complete. Don’t block the user’s workflow.

Cached responses: For common queries, serve cached responses instantly and refresh in the background if the cache is stale.
The latency budget: <1s = instant (auto-complete, suggestions). 1–3s = streaming (chat responses). 3–10s = progress narration (complex analysis). >10s = background processing with notification. Match the UX pattern to the expected latency. Never show a blank screen for more than 1 second.
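The latency budget is effectively a lookup table from expected latency to loading pattern. A direct sketch, with the bands taken from the budget above:

```python
def latency_ux_pattern(expected_latency_s: float) -> str:
    """Match the loading-state pattern to the expected latency band."""
    if expected_latency_s < 1:
        return "instant"                      # auto-complete, suggestions
    if expected_latency_s <= 3:
        return "streaming"                    # chat responses
    if expected_latency_s <= 10:
        return "progress_narration"           # complex analysis
    return "background_with_notification"     # long-running tasks
```

The useful discipline is estimating `expected_latency_s` per feature up front, so the loading pattern is a design decision rather than an afterthought.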
Feedback Mechanisms
Every user interaction is training data — if you design the feedback loops right
Explicit Feedback
Thumbs up / thumbs down:
The simplest and most common pattern. Low friction, high volume. Provides a binary quality signal. Place it on every AI output. Expect 5–15% of users to provide feedback.

Corrections:
Let users edit the AI’s output directly. “The AI suggested X, but the correct answer is Y.” Higher quality signal than thumbs up/down but higher friction. Use for high-value outputs where accuracy matters.

Ratings:
1–5 star ratings or multi-dimension ratings (accuracy, helpfulness, tone). More nuanced than binary but lower completion rates. Use sparingly — after key interactions, not every message.

Free-text feedback:
“What was wrong with this response?” Richest signal but lowest volume. Use as an optional follow-up to negative feedback.
Implicit Feedback
Acceptance signals:
• User copies the AI’s response (positive)
• User accepts the AI’s suggestion (positive)
• User acts on the AI’s recommendation (positive)

Rejection signals:
• User regenerates the response (negative)
• User edits the response significantly (partially negative)
• User ignores the suggestion (negative)
• User escalates to human help (negative)

Engagement signals:
• Time spent reading the response (longer = more engaged or more confused)
• Follow-up questions (may indicate incomplete answer)
• Session length and return rate
Closing the Loop
Feedback is only valuable if it reaches the team and influences the product. Build a pipeline: collect feedback → aggregate → surface patterns → prioritize fixes → retrain/improve → measure impact. Show users their feedback matters: “Thanks to user feedback, we improved accuracy on billing questions by 15%.”
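The aggregate-and-surface steps of the pipeline can be sketched as a small function. The event shape and signal names are assumptions for illustration, mixing the explicit and implicit signals described above:

```python
from collections import Counter

def feedback_summary(events: list[dict]) -> dict:
    """Aggregate feedback events into a per-topic negative-share score.

    Assumed event shape: {"topic": str, "signal": str}, where signal is one of
    "thumbs_up", "copied" (positive) or "thumbs_down", "regenerated",
    "escalated" (negative).
    """
    POSITIVE = {"thumbs_up", "copied"}
    NEGATIVE = {"thumbs_down", "regenerated", "escalated"}
    by_topic: dict[str, Counter] = {}
    for e in events:
        counts = by_topic.setdefault(e["topic"], Counter())
        if e["signal"] in POSITIVE:
            counts["positive"] += 1
        elif e["signal"] in NEGATIVE:
            counts["negative"] += 1
    # Surface the topics with the worst negative share so the team can prioritize.
    return {
        topic: round(c["negative"] / (c["positive"] + c["negative"]), 2)
        for topic, c in by_topic.items()
        if c["positive"] + c["negative"] > 0
    }
```

Sorting this summary by negative share gives the "surface patterns → prioritize fixes" steps their input; the retrain-and-measure steps close the loop.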
The feedback flywheel: More users → more feedback → better AI → more trust → more users. This is the core growth loop for AI products. The PM who designs effective feedback mechanisms accelerates this flywheel. The PM who ignores feedback lets the AI stagnate.
The AI UX Checklist
A practical checklist for designing AI product experiences that build trust
Trust & Transparency
□ Expectations set during onboarding
Users know what the AI can and cannot do before they start.

□ Source citations on AI outputs
Users can verify where the answer came from.

□ Confidence indicators for uncertain outputs
Users know when to apply extra scrutiny.

□ “I don’t know” behavior defined
The AI admits uncertainty rather than hallucinating.
Control & Recovery
□ Intent preview for consequential actions
Users confirm before the AI takes irreversible actions.

□ Undo capability for AI actions
Users can reverse what the AI did.

□ Human escalation always accessible
“Talk to a person” is never more than one click away.

□ Edit and correct AI outputs
Users can fix wrong answers directly.
Performance & Feedback
□ Streaming responses for chat interactions
Users see the answer forming in real time.

□ Progress indicators for multi-step tasks
Users know the AI is working, not stuck.

□ Thumbs up/down on every AI output
Low-friction feedback mechanism always available.

□ Feedback pipeline to the team
User signals reach the people who can improve the AI.
Error Handling
□ Graceful degradation for edge cases
The product falls back to simpler capabilities rather than crashing.

□ Clear, human-friendly error messages
No technical jargon in error states.

□ Automatic handoff triggers defined
The system knows when to escalate to a human.

□ Error patterns tracked and prioritized
Recurring failures are systematically addressed.
The bottom line: AI UX is fundamentally about managing uncertainty. Traditional software is deterministic — the same input always produces the same output. AI products are probabilistic — the same input might produce different outputs of varying quality. Every pattern in this chapter exists to help users navigate that uncertainty productively. The PM who masters AI UX patterns ships products that users trust, use, and recommend.