Ch 5 — Advanced Reasoning Patterns

Tree-of-thought, decomposition, self-reflection, and verification — beyond linear chain-of-thought
Beyond Linear Chain-of-Thought
CoT is a straight line — real problems need branching, backtracking, and verification
The Limitation of Basic CoT
In Ch 4, you learned that “think step by step” dramatically improves accuracy. But basic CoT has a limitation: it’s linear. The model generates one reasoning path from start to finish. If it takes a wrong turn at Step 3, every subsequent step builds on that error. There’s no backtracking, no exploring alternatives, no checking its own work.
Three Advanced Patterns
1. Decomposition: Break a big problem into smaller sub-problems, solve each independently, then combine. Like dividing a project into tasks.

2. Tree-of-Thought (ToT): Explore multiple reasoning paths in parallel, evaluate each, and pick the best. Like brainstorming 3 approaches before committing.

3. Self-Reflection: After generating an answer, ask the model to critique its own work and improve it. Like a code review on your own code.
The Analogy
Think about how a senior engineer solves a complex production incident:

They don’t just follow one thread from start to finish. They decompose (“is this a database issue, a network issue, or an application issue?”), explore branches (“let me check the DB logs AND the network metrics simultaneously”), and verify (“I think it’s the connection pool, but let me confirm by checking the thread count”).

Advanced reasoning patterns teach the model to think like that senior engineer instead of a junior who follows one path and hopes for the best.
Key insight: Basic CoT gives the model a scratch pad. Advanced reasoning patterns give it a whiteboard — with space to branch, backtrack, and verify. The harder the problem, the more you need these patterns.
Decomposition: Break Big Problems into Small Ones
The model can’t solve a 10-step problem in one shot — but it can solve ten 1-step problems
The Technique
Instead of asking one big question, break it into a sequence of smaller, focused questions. Each sub-answer feeds into the next. The model handles each piece with full attention instead of trying to juggle everything at once.
Single Prompt (Superficial)
Prompt: “Review this Python Flask API code for security issues.”

Output:
1. You should add input validation
2. Consider using HTTPS
3. Add authentication

Generic advice. Didn’t actually analyze the code.
Decomposed Prompt (Deliberate)
Analyze this Flask API for security vulnerabilities. Work through these steps in order:

Step 1: List every user input point (request params, headers, body fields, URL path variables, file uploads).
Step 2: For each input point, trace how the data flows through the code. Does it get sanitized? Validated? Escaped?
Step 3: For each unprotected flow, identify the specific vulnerability type (SQLi, XSS, SSRF, path traversal, etc.) and the exact line of code.
Step 4: Rate each vulnerability as Critical / High / Medium / Low based on exploitability and impact.
Step 5: For each vulnerability, provide the exact code fix.

[paste code here]
Key insight: Decomposition works because each step constrains the model’s focus. In Step 1, it only thinks about inputs. In Step 2, it only traces data flow. The model doesn’t have to hold the entire problem in its “working memory” at once — it processes one piece at a time, just like a human analyst would.
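The step-by-step chaining can also be driven programmatically, with each sub-answer fed into the next prompt. A minimal sketch: `ask` is a hypothetical placeholder for whatever LLM client you use, not a real API.

```python
# Decomposition as sequential sub-prompts: each answer feeds the next prompt.
# `ask` is a hypothetical stand-in -- replace it with your provider's SDK call.

def ask(prompt: str) -> str:
    # Placeholder so the sketch runs; echoes instead of calling a model.
    return f"<answer to: {prompt[:40]}>"

def decomposed_audit(code: str) -> str:
    inputs = ask(f"Step 1 - List every user input point in this code:\n{code}")
    flows = ask("Step 2 - For each input point, trace the data flow. "
                f"Is it sanitized, validated, escaped?\nInputs:\n{inputs}\nCode:\n{code}")
    vulns = ask("Step 3 - For each unprotected flow, name the vulnerability "
                f"type and the line of code:\n{flows}")
    return ask("Step 4/5 - Rate each vulnerability Critical/High/Medium/Low "
               f"and give the exact fix:\n{vulns}")
```

Each call sees only its own sub-problem plus the previous answer, which is exactly the "one piece at a time" constraint that makes decomposition work.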
Decomposition in Action: Security Audit
The decomposed prompt finds 6 real issues where the single prompt found 3 generic ones
Sample Code Under Review
@app.route('/api/search')
def search():
    query = request.args.get('q')
    sql = f"SELECT * FROM products WHERE name LIKE '%{query}%'"
    results = db.execute(sql)
    return render_template('results.html', query=query, results=results)
@app.route('/api/upload', methods=['POST'])
def upload():
    f = request.files['document']
    path = os.path.join('/uploads', f.filename)
    f.save(path)
    return jsonify({'path': path})
Decomposed Output (Steps 1-4 Summary)
Step 1 — Input Points Found:
1. request.args.get('q') → search query
2. request.files['document'] → file upload
3. f.filename → user-controlled filename

Step 2 — Data Flow Tracing:
• 'q' → f-string → raw SQL → db.execute ⚠ NO sanitization
• 'q' → render_template → HTML output ⚠ Depends on template escaping
• f.filename → os.path.join → f.save ⚠ NO filename sanitization

Step 3 — Vulnerabilities:
1. SQL Injection (line 4): f-string SQL
2. Reflected XSS (line 6): if the template renders query via |safe or with autoescaping disabled
3. Path Traversal (line 11): filename could be "../../etc/passwd"
4. Unrestricted Upload: no file type validation, no size limit
5. Path Disclosure (line 12): returns server path in response
6. DoS via Upload: no rate limiting

Step 4 — Severity:
SQL Injection: CRITICAL
Path Traversal: CRITICAL
XSS: HIGH
Unrestricted Upload: HIGH
Path Disclosure: MEDIUM
DoS: MEDIUM
Key insight: The single prompt found 3 generic issues. The decomposed prompt found 6 specific, line-referenced vulnerabilities with severity ratings. Same model, same code — the only difference is how you structured the question.
Tree-of-Thought: Explore Multiple Paths
Instead of one reasoning chain, generate 3 approaches and pick the best
The Technique
Tree-of-Thought (Yao et al., 2023) asks the model to brainstorm multiple approaches, evaluate each one, and then commit to the best. It’s like a senior engineer saying “let me think of 3 ways to solve this before I start coding.”
The Prompt Pattern
I need to [describe the problem].

Step 1: Brainstorm 3 different approaches to solve this. For each approach, briefly describe the strategy and list its pros and cons.
Step 2: Evaluate the 3 approaches against these criteria: [your criteria — e.g., performance, maintainability, time to implement].
Step 3: Select the best approach and explain why.
Step 4: Implement the selected approach in detail.
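The brainstorm-evaluate-select loop can be scripted as well. A hedged sketch: `ask` is a hypothetical placeholder for a real LLM call, and the evaluation step is delegated back to the model.

```python
# Tree-of-Thought sketch: brainstorm N approaches, evaluate, select, implement.
# `ask` is a hypothetical stand-in for a real LLM API call.

def ask(prompt: str) -> str:
    return f"[model response to: {prompt[:30]}]"  # placeholder echo

def tree_of_thought(problem: str, criteria: str, n: int = 3) -> str:
    approaches = [
        ask(f"Approach #{i + 1} for: {problem}. "
            "Describe the strategy, then list pros and cons.")
        for i in range(n)
    ]
    evaluation = ask(f"Evaluate these {n} approaches against: {criteria}\n\n"
                     + "\n\n".join(approaches))
    choice = ask(f"Select the best approach and explain why:\n{evaluation}")
    return ask(f"Implement the selected approach in detail:\n{choice}")
```

Running the brainstorm as separate calls (rather than one "give me 3 ideas" prompt) keeps each approach from anchoring on the previous one.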
Example: Caching Strategy for an API
Problem: Our /api/products endpoint takes 2.3s to respond. It queries 3 tables and does price calculations. 80% of requests are for the same 100 products. We need sub-200ms response times.

Model brainstorms 3 approaches:

A) Redis cache with TTL
   Pros: Simple, fast reads (~1ms)
   Cons: Cache invalidation complexity, extra infrastructure
B) Materialized view + DB-level cache
   Pros: No extra infra, consistent
   Cons: Still ~50ms, refresh lag
C) CDN edge caching with stale-while-revalidate
   Pros: Globally fast, offloads server
   Cons: Stale data risk, complex purging

Evaluation: For 80% repeated products, Redis (A) gives the best latency with manageable complexity. CDN (C) is overkill for an internal API. Materialized view (B) doesn't hit the 200ms target.

Selected: Approach A (Redis cache)
Key insight: Without ToT, the model would jump to the first solution that comes to mind (usually the most common one). ToT forces it to consider alternatives, which often surfaces a better approach. It’s especially valuable for architecture and design decisions where the first idea isn’t always the best.
Domain Example: Planning a Database Migration
Single prompt gives a vague checklist — decomposed prompt gives a production-ready plan
Single Prompt
Prompt: “We need to migrate our users table to split the address into separate columns. Create a migration plan.”

Output:
1. Create new columns
2. Migrate data
3. Update application code
4. Drop old column
5. Test everything

A checklist anyone could write. No specifics, no safety considerations.
Decomposed Prompt
Plan a zero-downtime migration to split the `address` text column in our `users` table (PostgreSQL 15, 2.3M rows) into street, city, state, zip columns. Work through these phases:

Phase 1 — Schema Analysis: Examine the current schema. What data formats exist in the address column? What are the edge cases?
Phase 2 — Breaking Changes: What queries, views, indexes, and application code reference the address column? What will break?
Phase 3 — Migration Steps: Write the exact SQL for each step. Must be zero-downtime (no table locks on 2.3M rows).
Phase 4 — Rollback Plan: If something goes wrong at any step, how do we revert? Write the rollback SQL for each migration step.
Phase 5 — Validation: How do we verify the migration was correct? What queries confirm data integrity?
Key insight: The decomposed prompt produces a plan you could actually execute in production. Phase 3 gives you real SQL (ALTER TABLE ... ADD COLUMN, UPDATE with batching to avoid locks). Phase 4 gives you rollback scripts. The single prompt gives you a to-do list you’d have to flesh out yourself.
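The batched backfill that Phase 3 calls for can be sketched as a generator of range-bounded statements. This is only a sketch under assumptions: the table and column names come from the example prompt, and the split_part() parsing assumes a simple comma-separated address format, which is precisely what Phase 1 exists to verify.

```python
# Sketch of a Phase 3 batched backfill: update small id ranges so no single
# statement locks the whole 2.3M-row table. Assumes a comma-separated address
# format -- real data needs the Phase 1 analysis first.

def backfill_batches(max_id: int, batch_size: int = 10_000):
    """Yield one range-bounded UPDATE per batch."""
    for start in range(1, max_id + 1, batch_size):
        end = start + batch_size - 1
        yield (
            "UPDATE users SET "
            "street = split_part(address, ',', 1), "
            "city   = split_part(address, ',', 2) "
            f"WHERE id BETWEEN {start} AND {end} AND street IS NULL;"
        )
```

In practice each statement would run in its own transaction, with a pause between batches to keep replication lag in check; the `street IS NULL` guard makes the backfill safely resumable.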
Self-Reflection: Make the Model Review Its Own Work
Generate → Critique → Improve — the three-pass pattern
The Technique
Self-reflection asks the model to evaluate its own output and improve it. This works because the model is often better at judging quality than producing quality on the first try. It’s the same reason code reviews catch bugs the author missed — a fresh perspective (even from the same model) finds issues.
The Three-Pass Pattern
# Pass 1: Generate
"Write a technical blog post explaining how database connection pooling works, targeted at junior backend developers."

# Pass 2: Critique
"Now review your blog post critically. Check for:
- Technical accuracy
- Missing important concepts
- Unclear explanations
- Assumptions about reader knowledge
List every issue you find."

# Pass 3: Improve
"Now rewrite the blog post, fixing every issue from your critique."
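The three passes chain naturally into one function. A minimal sketch, with `ask` as a hypothetical placeholder for a real LLM call:

```python
# Generate -> Critique -> Improve as one pipeline.
# `ask` is a hypothetical stand-in for your LLM client.

def ask(prompt: str) -> str:
    return f"[response: {prompt[:25]}]"  # placeholder echo

def reflect(task: str) -> str:
    draft = ask(task)                    # Pass 1: generate
    critique = ask(                      # Pass 2: critique the draft
        "Review this critically. Check technical accuracy, missing concepts, "
        f"unclear explanations, and assumed reader knowledge:\n\n{draft}"
    )
    return ask(                          # Pass 3: rewrite against the critique
        "Rewrite it, fixing every issue in this critique:\n\n"
        f"Critique:\n{critique}\n\nOriginal:\n{draft}"
    )
```

Passing both the critique and the original draft into Pass 3 matters: the rewrite needs to see what it is fixing, not just the list of complaints.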
What the Critique Typically Finds
In practice, the model’s self-critique typically catches:

1. Missing context: “I assumed the reader knows what a TCP connection is but didn’t explain it”
2. Vague claims: “I said pooling ‘improves performance’ without quantifying how much”
3. Structural issues: “I jumped from basic concept to advanced configuration without a transition”
4. Missing examples: “I described the concept but didn’t show a code example of creating a pool”
First Draft (Pass 1)
Covers the basics but uses jargon without defining it, makes vague performance claims, and jumps between concepts without clear structure. Typical first-draft quality.
After Self-Reflection (Pass 3)
Defines terms before using them, includes specific numbers (“reduces connection overhead from ~50ms to ~1ms”), adds a code example, and flows logically from problem → solution → implementation.
Key insight: Self-reflection is like getting a free code review. The rewrite is almost always significantly better than the first draft. Use it whenever quality matters more than speed — blog posts, documentation, important emails, architecture proposals.
The Verifier: “Now Check Your Work”
A lighter version of self-reflection — one line that catches errors before they ship
The Pattern
The Verifier is a single line you append to any CoT prompt:

“Now review your answer. Are there any errors, missing steps, or incorrect assumptions? If so, correct them.”

It’s lighter than full self-reflection (one pass instead of three) but catches the most common errors: arithmetic mistakes, logical inconsistencies, and forgotten edge cases.
Example: Infrastructure Cost Estimate
Prompt:

Estimate the monthly AWS cost for:
- 3 t3.large EC2 instances (on-demand)
- 500GB gp3 EBS storage
- 1 RDS db.r5.large PostgreSQL instance
- 100GB data transfer out
Think step by step, then verify your calculations.

Model output (with verification):

...calculates each line item...
Total: $847.23/month

Verification:
✓ EC2: 3 × $0.0832/hr × 730 hrs = $182.21
✓ EBS: 500 × $0.08/GB = $40.00
✓ RDS: $0.24/hr × 730 = $175.20
✗ Data transfer: I misapplied the tiered pricing and got $449.82. All 100GB falls in the first-10TB tier at $0.09/GB: 100 × $0.09 = $9.00
✓ Corrected total: $406.41/month
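The verification pass is ordinary arithmetic, so you can replay it yourself. A quick check using the example's figures (the prices are from the example above, not current AWS rates):

```python
# Replaying the model's verification with the example's (not current) prices.
ec2  = 3 * 0.0832 * 730   # three t3.large instances, 730 hrs/month
ebs  = 500 * 0.08         # 500 GB gp3 EBS
rds  = 0.24 * 730         # one db.r5.large, 730 hrs/month
xfer = 100 * 0.09         # 100 GB out, first-10TB tier

total = round(ec2 + ebs + rds + xfer, 2)
print(total)  # 406.41 -- matches the corrected total, not the original $847.23
```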
When to Use the Verifier
Always use for:
• Calculations and estimates (catches arithmetic errors)
• Multi-step plans (catches missing steps)
• Code generation (catches syntax errors and edge cases)
• Factual claims (catches hallucinations)

Skip for:
• Creative writing (verification kills creativity)
• Simple Q&A (overhead not worth it)
• High-volume pipelines (adds latency)
Combining Patterns
These patterns stack beautifully:

Decomposition + Verifier: Break the problem into steps, solve each, then verify the whole thing.

ToT + Verifier: Brainstorm 3 approaches, pick the best, implement it, then verify the implementation.

Few-shot CoT + Self-reflection: Show a reasoning example, let the model reason, then have it critique and improve.
Key insight: The Verifier is the highest-ROI single line you can add to any prompt. It costs a few dozen extra tokens of output but catches errors that would cost you hours of debugging. Make it a habit: any time the model does math, planning, or code generation, end with “now verify your answer.”
The Pattern Selection Map
Which reasoning pattern for which problem? A decision framework
Decision Framework
# What kind of problem am I solving?

Multi-step with clear sequence → Basic CoT (Ch 4): "Think step by step"
Big problem with distinct sub-parts → Decomposition: "Work through these phases: ..."
Design decision with multiple options → Tree-of-Thought: "Brainstorm 3 approaches, evaluate, pick"
Quality-critical output → Self-Reflection (3-pass): "Generate → Critique → Rewrite"
Any reasoning task → Add the Verifier: "Now check your work"
High-stakes, must be correct → Self-Consistency (Ch 4) + Verifier: "5 runs, majority vote, each verified"
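The framework is mechanical enough to write down as a lookup. A sketch only (the categories and wording are this chapter's, condensed; the keys are illustrative, not an API):

```python
# The decision framework as a lookup table -- a sketch, not a rigid rule.
PATTERN_MAP = {
    "multi-step with clear sequence":        "Basic CoT (Ch 4): 'Think step by step'",
    "big problem with distinct sub-parts":   "Decomposition: 'Work through these phases'",
    "design decision with multiple options": "Tree-of-Thought: brainstorm, evaluate, pick",
    "quality-critical output":               "Self-Reflection: generate, critique, rewrite",
    "any reasoning task":                    "Add the Verifier: 'Now check your work'",
    "high-stakes, must be correct":          "Self-Consistency (Ch 4) + Verifier",
}

def pick_pattern(problem_kind: str) -> str:
    # Fall back to the cheapest safe default when the problem fits no bucket.
    return PATTERN_MAP.get(problem_kind, "Basic CoT (Ch 4), plus the Verifier")
```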
Real-World Combinations
Code review: Decomposition (list inputs → trace flows → find vulnerabilities → rate severity → suggest fixes)

Architecture design: ToT (3 approaches → evaluate → select) + Verifier

Technical writing: Self-reflection (draft → critique → rewrite)

Migration planning: Decomposition (schema → breaking changes → migration SQL → rollback → validation)

Cost estimation: CoT + Verifier (calculate step by step, then verify each line item)
The Effort vs Impact Trade-off
Low effort, high impact: Verifier (1 line)
Medium effort, high impact: Decomposition (structured steps)
Medium effort, medium impact: ToT (for design decisions)
High effort, high impact: Self-reflection (3 passes, 3x tokens)

Start with the Verifier on everything. Add decomposition for complex tasks. Use ToT for decisions. Reserve self-reflection for quality-critical output.
Key insight: You now have a complete reasoning toolkit: basic CoT (Ch 4) for simple multi-step problems, decomposition for complex ones, ToT for decisions, self-reflection for quality, and the Verifier for everything. The best prompt engineers don’t just use one technique — they combine them based on the problem.