Ch 7 — Output Formatting & Structured Data

JSON, XML, markdown, tables — getting reliable structured output every time
Why Output Format Matters
LLMs generate text — but your code needs data structures, not paragraphs
The Problem
Every LLM is a text generator. Left to its own devices, it will respond with natural language paragraphs. That’s fine when a human reads the output. But when your code reads the output — to parse it, store it, or pipe it to another service — you need structured, predictable, parseable data.

This chapter teaches you how to reliably get JSON, XML, markdown tables, and other structured formats from any LLM.
The Spectrum of Control
Least control: “Summarize this document” → free-form prose

Some control: “Summarize in bullet points” → semi-structured

Full control: “Return a JSON object with these exact fields: ...” → structured data

You already know the building blocks (Ch 2: Format + Constraints) and the power of examples (Ch 3). This chapter applies those techniques specifically to structured output.
Key insight: The more downstream systems depend on the output format, the more explicit your format specification needs to be. Human readers tolerate format variation; JSON.parse() does not.
The Prose Trap: When You Ask for Data but Get a Story
A real scenario — extracting invoice data for an accounting pipeline
The Scenario
Your accounting team receives PDF invoices. You use an LLM to extract key fields and feed them into QuickBooks via API. The API expects JSON with specific fields.
Superficial Prompt
Extract the key information from this invoice: [invoice text: Acme Corp, Invoice #4521, Date: 2025-03-15, 50 units Widget Pro at $24.99 each, 10% discount applied, Tax 8.5%, Total: $1,223.04]
Model Output (Prose)
This invoice is from Acme Corp, numbered 4521, dated March 15, 2025. It contains an order for 50 Widget Pro units at $24.99 each. A 10% discount was applied, bringing the subtotal to $1,124.55. With 8.5% tax ($95.59), the total comes to $1,223.04.
Why This Breaks Your Pipeline
1. No structure: It’s a paragraph. You can’t JSON.parse() this.

2. Inconsistent naming: the model writes "Acme Corp" and "numbered 4521" in prose; your API expects keys like vendor_name and invoice_number.

3. Reformatted dates: The original was “2025-03-15” (ISO 8601). The model converted it to “March 15, 2025”. Your API expects the ISO format.

4. Calculated fields: The model helpfully calculated the subtotal and tax, but you didn’t ask for those — and now you’d need to verify them.
The pattern: Without explicit format instructions, the model defaults to the most “helpful” format it knows: a natural language summary. It even reformats your data to be more readable for humans — which is exactly the opposite of what your code needs.
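To see the failure concretely: feeding the prose answer to a JSON parser raises immediately. A minimal sketch (the prose string is abbreviated from the output above):

```python
import json

# The prose output from the model (abbreviated)
prose = "This invoice is from Acme Corp, numbered 4521, dated March 15, 2025."

try:
    invoice = json.loads(prose)
except json.JSONDecodeError as exc:
    # A natural-language paragraph is not valid JSON, so parsing halts here
    print(f"Pipeline halts: {exc}")
    invoice = None
```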
The Fix: Explicit JSON with Schema
Show the model the exact shape of the JSON you want — it will fill in the values
Deliberate Prompt
Extract data from this invoice as JSON. Return ONLY the JSON object, no explanation, no markdown code fences. Use this exact schema:

{
  "vendor_name": string,
  "invoice_number": string,
  "date": string (ISO 8601: YYYY-MM-DD),
  "line_items": [
    {
      "description": string,
      "quantity": integer,
      "unit_price": number
    }
  ],
  "discount_percent": number or null,
  "tax_percent": number,
  "total": number
}

Invoice: [same invoice text as before]
Model Output (Structured)
{
  "vendor_name": "Acme Corp",
  "invoice_number": "4521",
  "date": "2025-03-15",
  "line_items": [
    {
      "description": "Widget Pro",
      "quantity": 50,
      "unit_price": 24.99
    }
  ],
  "discount_percent": 10,
  "tax_percent": 8.5,
  "total": 1223.04
}
What Made This Work
1. “Return ONLY the JSON”: Prevents the model from adding explanations
2. Exact field names: The model uses your names, not its own
3. Type annotations: integer, number, string (ISO 8601) prevent format drift
4. “No markdown code fences”: Prevents ```json ... ``` wrapping that breaks JSON.parse()
Key insight: The schema in your prompt acts as a template. The model “fills in the blanks” with extracted values. The more precise your schema, the more predictable the output. Include types, date formats, and null handling.
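Once the model returns schema-shaped JSON, a few lines of stdlib Python can parse and sanity-check it before it reaches the accounting API. A minimal sketch (the field names match the schema in the prompt; the ISO-date check via `datetime` is an added safeguard, not part of the original prompt):

```python
import json
from datetime import date

# The model's raw response (matches the schema from the prompt)
raw = '''{"vendor_name": "Acme Corp", "invoice_number": "4521",
"date": "2025-03-15",
"line_items": [{"description": "Widget Pro", "quantity": 50, "unit_price": 24.99}],
"discount_percent": 10, "tax_percent": 8.5, "total": 1223.04}'''

invoice = json.loads(raw)

# Sanity-check types and formats before forwarding to the API
assert isinstance(invoice["invoice_number"], str)
assert all(isinstance(li["quantity"], int) for li in invoice["line_items"])
date.fromisoformat(invoice["date"])  # raises ValueError if not YYYY-MM-DD

print(invoice["total"])  # 1223.04
```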
XML Tags as Delimiters
When you need multiple output sections, XML tags are more reliable than markdown headers
The Technique
XML-style tags are excellent delimiters because: (1) they’re unambiguous — <analysis> clearly marks where analysis starts, (2) they nest naturally, and (3) they’re easy to parse with regex or an XML parser. Anthropic’s Claude documentation specifically recommends XML tags for structured output.
Prompt with XML Delimiters
Analyze this pull request. Structure your response using these exact XML tags:

<summary>
One-paragraph overview of the changes
</summary>

<issues>
Bullet list of problems found
</issues>

<suggestions>
Specific code improvements with line refs
</suggestions>

<verdict>
APPROVE, REQUEST_CHANGES, or COMMENT
</verdict>

Pull request diff: [paste diff here]
Model Output
<summary>
Adds rate limiting to the /api/search endpoint using Redis. Implements a sliding window counter with 100 req/min per IP.
</summary>

<issues>
- Redis connection has no error handling; if Redis is down, all requests fail
- Rate limit key uses raw IP; behind a load balancer this will be the LB's IP
- No unit tests for the rate limiter
</issues>

<suggestions>
- Line 23: Wrap Redis calls in try/except, fall back to allowing the request
- Line 15: Use X-Forwarded-For header with fallback to remote_addr
- Add tests for: under limit, at limit, over limit, Redis down scenarios
</suggestions>

<verdict>REQUEST_CHANGES</verdict>
Parsing in Python
import re

verdict = re.search(
    r'<verdict>(.+?)</verdict>',
    response,
    re.DOTALL
).group(1).strip()
# → "REQUEST_CHANGES"
Key insight: XML tags give you the best of both worlds: the model can write natural language inside each tag, but you get reliable, parseable boundaries between sections. Use JSON when you need structured data; use XML tags when you need structured prose.
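The single-tag regex generalizes: a short helper (hypothetical, reusing the four tag names from the prompt above) can pull every section into a dict in one pass:

```python
import re

def parse_sections(response, tags=("summary", "issues", "suggestions", "verdict")):
    """Extract the text inside each <tag>...</tag> pair into a dict."""
    sections = {}
    for tag in tags:
        m = re.search(rf'<{tag}>(.*?)</{tag}>', response, re.DOTALL)
        # Missing tags map to None so callers can detect incomplete responses
        sections[tag] = m.group(1).strip() if m else None
    return sections

response = "<summary>Adds rate limiting.</summary><verdict>REQUEST_CHANGES</verdict>"
review = parse_sections(response)
print(review["verdict"])  # REQUEST_CHANGES
```

Mapping missing tags to None (rather than raising) lets downstream code decide whether a partial response is usable or should trigger a retry.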
Domain Example: Customer Feedback ETL
Processing a batch of feedback emails into structured data for your analytics dashboard
The Scenario
Your product team receives 200+ feedback emails weekly. You need to extract structured data from each one and load it into a PostgreSQL table for your analytics dashboard. The table schema has: sentiment, topic, urgency, suggested_action.
Production Prompt
Extract structured feedback data from this customer email. Return ONLY valid JSON.

Schema:
{
  "sentiment": "positive" | "negative" | "neutral",
  "topic": "bug" | "feature_request" | "ux" | "performance" | "billing" | "other",
  "urgency": "low" | "medium" | "high",
  "summary": string (max 50 words),
  "suggested_action": "fix_bug" | "add_to_roadmap" | "improve_docs" | "escalate" | "no_action"
}

Rules:
- Use ONLY the enum values listed above
- If unsure about a field, use the most conservative option
- urgency=high only if the user mentions data loss, security, or blocking issue

Email: "The new dashboard is gorgeous, love the dark mode! But the export to CSV button has been broken for 3 days now and our weekly client report is due Friday. We really need this fixed ASAP."
Model Output
{
  "sentiment": "negative",
  "topic": "bug",
  "urgency": "high",
  "summary": "CSV export broken for 3 days, blocking weekly client report due Friday. User likes new dashboard and dark mode.",
  "suggested_action": "fix_bug"
}
Why This Works at Scale
1. Enum constraints: Every field has a fixed set of valid values. Your code can validate with a simple if value in allowed_values check.

2. Decision rules in the prompt: “urgency=high only if...” prevents over-escalation. Without this rule, the model would mark everything “ASAP” as high.

3. Summary length cap: “max 50 words” keeps the summary concise and consistent across all emails.

4. Conservative default: “If unsure, use the most conservative option” prevents false positives in your triage system.
Key insight: For ETL pipelines, your prompt is essentially a data transformation specification. Treat it like a database schema: define every field, constrain every value, and specify the edge-case behavior. The more constraints, the more reliable the extraction.
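The enum constraints from the prompt can be enforced again in code before any row reaches PostgreSQL. A sketch, assuming the same field names and allowed values as the schema above (the `validate_feedback` helper is illustrative, not from the original):

```python
# Allowed values, mirroring the enums declared in the prompt schema
ALLOWED = {
    "sentiment": {"positive", "negative", "neutral"},
    "topic": {"bug", "feature_request", "ux", "performance", "billing", "other"},
    "urgency": {"low", "medium", "high"},
    "suggested_action": {"fix_bug", "add_to_roadmap", "improve_docs",
                         "escalate", "no_action"},
}

def validate_feedback(record):
    """Return a list of validation errors; an empty list means the row is loadable."""
    errors = []
    for field, allowed in ALLOWED.items():
        if record.get(field) not in allowed:
            errors.append(f"{field}: {record.get(field)!r} not in {sorted(allowed)}")
    # Enforce the 50-word summary cap from the prompt
    if len(record.get("summary", "").split()) > 50:
        errors.append("summary exceeds 50 words")
    return errors

record = {"sentiment": "negative", "topic": "bug", "urgency": "high",
          "summary": "CSV export broken for 3 days.", "suggested_action": "fix_bug"}
print(validate_feedback(record))  # []
```

Rejected rows can be routed to a retry queue or a human review step instead of silently corrupting the dashboard.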
Advanced: JSON Schema in the Prompt
For maximum reliability, give the model a formal JSON Schema — or use the API’s structured output mode
JSON Schema Approach
Return data matching this JSON Schema:

{
  "type": "object",
  "required": ["name", "email", "plan"],
  "properties": {
    "name": {"type": "string"},
    "email": {"type": "string", "format": "email"},
    "plan": {"type": "string", "enum": ["free", "pro", "enterprise"]},
    "issues": {"type": "array", "items": {"type": "string"}}
  }
}
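In production you would hand this schema to a real validator (e.g. the third-party `jsonschema` package). As a stdlib-only illustration, the `required` and `enum` rules in the schema above can be checked like this (a sketch of the idea, not a full JSON Schema implementation):

```python
# The schema from the prompt above
schema = {
    "type": "object",
    "required": ["name", "email", "plan"],
    "properties": {
        "name": {"type": "string"},
        "email": {"type": "string", "format": "email"},
        "plan": {"type": "string", "enum": ["free", "pro", "enterprise"]},
        "issues": {"type": "array", "items": {"type": "string"}},
    },
}

def check(data, schema):
    """Check required fields and enum membership; returns a list of errors."""
    errors = [f"missing required field: {f}"
              for f in schema["required"] if f not in data]
    for field, rules in schema["properties"].items():
        if field in data and "enum" in rules and data[field] not in rules["enum"]:
            errors.append(f"{field}: {data[field]!r} not in {rules['enum']}")
    return errors

print(check({"name": "Ada", "email": "ada@example.com", "plan": "pro"}, schema))  # []
```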
API-Level Structured Output
Modern LLM APIs offer structured output modes that guarantee valid JSON:

OpenAI: response_format: { type: "json_schema", json_schema: {...} } — the model is constrained at the token-generation level to only produce valid JSON matching your schema.

Anthropic: Use tool definitions with input schemas — Claude returns structured data as tool call arguments.

These API features are more reliable than prompt-only approaches because they constrain the output at the decoding level, not just the instruction level.
OpenAI Structured Output Example
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "feedback",
            "schema": your_schema
        }
    }
)
Key insight: When available, always prefer API-level structured output over prompt-only approaches. Prompt instructions are “best effort” — the model tries to follow them. API-level constraints are guaranteed — the model physically cannot produce invalid JSON.
Common Formatting Pitfalls
Mistakes that break your parser — and how to prevent them
Pitfall 1: Markdown Code Fences
❌ Model wraps JSON in code fences:

```json
{"name": "Acme Corp", "total": 1223.04}
```

Your JSON.parse() fails on the backticks.

Fix: Add "Return ONLY the raw JSON. Do not wrap in markdown code fences."
Pitfall 2: Trailing Commas
❌ Model adds a trailing comma:

{
  "name": "Acme",
  "total": 1223.04,   ← invalid JSON
}

Fix: Use API structured output mode, or post-process with a lenient parser like Python's json5 library.
Pitfall 3: Preamble Text
❌ Model adds explanation before JSON:

Here is the extracted data:
{"name": "Acme", "total": 1223.04}

Fix: "Your response must start with { and end with }. No other text."
Pitfall 4: Hallucinated Fields
The model adds fields you didn’t ask for (e.g., "confidence": 0.95). This can break strict schema validation.

Fix: “Return ONLY the fields specified in the schema. Do not add any additional fields.”
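The same defense works on the code side: drop any fields the schema does not declare before validating. A sketch (the `EXPECTED` set and helper name are illustrative):

```python
# Fields the schema declares; anything else is treated as hallucinated
EXPECTED = {"name", "total"}

def strip_extra_fields(record):
    """Keep only declared fields so strict schema validation cannot fail on extras."""
    return {k: v for k, v in record.items() if k in EXPECTED}

print(strip_extra_fields({"name": "Acme", "total": 1223.04, "confidence": 0.95}))
# {'name': 'Acme', 'total': 1223.04}
```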
Defensive Parsing Pattern
import json, re

def safe_parse(text):
    # Strip markdown fences
    text = re.sub(r'```json?\n?', '', text)
    text = re.sub(r'```', '', text)
    # Find first { to last }
    start = text.index('{')
    end = text.rindex('}') + 1
    return json.loads(text[start:end])
Key insight: Always write defensive parsers. Even with the best prompt, LLMs occasionally add preamble text, code fences, or trailing commas. Your code should handle these gracefully rather than crashing.
The Output Formatting Toolkit
Which format for which use case — a quick reference
Format Selection Guide
- Structured data for APIs/databases → JSON with explicit schema. Best for: extraction, classification, ETL pipelines
- Multi-section prose output → XML tags as delimiters. Best for: code reviews, analysis, reports with distinct sections
- Human-readable structured output → Markdown with headers. Best for: documentation, summaries, content that humans will read
- Tabular data → CSV or markdown tables. Best for: comparisons, lists, data that goes into spreadsheets
- Maximum reliability → API structured output mode. Best for: production pipelines where parsing failures are unacceptable
The Formatting Checklist
□ Specify exact format: JSON schema, XML tags, or an example
□ Constrain field values: enums, types, date formats, max lengths
□ Suppress extras: "Return ONLY...", "No explanation", "No markdown code fences"
□ Handle edge cases: "If field is missing, use null"; "If unsure, use conservative default"
□ Defensive parsing: strip fences, find JSON boundaries, validate against schema
□ Consider API-level constraints: response_format, tool definitions
Key insight: Output formatting is where prompt engineering meets software engineering. Your prompt is a data contract between the LLM and your code. The more explicit the contract, the fewer bugs in production. Combine explicit schemas, “return ONLY” constraints, and defensive parsing for maximum reliability.