Ch 3 — Progressive Disclosure & Agent Skills

Loading information in tiers so agents see only what they need, when they need it
The Multi-Domain Problem
Why loading everything upfront fails
The Scenario
Consider a customer support agent that handles billing, refunds, onboarding, and technical support. It needs instructions for all four domains. Loading all instructions upfront wastes most of the context window on irrelevant guidance — a billing question doesn’t need the onboarding playbook.
The Traditional Alternative
The old approach was spinning up separate specialized sub-agents for each domain. But this adds orchestration complexity, duplicates shared logic, and introduces latency from inter-agent communication. Neither the “load everything” nor the “separate agents” approach scales well as the number of domains grows.
Critical in AI: A support agent with 50+ instruction sets loaded upfront will have most of its context window consumed by irrelevant guidance, leaving little room for the actual conversation and retrieved documents that matter for the current query.
Three-Tier Loading
Discovery, activation, execution
Tier 1 — Discovery
At startup, the agent loads only the names and descriptions of available capabilities. This costs roughly 80 tokens per skill (median). For Anthropic’s 17 built-in skills, the total discovery cost is only ~1,700 tokens — a fraction of the context window.
Tier 2 — Activation
When the model determines a skill is relevant to the current task, the full instruction body loads (275 to 8,000 tokens depending on complexity). This is a file read, not an LLM call, so latency is minimal.
Tier 3 — Execution
Supporting scripts, reference materials, and detailed examples load only during active task execution. These are the heaviest assets and are evicted when the task completes.
Key insight: This three-tier approach mirrors how human experts work: you know what you’re capable of (discovery), you pull up the relevant playbook when needed (activation), and you consult reference materials only during execution. The agent never carries the full weight of all its capabilities at once.
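The tiering described above can be sketched in a few lines. Everything here is illustrative — the `SkillLibrary` class, its method names, and the in-memory skill store are hypothetical; a real implementation reads SKILL.md files from disk.

```python
# Hypothetical in-memory skill store; real systems read SKILL.md files.
SKILLS = {
    "refund-processing": {
        "description": "Handle customer refund requests.",
        "body": "# Refund Processing Skill\n1. Verify identity ...",
        "resources": {"policy.md": "Refund window: 30 days."},
    },
    "onboarding": {
        "description": "Guide new customers through account setup.",
        "body": "# Onboarding Skill\n1. Send welcome email ...",
        "resources": {},
    },
}

class SkillLibrary:
    def __init__(self, store):
        self.store = store
        self.active = {}   # Tier 2: activated skill bodies
        self.assets = {}   # Tier 3: resources for the running task

    def discovery_listing(self):
        # Tier 1: only names and descriptions enter the base prompt.
        return "\n".join(
            f"- {name}: {s['description']}" for name, s in self.store.items()
        )

    def activate(self, name):
        # Tier 2: a plain file read in practice, not an LLM call.
        self.active[name] = self.store[name]["body"]
        return self.active[name]

    def load_asset(self, skill, path):
        # Tier 3: heaviest material, loaded only during execution.
        self.assets[path] = self.store[skill]["resources"][path]
        return self.assets[path]

    def complete_task(self, name):
        # Evict the skill body and its assets when the task finishes.
        self.active.pop(name, None)
        self.assets.clear()
```

Note that the full bodies never enter context until `activate` is called, and `complete_task` restores the library to its discovery-only footprint.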
Agent Skills Format
Markdown files with YAML frontmatter
The Standard
An Agent Skill is a markdown file with YAML frontmatter. The format was released by Anthropic in December 2025 and adopted by OpenAI, Google, GitHub, and Cursor within weeks. The frontmatter contains the name and description (used for Tier 1 discovery), while the body contains the full instructions (loaded at Tier 2 activation).
Skill File Structure
```
---
name: refund-processing
description: Handle customer refund requests including policy lookup, eligibility checks, and payment reversal.
---

# Refund Processing Skill

## Steps
1. Verify customer identity
2. Check refund eligibility window
3. Look up original transaction
4. Apply refund policy rules
5. Process or escalate
```
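Parsing such a file is deliberately trivial. A minimal sketch, assuming flat `key: value` frontmatter (real tooling would use a YAML parser):

```python
def parse_skill(text):
    """Split a SKILL.md file into frontmatter metadata and body."""
    _, front, body = text.split("---", 2)
    meta = {}
    for line in front.strip().splitlines():
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta, body.strip()

skill_text = """---
name: refund-processing
description: Handle customer refund requests.
---
# Refund Processing Skill
1. Verify customer identity
"""

meta, body = parse_skill(skill_text)
# Only `meta` is loaded at Tier 1; `body` waits for Tier 2 activation.
```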
3-Tier Architecture (2026 Best Practice)
Modern implementations use a three-tier file hierarchy:

Tier 1: Root CLAUDE.md — Universal rules under 100 lines, loaded every session. Contains coding standards, naming conventions, and project-wide constraints.

Tier 2: Skills (.claude/skills/*/SKILL.md) — Task-specific playbooks loaded on demand for commits, PRs, migrations, etc.

Tier 3: Agent Guides (docs/agent-guides/) — Deep reference material the agent consults only when needed for complex decisions.
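Assuming the paths named above, the resulting layout looks like this (the individual file names under skills/ and agent-guides/ are illustrative):

```
project/
├── CLAUDE.md                       # Tier 1: <100 lines, loaded every session
├── .claude/
│   └── skills/
│       ├── commit/SKILL.md         # Tier 2: on-demand playbooks
│       └── migration/SKILL.md
└── docs/
    └── agent-guides/
        └── schema-design.md        # Tier 3: deep reference material
```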
Key insight: Because skills are plain English markdown, domain experts and team leads can configure agent behavior directly — without engineering expertise. This democratizes agent customization beyond the development team.
Agent Identity Management
One agent, many identities
The Pattern
Rather than routing queries to separate specialized sub-agents, a single agent assumes different identities on demand. At rest, it has a base identity. When a task activates a skill, the agent adopts that skill’s instructions, constraints, tone, and behavioral patterns. When the task completes, it returns to base.
Claude Code Example
This is exactly what Claude Code already does. It does not spin up a separate “PDF agent” and a “spreadsheet agent.” It is one agent that activates the relevant skill, shifting its identity to match. The context window carries only the active skill’s instructions, not the instructions for every possible task.
Beyond Coding Agents
Agent Skills are not only for coding agents. The pattern generalizes to customer support, internal operations, research agents, and any system where agents need broad capability with focused execution. A single customer support agent can handle billing, refunds, onboarding, and technical support by activating the relevant skill for each query.
Why it matters: Identity management eliminates the orchestration overhead of multi-agent systems while preserving the context efficiency of specialized agents. One agent with skills is simpler, cheaper, and often more effective than a fleet of specialized sub-agents.
Self-Authoring Skills
Agents that write their own playbooks
The Concept
An interesting extension of Agent Skills: agents that write their own skills. When an agent encounters a task it handles repeatedly, it can extract the pattern into a new skill file. Claude Code supports this through its skill-creator skill — the agent observes its own successful behavior, generalizes it, and makes it available for future sessions.
The Feedback Loop
This closes a powerful loop: humans write the initial skills, agents extend the library from experience. Over time, the agent’s skill library grows organically based on actual usage patterns. The quality of self-authored skills varies, but the direction is clear — agents that learn to document their own expertise.
Key insight: Self-authoring skills represent a form of agent memory that persists across sessions. Unlike conversation history (which is ephemeral), a self-authored skill becomes permanent infrastructure that benefits all future interactions.
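A self-authored skill is ultimately just a generated SKILL.md. A hypothetical `author_skill` helper shows the shape of the output (Claude Code’s skill-creator performs a much richer version of this generalization):

```python
def author_skill(name, description, steps):
    """Render a new SKILL.md from a sequence of observed successful steps.

    Hypothetical helper for illustration only.
    """
    numbered = "\n".join(f"{i}. {s}" for i, s in enumerate(steps, 1))
    return (
        f"---\nname: {name}\ndescription: {description}\n---\n"
        f"# {name.replace('-', ' ').title()} Skill\n\n"
        f"## Steps\n{numbered}\n"
    )

print(author_skill(
    "invoice-dispute",
    "Resolve disputed invoice line items.",
    ["Pull the invoice", "Compare against usage logs",
     "Issue correction or escalate"],
))
```

Once written to the skills directory, the new file participates in Tier 1 discovery like any human-authored skill.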
Token Economics of Skills
Measuring the cost advantage
Discovery Cost
```
// Token cost comparison

Without progressive disclosure:
  17 skills × ~2,000 tokens avg      = 34,000 tokens  // loaded upfront, every session

With progressive disclosure:
  17 skills × ~80 tokens (name+desc) =  1,360 tokens
  + 1 active skill                   ≈  2,000 tokens
  Total                              ≈  3,360 tokens  // 10× reduction in base context cost
```
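The arithmetic in the comparison is easy to verify (the ~2,000-token average body size is the assumed figure from above):

```python
SKILLS = 17
AVG_BODY = 2_000   # assumed average tokens per full skill body
AVG_STUB = 80      # tokens per name+description stub (median)

upfront = SKILLS * AVG_BODY                  # everything loaded, every session
progressive = SKILLS * AVG_STUB + AVG_BODY   # all stubs + one active skill

print(upfront, progressive, round(upfront / progressive, 1))
# → 34000 3360 10.1
```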
Latency Profile
Discovery data is pre-loaded at session start — no latency impact. Activation adds a file read, not an LLM call, so the latency cost is negligible (typically under 10ms). The only meaningful latency is when the model decides which skill to activate, which happens as part of its normal reasoning — no extra inference call required.
Key insight: Progressive disclosure converts a fixed upfront cost (all skills loaded) into a variable cost (only active skills loaded). For agents with many capabilities but focused tasks, this is the single highest-leverage context optimization available.
Tradeoffs and Failure Modes
Where progressive disclosure breaks down
Skill Selection Errors
The entire approach depends on selection accuracy at the discovery layer. If the model picks the wrong skill based on a brief description, the error compounds downstream — the agent follows the wrong playbook for the entire task. With a small skill set (under 20), accuracy is high. With 100+ skills, overlapping descriptions cause misactivation.
The Deactivation Problem
The key unsolved question: when does an activated skill get deactivated? Without explicit pruning logic, multiple activated skills accumulate during a session, destroying the token advantage over time. If a user asks about billing, then refunds, then onboarding, all three skill bodies may remain loaded.
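One possible mitigation — not specified by the format itself — is to cap the number of concurrently activated skills and evict the least recently used. A hypothetical sketch:

```python
from collections import OrderedDict

class SkillCache:
    """Keep at most `cap` activated skill bodies in context,
    evicting the least recently used. An illustrative pruning
    policy; the Agent Skills format defines no deactivation rule."""

    def __init__(self, cap=2):
        self.cap = cap
        self.bodies = OrderedDict()  # skill name -> full instruction body

    def activate(self, name, body):
        self.bodies[name] = body
        self.bodies.move_to_end(name)        # mark as most recently used
        while len(self.bodies) > self.cap:
            self.bodies.popitem(last=False)  # evict the oldest entry

cache = SkillCache(cap=2)
cache.activate("billing", "...billing playbook...")
cache.activate("refunds", "...refund playbook...")
cache.activate("onboarding", "...onboarding playbook...")
# "billing" has been evicted; only the two most recent skills remain
```

The tradeoff is that a skill evicted mid-conversation must be re-read if the user circles back, trading a cheap file read for a bounded context footprint.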
Governance at Scale
Maintaining 50+ skills with non-overlapping descriptions requires governance. Who writes skills? Who reviews them? How do you prevent two teams from creating overlapping skills? How do you deprecate skills that are no longer relevant? These are organizational challenges, not technical ones.
Rule of thumb: Progressive disclosure works best with 10–30 well-defined skills. Below 10, just load everything. Above 50, you need a dedicated skill governance process and likely hierarchical categorization to help the model navigate the options.
Rapid Industry Adoption
From Anthropic’s release to universal standard in weeks
The Adoption Timeline
December 2025: Anthropic releases the Agent Skills format for Claude Code.

Within weeks: OpenAI (Codex), Google (Jules), GitHub (Copilot), and Anysphere (Cursor) all adopt compatible formats. The convergence was remarkably fast — the format was simple enough (markdown + YAML) that adoption required no new infrastructure.

By March 2026: Agent Skills are the de facto standard for agent instruction management across the industry.
Why It Spread So Fast
Three factors drove rapid adoption: (1) The format is just markdown — no proprietary schema, no SDK dependency. (2) It solves a universal problem — every agent system struggles with context bloat from instructions. (3) It’s human-readable — domain experts can write and review skills without engineering support.
Key insight: The Agent Skills format succeeded because it’s the simplest possible solution to the progressive disclosure problem. A markdown file with YAML frontmatter is the lowest-friction format that still provides structured metadata for discovery. Simplicity drove adoption speed.