Ch 8 — Risks, Governance & The Future

Agent sprawl, vendor lock-in, security, and building rippable harnesses
Agent Sprawl
73% of enterprise agents are unmonitored “shadow agents”
The Problem
Agent sprawl occurs when individual developers or teams deploy AI agents without central oversight. Research indicates that 73% of enterprise AI agents are unmonitored “shadow agents” — running without governance, logging, or security review. They have access to codebases, APIs, and sometimes production systems without anyone tracking what they do.
The Risks
Security: Shadow agents may have overly broad permissions, creating attack surfaces.

Compliance: Untracked agents may violate data handling policies or regulatory requirements.

Cost: Unmonitored agents can run up significant API bills through doom loops or inefficient usage.

Quality: Without harness engineering, shadow agents produce inconsistent, unreviewed code.
Critical in AI: Agent sprawl is the AI equivalent of shadow IT. Just as organizations learned to govern cloud usage, they must now govern agent usage. An agent registry — a central catalog of all deployed agents with their permissions, harnesses, and owners — is the first step.
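A minimal sketch of what such a registry might look like in code. The record fields and the "shadow agent" check are assumptions for illustration, not a standard schema:

```python
from dataclasses import dataclass


@dataclass
class AgentRecord:
    """One entry in a hypothetical agent registry."""
    name: str
    owner: str              # team or individual accountable for the agent
    permissions: list[str]  # e.g. ["repo:read", "ci:trigger"]
    harness: str            # path to the constraint document it runs under
    reviewed: bool = False  # has this agent passed security review?


class AgentRegistry:
    """Central catalog of deployed agents: the first step against sprawl."""

    def __init__(self) -> None:
        self._agents: dict[str, AgentRecord] = {}

    def register(self, record: AgentRecord) -> None:
        self._agents[record.name] = record

    def shadow_agents(self) -> list[str]:
        """Agents running without a completed security review."""
        return [name for name, r in self._agents.items() if not r.reviewed]
```

Even this much gives an organization the question it could not answer before: which agents exist, who owns them, and which ones have never been reviewed.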
Vendor Lock-In
Lock-in at the workflow level, not the model level
The New Lock-In
Traditional vendor lock-in was about the model: if you built on GPT-4, switching to Claude was hard. The new lock-in is at the workflow level: if your entire harness is built around Claude Code’s CLAUDE.md format, Cursor’s AGENTS.md conventions, or OpenAI’s Codex API, switching platforms means rebuilding the harness.
Mitigation
Platform-agnostic constraints: Write constraints in generic markdown, generate platform-specific files.

Abstraction layers: Build your orchestration on abstractions that can target different agent platforms.

Portable skills: Structure skill files so they can be adapted to any platform’s format.

Multi-model testing: Regularly test your harness with different models to ensure it’s not model-dependent.
Key insight: The harness is your moat, but it shouldn’t be your cage. Design harnesses that are portable across platforms. The constraint logic should be separable from the platform-specific integration.
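The first mitigation above can be sketched in a few lines: keep constraints in one generic document and generate each platform's file from it. The target filenames and the generated header are assumptions for illustration, not official formats:

```python
# Single source of truth: constraints written in generic markdown.
GENERIC_CONSTRAINTS = """\
## Coding constraints
- All changes go through a feature branch.
- Run the test suite before proposing a merge.
"""

# Hypothetical platform targets; switching tools means adding a target,
# not rewriting the harness.
TARGETS = {
    "claude-code": "CLAUDE.md",
    "cursor": "AGENTS.md",
}


def render(platform: str) -> str:
    """Wrap the generic constraints in a platform-specific file body."""
    filename = TARGETS[platform]
    header = f"<!-- generated for {platform}: do not edit {filename} by hand -->"
    return f"{header}\n{GENERIC_CONSTRAINTS}"
```

The generated-file header matters: it signals that edits belong in the generic source, so the platform files never drift apart.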
Security Surface Area
Every agent capability is an attack vector
Attack Vectors
Prompt injection via CLAUDE.md: If an attacker can modify the constraint document, they control the agent’s behavior.

Tool abuse: Agents with shell access can execute arbitrary commands. Agents with API access can exfiltrate data.

Supply chain: MCP servers from untrusted sources can inject malicious instructions through tool outputs.

Privilege escalation: Agents may be given broader permissions than needed for their specific task.
Security Principles
Least privilege: Give agents only the permissions they need for the current task.

Sandboxing: Run agents in isolated environments with limited file system and network access.

Audit logging: Log every tool invocation, file modification, and API call.

Constraint integrity: Protect constraint documents with the same rigor as production config. Code review all changes to CLAUDE.md.
Key insight: CLAUDE.md is a security-critical file. It controls agent behavior the same way a config file controls a server. Treat it with the same security rigor: version control, code review, access control, and change auditing.
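Least privilege and audit logging can be combined in one gateway that every tool call passes through. This is a sketch under assumed tool names and a deliberately simple permission model; a real gateway would also scope arguments, not just tool names:

```python
import datetime


class ToolGateway:
    """Route all agent tool calls through one checkpoint that enforces an
    allowlist (least privilege) and records every invocation (audit log)."""

    def __init__(self, allowed_tools: set[str]) -> None:
        self.allowed_tools = allowed_tools
        self.audit_log: list[tuple[str, str, str]] = []

    def invoke(self, tool: str, arg: str) -> str:
        timestamp = datetime.datetime.now(datetime.timezone.utc).isoformat()
        if tool not in self.allowed_tools:
            # Denials are logged too: attempted escalation is a signal.
            self.audit_log.append((timestamp, tool, "DENIED"))
            raise PermissionError(f"agent lacks permission for tool: {tool}")
        self.audit_log.append((timestamp, tool, arg))
        return f"ran {tool}({arg})"
```

Note that denied calls are still logged before the exception is raised; the audit trail should show what an agent tried to do, not only what it succeeded in doing.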
Building Rippable Harnesses
Designing for replaceability
The Principle
A rippable harness is one that can be replaced or significantly modified without rewriting the entire system. The constraint logic is separated from the platform integration. The review pipeline is modular. The orchestration layer has clean interfaces. If a better platform emerges tomorrow, you can swap it in without starting over.
Design Guidelines
Separate concerns: Constraint content, platform integration, and orchestration logic in different layers.

Standard formats: Use markdown for constraints, JSON for schemas, YAML for configuration.

Clean interfaces: Each harness component communicates through well-defined interfaces, not tight coupling.

Version everything: Constraint documents, skill files, and pipeline configs should be versioned and diffable.
Key insight: The AI platform landscape is changing rapidly. A harness built for today’s tools should be adaptable to tomorrow’s. Rippability is not just good engineering — it’s a survival strategy in a fast-moving field.
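The "clean interfaces" guideline can be made concrete with a structural type: orchestration code depends only on a small protocol, and each platform gets a thin adapter behind it. The interface below is a hypothetical minimal example, not a real vendor API:

```python
from typing import Protocol


class AgentPlatform(Protocol):
    """The only surface the orchestration layer is allowed to touch."""

    def write_constraints(self, markdown: str) -> None: ...
    def run_task(self, prompt: str) -> str: ...


class FakePlatform:
    """Stand-in adapter for testing; a real adapter would call a vendor API."""

    def __init__(self) -> None:
        self.constraints = ""

    def write_constraints(self, markdown: str) -> None:
        self.constraints = markdown

    def run_task(self, prompt: str) -> str:
        return f"[{len(self.constraints)} bytes of constraints] {prompt}"


def orchestrate(platform: AgentPlatform, constraints: str, task: str) -> str:
    """Platform-independent orchestration: works with any conforming adapter."""
    platform.write_constraints(constraints)
    return platform.run_task(task)
```

Ripping out a platform then means writing one new adapter, while the constraint content and orchestration logic stay untouched.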
Career Implications
The shelf life of “managing agents” as a career
The New Role
Harness engineering is creating a new career path: engineers who specialize in designing systems that make AI agents productive. This requires deep software engineering knowledge (to understand what good code looks like), systems thinking (to design constraint and review architectures), and AI literacy (to understand agent failure modes).
The Question
How long will this role exist? If agents become good enough not to need harnesses, the role disappears. If agents always need guidance (the more likely scenario based on current trends), harness engineering becomes a permanent discipline — like DevOps, which emerged when deployment became complex enough to need its own specialists.
Key insight: The safest bet is that harness engineering is here to stay. Just as we still need DevOps despite better deployment tools, we’ll still need harness engineers despite better models. The complexity shifts but doesn’t disappear.
Governance Frameworks
Organizational policies for agent usage
What to Govern
Agent registry: Central catalog of all deployed agents, their permissions, and owners.

Approval process: New agents require security review before deployment.

Permission tiers: Read-only, write-to-branch, merge-capable, production-access — each requiring increasing approval.

Cost budgets: Per-agent and per-team spending limits on API calls.

Audit requirements: Logging and review requirements based on agent permission level.
Lightweight Governance
Governance doesn’t have to be heavy. Start with: (1) A shared spreadsheet of all agents and their owners. (2) A Slack channel for agent-related incidents. (3) A monthly review of agent costs and error rates. (4) A simple approval process for new agents. Scale governance as agent usage scales.
Key insight: Governance should enable, not block. The goal is to make agent usage safe and efficient, not to prevent it. Teams that make governance too heavy will see developers bypass it with shadow agents — making the problem worse.
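The permission tiers above can be encoded so that a deployment check is mechanical rather than ad hoc. The approver roles per tier are illustrative assumptions; each organization would set its own:

```python
from enum import IntEnum


class Tier(IntEnum):
    """Permission tiers ordered by the approval each one requires."""
    READ_ONLY = 0
    WRITE_TO_BRANCH = 1
    MERGE_CAPABLE = 2
    PRODUCTION_ACCESS = 3


# Hypothetical sign-off requirements per tier.
REQUIRED_APPROVERS = {
    Tier.READ_ONLY: [],
    Tier.WRITE_TO_BRANCH: ["team-lead"],
    Tier.MERGE_CAPABLE: ["team-lead", "security"],
    Tier.PRODUCTION_ACCESS: ["team-lead", "security", "vp-eng"],
}


def can_deploy(tier: Tier, approvals: set[str]) -> bool:
    """An agent deploys only when every required approver has signed off."""
    return all(a in approvals for a in REQUIRED_APPROVERS[tier])
```

Keeping the read-only tier approval-free is the "enable, don't block" principle in code: low-risk agents face no friction, so developers have no incentive to go around the process.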
AutoHarness & Self-Synthesis
Agents that design their own harnesses
The Trend
Google DeepMind’s AutoHarness paper demonstrated that models can automatically generate effective harnesses for coding tasks. The agent analyzes the task, generates appropriate constraints and verification steps, and uses them to improve its own output. This suggests a future where harness generation is partially automated.
What It Means
AutoHarness doesn’t eliminate the need for harness engineers — it changes what they do. Instead of writing every constraint manually, engineers design the meta-harness: the system that generates and validates task-specific harnesses. The human role shifts from writing constraints to designing the constraint generation system.
Key insight: AutoHarness is to harness engineering what compilers were to assembly language. It automates the mechanical parts while humans focus on the architectural decisions. The discipline evolves but doesn’t disappear.
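The meta-harness idea can be sketched as two functions: one that generates task-specific constraints (here a toy keyword rule, where a real system would use a model call) and one human-designed validator that every generated harness must pass. Both the rules and the validation criteria are illustrative assumptions:

```python
def synthesize_harness(task: str) -> str:
    """Toy generator: derive task-specific constraints from the task text.
    A production meta-harness would delegate this step to a model."""
    rules = ["- Explain each change in the commit message."]
    if "database" in task.lower():
        rules.append("- Never run destructive migrations without a backup step.")
    if "api" in task.lower():
        rules.append("- Add a contract test for every changed endpoint.")
    return "## Generated constraints\n" + "\n".join(rules)


def validate_harness(harness: str) -> bool:
    """The human-designed check: a generated harness must contain a heading
    and at least one concrete rule before an agent is allowed to use it."""
    rule_lines = [line for line in harness.splitlines() if line.startswith("- ")]
    return harness.startswith("##") and len(rule_lines) >= 1
```

The division of labor matches the section's point: the generator is replaceable and partly automated, while the validator encodes the judgment the engineer is still responsible for.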
The Future of the Discipline
Where harness engineering is heading
Near-Term (2026–2027)
Standardization: Expect convergence on constraint document formats, review pipeline patterns, and orchestration interfaces. The current fragmentation (CLAUDE.md vs AGENTS.md vs Codex rules) will consolidate.

Tooling: Dedicated harness engineering tools will emerge — constraint editors, compliance dashboards, entropy monitors, and harness testing frameworks.

Education: Harness engineering will become a formal topic in software engineering curricula and professional development.
The Bigger Picture
Harness engineering is part of a broader shift: software engineering is becoming systems engineering. Engineers increasingly design systems that produce code rather than writing code directly. The skills that matter are shifting from implementation to architecture, from coding to constraint design, from debugging to observability. Harness engineering is the discipline that makes this transition work.
Key insight: Harness engineering is not a temporary workaround for imperfect AI. It’s the permanent discipline of making AI agents reliable in production. The patterns in this course — constraints, review, memory, orchestration, governance — will evolve in form but persist in function. Master them now.