Ch 8 — Risks, Governance & The Future

Agent sprawl, vendor lock-in, security, and building rippable harnesses
Agent Sprawl
73% of enterprise agents are unmonitored “shadow agents”
The Problem
Agent sprawl occurs when individual developers or teams deploy AI agents without central oversight. Research indicates that 73% of enterprise AI agents are unmonitored “shadow agents” — running without governance, logging, or security review. They have access to codebases, APIs, and sometimes production systems without anyone tracking what they do.
The Risks
Security: Shadow agents may have overly broad permissions, creating attack surfaces.

Compliance: Untracked agents may violate data handling policies or regulatory requirements.

Cost: Unmonitored agents can run up significant API bills through doom loops or inefficient usage.

Quality: Without harness engineering, shadow agents produce inconsistent, unreviewed code.
Critical in AI: Agent sprawl is the AI equivalent of shadow IT. Just as organizations learned to govern cloud usage, they must now govern agent usage. An agent registry — a central catalog of all deployed agents with their permissions, harnesses, and owners — is the first step.
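A minimal sketch of what such a registry might look like in code. The record fields and the "shadow agent" check are assumptions for illustration, not a standard schema:

```python
from dataclasses import dataclass


@dataclass
class AgentRecord:
    """One entry in a hypothetical agent registry."""
    name: str
    owner: str              # team or individual accountable for the agent
    permissions: list[str]  # e.g. ["repo:read", "ci:trigger"]
    harness: str            # path to the constraint document it runs under
    reviewed: bool = False  # has this agent passed security review?


class AgentRegistry:
    """Central catalog of deployed agents: the first step against sprawl."""

    def __init__(self) -> None:
        self._agents: dict[str, AgentRecord] = {}

    def register(self, record: AgentRecord) -> None:
        self._agents[record.name] = record

    def shadow_agents(self) -> list[str]:
        """Agents running without a completed security review."""
        return [name for name, r in self._agents.items() if not r.reviewed]
```

Even this much gives an organization the question it could not answer before: which agents exist, who owns them, and which ones have never been reviewed.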
Vendor Lock-In
Lock-in at the workflow level, not the model level
The New Lock-In
Traditional vendor lock-in was about the model: if you built on GPT-4, switching to Claude was hard. The new lock-in is at the workflow level: if your entire harness is built around Claude Code’s CLAUDE.md format, Cursor’s AGENTS.md conventions, or OpenAI’s Codex API, switching platforms means rebuilding the harness.
Mitigation
Platform-agnostic constraints: Write constraints in generic markdown, generate platform-specific files.

Abstraction layers: Build your orchestration on abstractions that can target different agent platforms.

Portable skills: Structure skill files so they can be adapted to any platform’s format.

Multi-model testing: Regularly test your harness with different models to ensure it’s not model-dependent.
Key insight: The harness is your moat, but it shouldn’t be your cage. Design harnesses that are portable across platforms. The constraint logic should be separable from the platform-specific integration.
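The first mitigation above can be sketched in a few lines: keep constraints in one generic document and generate each platform's file from it. The target filenames and the generated header are assumptions for illustration, not official formats:

```python
# Single source of truth: constraints written in generic markdown.
GENERIC_CONSTRAINTS = """\
## Coding constraints
- All changes go through a feature branch.
- Run the test suite before proposing a merge.
"""

# Hypothetical platform targets; switching tools means adding a target,
# not rewriting the harness.
TARGETS = {
    "claude-code": "CLAUDE.md",
    "cursor": "AGENTS.md",
}


def render(platform: str) -> str:
    """Wrap the generic constraints in a platform-specific file body."""
    filename = TARGETS[platform]
    header = f"<!-- generated for {platform}: do not edit {filename} by hand -->"
    return f"{header}\n{GENERIC_CONSTRAINTS}"
```

The generated-file header matters: it signals that edits belong in the generic source, so the platform files never drift apart.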
Security Surface Area
Every agent capability is an attack vector
Attack Vectors
Prompt injection via CLAUDE.md: If an attacker can modify the constraint document, they control the agent’s behavior.

Tool abuse: Agents with shell access can execute arbitrary commands. Agents with API access can exfiltrate data.

Supply chain: MCP servers from untrusted sources can inject malicious instructions through tool outputs.

Privilege escalation: Agents may be given broader permissions than needed for their specific task.
Security Principles
Least privilege: Give agents only the permissions they need for the current task.

Sandboxing: Run agents in isolated environments with limited file system and network access.

Audit logging: Log every tool invocation, file modification, and API call.

Constraint integrity: Protect constraint documents with the same rigor as production config. Code review all changes to CLAUDE.md.
Key insight: CLAUDE.md is a security-critical file. It controls agent behavior the same way a config file controls a server. Treat it with the same security rigor: version control, code review, access control, and change auditing.
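Least privilege and audit logging can be combined in one gateway that every tool call passes through. This is a sketch under assumed tool names and a deliberately simple permission model; a real gateway would also scope arguments, not just tool names:

```python
import datetime


class ToolGateway:
    """Route all agent tool calls through one checkpoint that enforces an
    allowlist (least privilege) and records every invocation (audit log)."""

    def __init__(self, allowed_tools: set[str]) -> None:
        self.allowed_tools = allowed_tools
        self.audit_log: list[tuple[str, str, str]] = []

    def invoke(self, tool: str, arg: str) -> str:
        timestamp = datetime.datetime.now(datetime.timezone.utc).isoformat()
        if tool not in self.allowed_tools:
            # Denials are logged too: attempted escalation is a signal.
            self.audit_log.append((timestamp, tool, "DENIED"))
            raise PermissionError(f"agent lacks permission for tool: {tool}")
        self.audit_log.append((timestamp, tool, arg))
        return f"ran {tool}({arg})"
```

Note that denied calls are still logged before the exception is raised; the audit trail should show what an agent tried to do, not only what it succeeded in doing.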
Building Rippable Harnesses
Designing for replaceability
The Principle
A rippable harness is one that can be replaced or significantly modified without rewriting the entire system. The constraint logic is separated from the platform integration. The review pipeline is modular. The orchestration layer has clean interfaces. If a better platform emerges tomorrow, you can swap it in without starting over.
Design Guidelines
Separate concerns: Constraint content, platform integration, and orchestration logic in different layers.

Standard formats: Use markdown for constraints, JSON for schemas, YAML for configuration.

Clean interfaces: Each harness component communicates through well-defined interfaces, not tight coupling.

Version everything: Constraint documents, skill files, and pipeline configs should be versioned and diffable.
Key insight: The AI platform landscape is changing rapidly. A harness built for today’s tools should be adaptable to tomorrow’s. Rippability is not just good engineering — it’s a survival strategy in a fast-moving field.
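The "clean interfaces" guideline can be made concrete with a structural type: orchestration code depends only on a small protocol, and each platform gets a thin adapter behind it. The interface below is a hypothetical minimal example, not a real vendor API:

```python
from typing import Protocol


class AgentPlatform(Protocol):
    """The only surface the orchestration layer is allowed to touch."""

    def write_constraints(self, markdown: str) -> None: ...
    def run_task(self, prompt: str) -> str: ...


class FakePlatform:
    """Stand-in adapter for testing; a real adapter would call a vendor API."""

    def __init__(self) -> None:
        self.constraints = ""

    def write_constraints(self, markdown: str) -> None:
        self.constraints = markdown

    def run_task(self, prompt: str) -> str:
        return f"[{len(self.constraints)} bytes of constraints] {prompt}"


def orchestrate(platform: AgentPlatform, constraints: str, task: str) -> str:
    """Platform-independent orchestration: works with any conforming adapter."""
    platform.write_constraints(constraints)
    return platform.run_task(task)
```

Ripping out a platform then means writing one new adapter, while the constraint content and orchestration logic stay untouched.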
Career Implications
The shelf life of “managing agents” as a career
The New Role
Harness engineering is creating a new career path: engineers who specialize in designing systems that make AI agents productive. This requires deep software engineering knowledge (to understand what good code looks like), systems thinking (to design constraint and review architectures), and AI literacy (to understand agent failure modes).
The Question
How long will this role exist? If agents become good enough not to need harnesses, the role disappears. If agents always need guidance (the more likely scenario based on current trends), harness engineering becomes a permanent discipline — like DevOps, which emerged when deployment became complex enough to need its own specialists.
Key insight: The safest bet is that harness engineering is here to stay. Just as we still need DevOps despite better deployment tools, we’ll still need harness engineers despite better models. The complexity shifts but doesn’t disappear.
Governance Frameworks
Organizational policies for agent usage
What to Govern
Agent registry: Central catalog of all deployed agents, their permissions, and owners.

Approval process: New agents require security review before deployment.

Permission tiers: Read-only, write-to-branch, merge-capable, production-access — each requiring increasing approval.

Cost budgets: Per-agent and per-team spending limits on API calls.

Audit requirements: Logging and review requirements based on agent permission level.
Lightweight Governance
Governance doesn’t have to be heavy. Start with: (1) A shared spreadsheet of all agents and their owners. (2) A Slack channel for agent-related incidents. (3) A monthly review of agent costs and error rates. (4) A simple approval process for new agents. Scale governance as agent usage scales.
Key insight: Governance should enable, not block. The goal is to make agent usage safe and efficient, not to prevent it. Teams that make governance too heavy will see developers bypass it with shadow agents — making the problem worse.
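The permission tiers above can be encoded so that a deployment check is mechanical rather than ad hoc. The approver roles per tier are illustrative assumptions; each organization would set its own:

```python
from enum import IntEnum


class Tier(IntEnum):
    """Permission tiers ordered by the approval each one requires."""
    READ_ONLY = 0
    WRITE_TO_BRANCH = 1
    MERGE_CAPABLE = 2
    PRODUCTION_ACCESS = 3


# Hypothetical sign-off requirements per tier.
REQUIRED_APPROVERS = {
    Tier.READ_ONLY: [],
    Tier.WRITE_TO_BRANCH: ["team-lead"],
    Tier.MERGE_CAPABLE: ["team-lead", "security"],
    Tier.PRODUCTION_ACCESS: ["team-lead", "security", "vp-eng"],
}


def can_deploy(tier: Tier, approvals: set[str]) -> bool:
    """An agent deploys only when every required approver has signed off."""
    return all(a in approvals for a in REQUIRED_APPROVERS[tier])
```

Keeping the read-only tier approval-free is the "enable, don't block" principle in code: low-risk agents face no friction, so developers have no incentive to go around the process.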
AutoHarness & Self-Synthesis
Agents that design their own harnesses
The Trend
Google DeepMind’s AutoHarness paper demonstrated that models can automatically generate effective harnesses for coding tasks. The agent analyzes the task, generates appropriate constraints and verification steps, and uses them to improve its own output. This suggests a future where harness generation is partially automated.
What It Means
AutoHarness doesn’t eliminate the need for harness engineers — it changes what they do. Instead of writing every constraint manually, engineers design the meta-harness: the system that generates and validates task-specific harnesses. The human role shifts from writing constraints to designing the constraint generation system.
Key insight: AutoHarness is to harness engineering what compilers were to assembly language. It automates the mechanical parts while humans focus on the architectural decisions. The discipline evolves but doesn’t disappear.
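The meta-harness idea can be sketched as two functions: one that generates task-specific constraints (here a toy keyword rule, where a real system would use a model call) and one human-designed validator that every generated harness must pass. Both the rules and the validation criteria are illustrative assumptions:

```python
def synthesize_harness(task: str) -> str:
    """Toy generator: derive task-specific constraints from the task text.
    A production meta-harness would delegate this step to a model."""
    rules = ["- Explain each change in the commit message."]
    if "database" in task.lower():
        rules.append("- Never run destructive migrations without a backup step.")
    if "api" in task.lower():
        rules.append("- Add a contract test for every changed endpoint.")
    return "## Generated constraints\n" + "\n".join(rules)


def validate_harness(harness: str) -> bool:
    """The human-designed check: a generated harness must contain a heading
    and at least one concrete rule before an agent is allowed to use it."""
    rule_lines = [line for line in harness.splitlines() if line.startswith("- ")]
    return harness.startswith("##") and len(rule_lines) >= 1
```

The division of labor matches the section's point: the generator is replaceable and partly automated, while the validator encodes the judgment the engineer is still responsible for.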
The Future of the Discipline
Where harness engineering is heading
Near-Term (2026–2027)
Standardization: Expect convergence on constraint document formats, review pipeline patterns, and orchestration interfaces. The current fragmentation (CLAUDE.md vs AGENTS.md vs Codex rules) will consolidate.

Tooling: Dedicated harness engineering tools will emerge — constraint editors, compliance dashboards, entropy monitors, and harness testing frameworks.

Education: Harness engineering will become a formal topic in software engineering curricula and professional development.
The Bigger Picture
Harness engineering is part of a broader shift: software engineering is becoming systems engineering. Engineers increasingly design systems that produce code rather than writing code directly. The skills that matter are shifting from implementation to architecture, from coding to constraint design, from debugging to observability. Harness engineering is the discipline that makes this transition work.
Key insight: Harness engineering is not a temporary workaround for imperfect AI. It’s the permanent discipline of making AI agents reliable in production. The patterns in this course — constraints, review, memory, orchestration, governance — will evolve in form but persist in function. Master them now.