Common Failure Modes
Architectural drift: The model introduces patterns inconsistent with the codebase’s architecture.
Dependency violations: The model imports from layers it shouldn’t access.
Style inconsistency: The model uses different naming conventions, formatting, or patterns than the existing code.
Incomplete implementation: The model implements the happy path but skips error handling, edge cases, or tests.
The Underlying Problem
Models are trained on the entire internet’s code. They know many ways to solve a problem, but they don’t know your way. Without constraints, they default to the most common patterns from their training data, which may not match your codebase’s conventions. The harness bridges this gap by encoding your specific standards.
Critical in AI: These failures are subtle. The code compiles, passes basic tests, and looks reasonable. But it introduces technical debt, breaks architectural invariants, and creates maintenance burden. Harnesses catch what tests and compilers miss.