Ch 5 — Large-Scale Migrations & Refactoring

The highest-ROI use case for autonomous agents
High Level
Pipeline: Analyze → Plan → Decompose → Execute → Verify → Merge
Why Migrations Are the Killer Use Case
Repetitive, well-defined, and nobody wants to do them
The Perfect Agent Task
Large-scale migrations are the highest-ROI application of autonomous coding agents. Why? They’re repetitive (the same pattern applied hundreds of times), well-defined (clear before/after states), verifiable (tests and type-checkers confirm correctness), and nobody wants to do them. A framework upgrade that touches 500 files is exactly the kind of work that sits in the backlog for months because no human wants to spend a week on mechanical changes.
Common Migration Types
Framework upgrades — React class components to hooks, Angular version bumps, Rails major versions.
API deprecations — replacing deprecated method calls across a codebase.
Security remediations — updating vulnerable dependencies and fixing breaking changes.
Language migrations — JavaScript to TypeScript, Python 2 to 3.
Monorepo restructuring — moving code between packages with import path updates.
Key insight: The best migration tasks for agents share three properties: the transformation pattern is consistent, the correctness is mechanically verifiable, and the task can be decomposed into independent file-level changes.
Phase 1: Analyze
Understanding the scope before touching any code
What the Agent Does
Before writing a single line of code, the agent maps the entire migration surface. It scans the codebase for every instance of the pattern that needs to change — every deprecated API call, every class component, every usage of the old import path. It classifies each instance: straightforward (mechanical transformation), complex (requires understanding context), or needs human judgment (ambiguous intent, multiple valid approaches).
The Analysis Report
The output is a migration plan: 487 files need changes. 412 are straightforward transformations. 63 require context-aware changes. 12 need human review. This report is the first human review checkpoint. The developer validates the scope, confirms the classification, and approves the plan before any code is modified. This prevents the agent from making changes to files it shouldn’t touch.
Key insight: The analysis phase is where you catch scope errors. If the agent misidentifies which files need changes, every subsequent step is wasted. Invest time reviewing the analysis report carefully.
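The analysis pass can be sketched as a scanner plus a classification heuristic. Everything here is illustrative: the deprecated pattern (calls like client.v1_get) and the classification rules are hypothetical stand-ins for whatever your migration actually targets.

```python
import re
from pathlib import Path

# Hypothetical old API: calls like `client.v1_get(...)` that must move to v2.
DEPRECATED = re.compile(r"\bclient\.v1_\w+\(")

def classify(source: str) -> str:
    """Toy heuristic: keyword arguments suggest context-dependent call
    sites; dynamic dispatch means a human should look at it."""
    if "getattr(client" in source:
        return "needs_human_review"
    if re.search(r"\bclient\.v1_\w+\([^)]*=", source):
        return "complex"
    return "straightforward"

def analyze(root: Path) -> dict[str, list[Path]]:
    """Scan every Python file and bucket the ones matching the old pattern."""
    report: dict[str, list[Path]] = {
        "straightforward": [], "complex": [], "needs_human_review": []
    }
    for path in sorted(root.rglob("*.py")):
        source = path.read_text()
        if DEPRECATED.search(source):
            report[classify(source)].append(path)
    return report
```

The counts per bucket become the analysis report the developer reviews before anything is modified.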
Phase 2: Plan & Decompose
Breaking a 500-file migration into manageable PRs
Decomposition Strategy
A single PR that changes 500 files is unreviewable. The agent decomposes the migration into logical batches: by module, by directory, by dependency order, or by complexity level. Each batch becomes a separate PR that can be reviewed and merged independently. The key constraint: each batch must leave the codebase in a working state. You should be able to merge batch 1 without batch 2 and have everything still build and pass tests.
Dependency Ordering
Some migrations have ordering constraints. If module A depends on module B, and both need changes, B must be migrated first. The agent builds a dependency graph and plans the batch order accordingly. This is where the hybrid approach shines — the dependency analysis is deterministic (AST parsing), while the actual code transformation is handled by the LLM.
Key insight: The decomposition strategy determines the review experience. Small, focused PRs (10–30 files) with clear descriptions are reviewable. Giant PRs are not. The agent should optimize for reviewability, not just correctness.
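Dependency ordering is the deterministic part, so it can be plain code. A minimal sketch using Python's stdlib graphlib, assuming the agent has already extracted a module-level dependency map (the module names and batch size are made up):

```python
from graphlib import TopologicalSorter

def plan_batches(deps: dict[str, set[str]], batch_size: int = 20) -> list[list[str]]:
    """deps maps each module to the modules it depends on.
    Dependencies come out of the topological sort first, so they get
    migrated first; the order is then chunked into review-sized batches."""
    order = list(TopologicalSorter(deps).static_order())
    return [order[i:i + batch_size] for i in range(0, len(order), batch_size)]
```

graphlib raises CycleError when modules depend on each other cyclically; those groups have to be migrated together in a single batch.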
Phase 3: Execute
Parallel agents transforming code at scale
Parallel Execution
Once the plan is approved, multiple agents execute in parallel — each handling one batch in its own isolated environment (git worktree or container). Agent 1 migrates the auth module, Agent 2 migrates the user module, Agent 3 migrates the billing module — all simultaneously. Each agent runs the relevant tests after making changes, iterates on failures, and opens a PR when its batch is complete.
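A sketch of that fan-out, assuming git worktrees for isolation. The agent invocation itself is elided; run_batch only shows the plumbing, and the branch-naming scheme is invented:

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def run_batch(batch_id: str, files: list[str]) -> str:
    """Give each batch its own worktree and branch so parallel
    agents never share a working copy."""
    worktree = f"../migrate-{batch_id}"
    subprocess.run(
        ["git", "worktree", "add", worktree, "-b", f"migrate/{batch_id}"],
        check=True,
    )
    # ...hand `worktree` and `files` to the agent, run its tests,
    # open a PR when the batch is green...
    return worktree

def run_all(batches: dict[str, list[str]], worker=run_batch) -> list[str]:
    """Fan the batches out across a thread pool, one worker per batch."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda item: worker(*item), batches.items()))
```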
The Hybrid Approach
The most effective migration tools combine AST-based codemods with LLM intelligence. Deterministic transformations (renaming imports, updating method signatures) use AST manipulation — fast, reliable, no hallucination risk. Complex transformations (restructuring logic, handling edge cases) use the LLM. This hybrid approach, used by tools like Codemod 2.0, gets the reliability of traditional codemods with the flexibility of AI for the hard cases.
Key insight: Pure LLM-based migration is slow and error-prone for mechanical changes. Pure AST-based codemods can’t handle complex cases. The hybrid approach uses each tool where it’s strongest.
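The deterministic half can be shown with the stdlib ast module. This sketch renames hypothetical client.v1_* calls to their v2 names and punts anything it cannot parse to an LLM callback; the rename rule and fallback hook are assumptions for illustration, not Codemod 2.0's actual API.

```python
import ast

class RenameV1Calls(ast.NodeTransformer):
    """Mechanical transform: strip the `v1_` prefix from attribute
    names by pure AST rewriting. No hallucination risk."""
    def visit_Attribute(self, node: ast.Attribute) -> ast.Attribute:
        self.generic_visit(node)
        if node.attr.startswith("v1_"):
            node.attr = node.attr.removeprefix("v1_")
        return node

def migrate_source(source: str, llm_fallback=None) -> str:
    try:
        tree = ast.parse(source)
    except SyntaxError:
        # Not handleable deterministically: route to the LLM side.
        return llm_fallback(source) if llm_fallback else source
    return ast.unparse(RenameV1Calls().visit(tree))
```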
Multi-Agent Refactoring
Specialized agents with different roles working together
The Three-Agent Pattern
Research on multi-agent refactoring (Jan 2026) identifies three specialized roles: (1) Scope Inference Agent — transforms the developer’s intent (“rename UserService to AccountService everywhere”) into an explicit refactoring scope (every file, import, reference, test, and documentation mention). (2) Planned Execution Agent — identifies all program elements needing changes and executes transformations via trusted APIs. (3) Replication Agent — handles project-wide search and ensures consistency.
Why Specialization Works
A single generalist agent trying to handle scope analysis, code transformation, and consistency checking tends to lose context and make errors. Specialized agents are more reliable because each one has a narrower task with clearer success criteria. The scope agent doesn’t need to write code; the execution agent doesn’t need to understand intent. This separation of concerns mirrors how human teams handle large refactors.
Key insight: Multi-agent refactoring isn’t about having more agents — it’s about having the right specializations. Three focused agents outperform one generalist agent on complex refactoring tasks.
Verification & Quality Gates
How to know the migration actually worked
The Verification Stack
Every migration batch must pass a verification stack before it’s ready for human review: (1) Type checking — does the code compile? Are there new type errors? (2) Test suite — do all existing tests pass? (3) Lint/format — does the code follow project conventions? (4) Behavioral equivalence — for refactors that shouldn’t change behavior, do the outputs match? (5) Migration-specific checks — are there any remaining instances of the old pattern?
The Completeness Check
The most important migration-specific check: are there any remaining instances of the old pattern? If you’re migrating from deprecated API v1 to v2, the agent should verify that zero calls to v1 remain after all batches are merged. This is a simple grep-level check, but it’s the one that catches the files the agent missed or the edge cases it couldn’t handle.
Key insight: The verification stack is what makes agent-driven migrations trustworthy. Without it, you’re just hoping the agent got everything right. With it, you have mechanical proof that the migration is complete and correct.
When Migrations Go Wrong
Common failure modes and how to prevent them
Failure Mode 1: Scope Creep
The agent starts migrating files that weren’t in the plan, or makes “improvements” beyond the migration scope. Prevention: strict file lists in the plan, and a verification step that checks the agent only modified files in its assigned batch.
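That verification step is a set difference over git diff. A minimal sketch; the base branch name and helper names are assumptions:

```python
import subprocess

def changed_files(base: str = "main") -> set[str]:
    """Files the agent's branch touched relative to the base branch."""
    out = subprocess.run(
        ["git", "diff", "--name-only", base],
        capture_output=True, text=True, check=True,
    ).stdout
    return {line for line in out.splitlines() if line}

def scope_violations(changed: set[str], allowed: set[str]) -> set[str]:
    """Anything modified outside the batch's assigned file list
    fails the gate and blocks the PR."""
    return changed - allowed
```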
Failure Mode 2: Semantic Drift
The agent changes the code structure correctly but subtly alters behavior. A function that used to return null now returns undefined. Tests pass because they don’t check for this distinction. Prevention: behavioral equivalence tests and mutation testing on the migrated code.
Failure Mode 3: Merge Conflicts
Parallel agents modify files that overlap, creating merge conflicts when PRs are combined. Prevention: the decomposition phase must ensure batches have non-overlapping file sets. When overlap is unavoidable, those files go into a sequential batch.
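Disjointness is cheap to verify at planning time, before any agent runs. A sketch, with invented batch names:

```python
from itertools import combinations

def overlapping_batches(batches: dict[str, set[str]]) -> list[tuple[str, str, set[str]]]:
    """Return every pair of batches that shares files; shared files
    belong in a sequential batch, not two parallel ones."""
    return [
        (a, b, batches[a] & batches[b])
        for a, b in combinations(batches, 2)
        if batches[a] & batches[b]
    ]
```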
Key insight: Most migration failures happen in the planning phase, not the execution phase. A well-planned migration with clear scope, non-overlapping batches, and strong verification gates rarely fails. A poorly planned one fails regardless of how good the agent is.
Your First Agent-Driven Migration
Start small, learn the pattern, then scale
Pick the Right First Migration
Choose something with these properties: mechanical transformation (not judgment-heavy), good test coverage (you’ll know if the agent breaks something), limited scope (50–100 files, not 5,000), and low business risk (not the payment processing module). Dependency version bumps and deprecated API replacements are ideal first migrations.
The Learning Outcome
Your first agent-driven migration teaches you: how to write effective migration specs, how long the analyze-plan-execute cycle takes, what your review process looks like for agent-generated PRs, and where the agent struggles with your specific codebase. This knowledge is more valuable than the migration itself — it’s the foundation for scaling to larger migrations.
Key insight: The three-phase pattern (analyze, plan, execute) works for any migration size. Start with a small one to learn the pattern, then apply it to the 500-file migration that’s been sitting in your backlog for six months.