Ch 4 — The AI Coding Landscape

IDE plugins, CLI agents, cloud platforms — a tool-agnostic map of the ecosystem
Four Categories of AI Coding Tools
Not all tools work the same way — understanding the categories
1. Assistants (IDE Plugins)
Run inside your existing editor as extensions. They provide inline completions (ghost text), chat panels, and quick fixes. Low friction — install and go. Best for small edits and daily coding. Limited on complex multi-file work because they operate within the editor’s constraints.
2. Agentic IDEs
Full editors rebuilt around AI. They fork or replace your IDE, adding deep agent integration: multi-file editing, codebase-wide search, terminal control, and autonomous task execution. More powerful than plugins, but require switching editors. Best for heavy refactoring and feature implementation.
3. CLI / Terminal Agents
Run in your terminal with no IDE dependency. They read/write files, execute commands, and iterate through natural language conversation. Composable with Unix toolchains. Best for developers who live in the terminal and want maximum autonomy without IDE lock-in.
4. Cloud / Autonomous Agents
Run in isolated cloud environments. They receive a task, spin up a sandboxed workspace, and work autonomously — committing code, running tests, and creating PRs. You review the output, not the process. Best for background tasks, bug fixes, and delegated work.
Key insight: These categories aren’t mutually exclusive. Many developers use an assistant for daily completions, an agentic IDE for feature work, and a CLI agent for terminal-heavy tasks. The best setup is often a combination.
IDE-Native Tools
Plugins and AI-first editors that live where you code
The Plugin Approach
Tools like GitHub Copilot run as extensions in VS Code, JetBrains, and other editors. They add AI capabilities without changing your editor. The advantage: zero learning curve, works with your existing setup. The limitation: constrained by the editor’s extension API — can’t deeply control the editing experience.
The Fork Approach
Tools like Cursor and Windsurf fork VS Code and rebuild the editing experience around AI. They can do things plugins can’t: custom diff views, multi-file inline edits, agent-controlled terminal, and speculative decoding for faster completions. The tradeoff: you switch editors, and some VS Code extensions may not work.
Key Capabilities
What IDE-native tools provide:

• Tab completion: ghost text as you type
• Chat panel: ask questions, get code
• Inline edit: select code, describe the change
• Multi-file edit: agent edits across files
• Terminal: AI runs commands for you
• Codebase search: semantic search over the project
• Rules/config: persistent AI instructions

Pricing range: $10–20/month. Best for: the daily coding workflow.
Key insight: The plugin vs. fork debate is really about depth vs. compatibility. Plugins work everywhere but are shallow. Forks are deep but lock you in. As AI becomes more central to coding, the fork approach is winning — developers are willing to switch editors for better AI.
CLI & Terminal Agents
AI that lives in your terminal, not your editor
Why the Terminal?
CLI agents operate where code actually runs — the terminal. They can read files, write files, execute shell commands, run tests, and iterate — all through natural language conversation. No GUI overhead, no editor lock-in. They compose naturally with git, make, docker, npm, and every other CLI tool.
How They Work
You start a session, describe what you want, and the agent works autonomously: reading your codebase, making changes, running tests, fixing failures, and committing when done. Most support a permission system — you approve file edits and command execution, or run in auto-approve mode for trusted tasks.
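The permission flow described above can be sketched as a simple approval gate. This is a conceptual illustration only, not any specific tool's API; `approve`, `run_agent_action`, and the action string are invented for this example:

```python
def approve(action: str, auto_approve: bool = False) -> bool:
    """Gate an agent-proposed action behind human approval.

    In auto-approve mode (trusted tasks) every action passes;
    otherwise the user is prompted on stdin (y/N).
    """
    if auto_approve:
        return True
    answer = input(f"Agent wants to: {action}. Allow? [y/N] ")
    return answer.strip().lower() == "y"


def run_agent_action(action: str, execute, auto_approve: bool = False):
    """Execute an agent-proposed action only if it is approved."""
    if approve(action, auto_approve):
        return execute()
    return None  # action was rejected, nothing ran


# In auto-approve mode the action runs without prompting:
result = run_agent_action(
    "write tests/test_auth.py",
    execute=lambda: "file written",
    auto_approve=True,
)
```

Real CLI agents layer more nuance on top (per-command allowlists, read-only modes), but the core loop is the same: propose, approve, execute, repeat.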
The CLI Advantage
Token efficiency is a major differentiator. CLI agents send only the relevant code context, not the entire editor state. Some CLI agents are reported to be 5.5x more token-efficient than IDE agents for complex refactoring tasks, translating to lower costs and faster responses.
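The cost impact of that efficiency gap is easy to estimate. The 5.5x ratio comes from the claim above; the token counts and the per-token price are made-up round numbers, not any vendor's actual rates:

```python
# Illustrative cost comparison for one complex refactoring task.
# The 5.5x efficiency ratio is from the text; prices are placeholders.
PRICE_PER_1K_TOKENS = 0.01  # hypothetical blended $/1K tokens

ide_agent_tokens = 550_000                   # editor state + broad context
cli_agent_tokens = ide_agent_tokens / 5.5    # only the relevant code context

ide_cost = ide_agent_tokens / 1000 * PRICE_PER_1K_TOKENS
cli_cost = cli_agent_tokens / 1000 * PRICE_PER_1K_TOKENS

print(f"IDE agent: ${ide_cost:.2f} vs CLI agent: ${cli_cost:.2f}")
# At the same per-token price, the CLI run costs 1/5.5 of the IDE run.
```

The absolute numbers are invented, but the shape of the result holds: at identical per-token pricing, cost scales directly with how much context the tool sends.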
Open-Source Options
The CLI space has strong open-source representation. Tools like Aider (git-native, multi-model) are fully free and open-source, supporting any LLM provider. This means you can run AI coding with your own API keys, choosing between Claude, GPT, Gemini, or local models — no vendor lock-in.
Key insight: CLI agents are the fastest-growing category because they match how senior developers already work — in the terminal, with git, running commands. If you’re comfortable in the terminal, CLI agents often feel more natural than IDE-based AI.
Cloud & Autonomous Agents
AI that works in the background while you do other things
The Autonomous Model
Cloud agents receive a task (a GitHub issue, a Jira ticket, a natural language description) and work entirely independently. They spin up isolated environments, clone your repo, make changes, run tests, and create pull requests. You review the PR — you don’t supervise the process.
When They Shine
Cloud agents excel at well-defined, bounded tasks: fixing a bug with a clear reproduction, adding a test for an uncovered function, updating documentation, or migrating a dependency. They struggle with ambiguous requirements, architectural decisions, and tasks that need human judgment mid-process.
The Cost Spectrum
Cloud agents vary wildly in price. Some are usage-based (pay per task/token), while fully autonomous agents like Devin charge $500/month. The economics work when the agent handles tasks that would take a developer hours — but the cost adds up quickly for tasks it gets wrong and needs human correction.
Rule of thumb: Cloud agents are best for tasks you’d assign to a junior developer with clear instructions. If you’d need to pair-program with a junior to get it right, the cloud agent will likely need multiple iterations too — and each iteration costs tokens.
Open Source & Self-Hosted
Running AI coding on your own infrastructure
Why Self-Host?
Some organizations can’t send code to external APIs — regulated industries, defense contractors, companies with proprietary algorithms. Self-hosted tools run the AI model on your own servers, keeping all code within your network. No data leaves your infrastructure.
Key Open-Source Tools
Tabby (33K+ GitHub stars) is a self-hosted coding assistant that runs on consumer GPUs with no external dependencies. Continue.dev is an open-source IDE extension supporting any model provider. Aider is a CLI agent supporting any LLM. All three let you bring your own model — including local models via Ollama.
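Wiring a tool to a local model is mostly a matter of pointing it at a local HTTP endpoint. As a sketch, assuming an Ollama server running on its default port (11434) with its `/api/generate` endpoint, a completion request can be built like this; the model name and prompt are placeholders:

```python
import json
from urllib import request


def ollama_request(prompt: str, model: str = "codellama") -> request.Request:
    """Build a completion request for a locally running Ollama server.

    Assumes Ollama's default REST endpoint (localhost:11434,
    POST /api/generate); adjust the host and model for your setup.
    """
    payload = {"model": model, "prompt": prompt, "stream": False}
    return request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = ollama_request("Write a Python function that reverses a string.")
# urllib.request.urlopen(req) would return the completion as JSON.
# The request targets localhost only: no code leaves your machine.
```

That last comment is the whole point of self-hosting: the model, the prompt, and the completion all stay inside your network.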
The Tradeoff
Self-hosted tools give you full control and privacy, but the models you can run locally (7B–70B parameters) are significantly less capable than frontier cloud models (hundreds of billions of parameters). The gap is closing — but in 2026, cloud models still produce notably better code for complex tasks.
The connection: The open-source ecosystem connects directly to Ch 3 (Training Code Models). Models like StarCoder2, DeepSeek-Coder, and CodeLlama are open-weight, meaning you can download and run them locally. The training pipeline is transparent, so you know exactly what data your model learned from.
How to Evaluate: The Decision Framework
Matching tools to your workflow, not chasing benchmarks
Step 1: Know Your Workflow Style
Inline developer? You prefer suggestions while typing, accepting/rejecting with Tab. → IDE tools with strong completion.

Conversation developer? You describe changes in natural language and review diffs. → Chat-focused tools.

Delegation developer? You describe goals and let AI figure out implementation. → Autonomous agents (CLI or cloud).
Step 2: Match to Project Type
Greenfield projects → Autonomous agents (scaffolding)
Bug fixes / maintenance → IDE tools (quick, targeted)
Large refactoring → Agentic IDE or CLI agent (multi-file)
Learning new codebases → Tools with strong context/search
Step 3: Check Practical Constraints
Team standardization — Does it integrate with your CI/CD and code review?
Security & compliance — SOC 2, data retention, network controls?
Cost model — Fixed subscription vs. usage-based? Calculate at your team size.
Model flexibility — Locked to one LLM, or can you choose?
Context window — How much of your codebase can it see at once?
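The "calculate at your team size" advice from the checklist is worth doing explicitly, because fixed and usage-based pricing cross over as teams grow. A minimal sketch with hypothetical placeholder numbers (plug in real quotes):

```python
# Fixed subscription vs usage-based pricing at different team sizes.
# All prices and usage figures are hypothetical placeholders.
def monthly_cost_fixed(team_size: int, seat_price: float = 20.0) -> float:
    """Per-seat subscription: cost scales with headcount."""
    return team_size * seat_price


def monthly_cost_usage(team_size: int,
                       tasks_per_dev: int = 40,
                       cost_per_task: float = 0.75) -> float:
    """Usage-based: cost scales with how much the team actually uses it."""
    return team_size * tasks_per_dev * cost_per_task


for team in (5, 20, 100):
    fixed = monthly_cost_fixed(team)
    usage = monthly_cost_usage(team)
    print(f"{team:>3} devs: fixed ${fixed:,.0f}/mo vs usage ${usage:,.0f}/mo")
```

With these placeholder numbers the fixed plan always wins, but flip the usage assumptions (light users, part-time contributors) and the comparison reverses, which is exactly why the calculation belongs in your evaluation.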
Critical in AI: Don’t choose based on benchmarks alone. HumanEval scores don’t predict performance on your codebase with your conventions. The only reliable evaluation is testing the tool on your actual project for a week.
Measuring What Matters
Real-world metrics that actually predict value
Acceptance Rate
The percentage of AI suggestions you accept. A healthy range is 60–80% for inline completions. Below 50% means the tool creates more review work than it saves. Track this over a week of real usage — not a demo session.
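Tracking this over a week is a one-liner once you log the counts. The daily (shown, accepted) pairs below are made-up sample data:

```python
# Acceptance rate over a week of real usage.
# Each entry: (suggestions shown, suggestions accepted) per day.
# The numbers are made-up sample data.
week = [(120, 84), (95, 70), (140, 98), (110, 60), (130, 91)]

shown = sum(s for s, _ in week)
accepted = sum(a for _, a in week)
acceptance_rate = accepted / shown

print(f"Acceptance rate: {acceptance_rate:.0%}")
if acceptance_rate < 0.50:
    print("Below 50%: the tool creates more review work than it saves.")
elif acceptance_rate < 0.60:
    print("Borderline: look at which suggestion types get rejected.")
else:
    print("Healthy range for inline completions (60-80%).")
```

Aggregating over the week rather than per day matters: a single demo-friendly day can sit at 90% while the weekly number tells the real story.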
Time to First Useful Output
How quickly does the tool produce something you can actually use? For completions, this should be under 200ms. For agent tasks, measure how long before you have a reviewable diff. If you spend more time prompting and waiting than you’d spend coding, the tool isn’t helping.
Iteration Count
For agent tasks, count how many back-and-forth cycles it takes to get correct output. 1–2 iterations is excellent. 3–4 is acceptable. 5+ means you should have written it yourself. This metric reveals whether the tool truly understands your codebase or is guessing.
Defect Introduction Rate
Track bugs introduced by AI-generated code that make it past review. If AI code has a higher defect rate than your team’s manual code, you need better review practices or a different tool. The goal: AI code should be at least as reliable as human code.
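The comparison is a simple rate calculation once you tag merged changes as AI-assisted or manual in your bug tracker. The counts here are illustrative:

```python
# Compare post-review defect rates of AI-assisted vs manual changes.
# Counts are illustrative; pull real numbers from your bug tracker.
def defect_rate(bugs: int, changes: int) -> float:
    """Bugs that escaped review, per merged change."""
    return bugs / changes


ai_rate = defect_rate(bugs=6, changes=200)       # AI-generated changes
manual_rate = defect_rate(bugs=5, changes=180)   # hand-written changes

print(f"AI: {ai_rate:.1%} vs manual: {manual_rate:.1%}")
if ai_rate > manual_rate:
    print("AI code is less reliable: tighten review or switch tools.")
else:
    print("AI code is at least as reliable as human code.")
```

Normalizing per merged change (rather than per line) keeps the comparison fair when AI-assisted diffs tend to be larger than manual ones.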
Why it matters: These four metrics — acceptance rate, time to output, iteration count, defect rate — tell you more about a tool’s real value than any benchmark. Measure them on your actual codebase, with your actual team.
The Landscape Is Moving Fast
What stays constant in a rapidly changing market
What Changes Every 6 Months
Specific tools rise and fall. New models launch monthly. Pricing changes. Features get copied across tools within weeks. The tool you choose today may not be the best choice in six months. This is why this course is tool-agnostic — the concepts outlast any specific product.
What Stays Constant
Context engineering — feeding AI the right information (Ch 7)
Prompt structure — describing intent precisely (Ch 8)
Review discipline — validating AI output (Ch 13)
Security awareness — catching AI-introduced vulnerabilities (Ch 12)
The agent loop — plan/execute/verify cycle (Ch 6)

These skills transfer across every tool in every category.
The Multi-Tool Reality
The most productive developers in 2026 use 2–3 tools in combination:

• An IDE tool for daily completions and quick edits
• A CLI or agentic tool for complex multi-file tasks
• A code review tool for automated PR analysis

The skill isn’t mastering one tool — it’s knowing which tool to reach for based on the task.
Key insight: The best AI coding tool is the one that fits your workflow, your team’s constraints, and your project’s needs. Test on your real codebase. Measure real metrics. And be ready to switch when something better comes along — because it will.