Ch 1 — What Is Context Engineering?

The paradigm shift from prompt engineering to context engineering
The Prompt Engineering Era
2022–mid 2025: crafting the perfect instruction
What It Was
Prompt engineering was the practice of manually crafting specific text instructions for individual LLM interactions. Techniques like chain-of-thought, few-shot examples, and role-based system prompts dominated the field from 2022 through mid-2025. The focus was entirely on how you phrase the question to the model.
The Limitation
In production systems, the user prompt is a tiny fraction of what the model actually sees. 80–90% of the context window is filled by retrieved documents, conversation history, tool definitions, and system instructions. Optimizing only the prompt is like tuning the radio while ignoring the engine.
Key insight: Prompt engineering addresses only one of eight components that enter an LLM’s context window. The other seven — system prompt, history, RAG docs, tool schemas, few-shot examples, memory, and metadata — are where production quality is won or lost.
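As a concrete illustration, assembling those eight components can be treated as a priority-ordered packing problem under a token budget. A minimal sketch: the priority order, token counts, and budget below are hypothetical, not a prescribed scheme.

```python
# Sketch of runtime context assembly: pack components in priority order
# until the token budget runs out. All names and numbers are illustrative.

PRIORITY = {  # lower number = packed first; the user prompt is one of eight inputs
    "system_prompt": 0, "tool_schemas": 1, "user_prompt": 2,
    "few_shot": 3, "memory": 4, "rag_docs": 5,
    "history": 6, "metadata": 7,
}

def assemble_context(components, budget=8000):
    """Pack (name, text, tokens) tuples into the window until the budget is spent."""
    window, used = [], 0
    for name, text, tokens in sorted(components, key=lambda c: PRIORITY[c[0]]):
        if used + tokens <= budget:
            window.append((name, text))
            used += tokens
    return window, used

components = [
    ("system_prompt", "You are a support agent.", 200),
    ("user_prompt", "Why was I charged twice?", 30),
    ("rag_docs", "[billing policy excerpt]", 3000),
    ("history", "[last 5 turns]", 2500),
    ("metadata", "[user tier: premium]", 50),
]
window, used = assemble_context(components, budget=4000)
names = [n for n, _ in window]  # history is dropped: it would blow the budget
```

Note how the user prompt accounts for 30 of 3,280 packed tokens here, mirroring the 80–90% figure above.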
The Naming Moment
Mid-2025: Karpathy and Lütke reframe the discipline
Andrej Karpathy
In mid-2025, former Tesla and OpenAI researcher Andrej Karpathy publicly described context engineering as “the delicate art and science of filling the context window with just the right information for the next step.” He argued that the real skill isn’t writing prompts — it’s curating the entire information environment the model receives.
Tobi Lütke
Shopify CEO Tobi Lütke independently endorsed the same shift, calling context engineering a “core skill” for anyone building AI products. The convergence of these two voices — one from deep research, one from enterprise product leadership — signaled that the industry was moving beyond prompt craft.
Key insight: When both the research community and the business community independently arrive at the same conclusion, it usually signals a genuine paradigm shift rather than a passing trend.
Foundational Publications
Manus and Anthropic lay the groundwork
Manus (July 2025)
Manus published lessons from rebuilding their agent framework four times. Key findings: don’t dynamically add or remove tools mid-iteration (it invalidates the KV-cache), keep recent tool calls in raw format to preserve the model’s “rhythm,” and never compress away error traces.
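The "don't add or remove tools mid-iteration" rule is often satisfied by keeping the serialized tool list stable and masking availability per step instead. A minimal sketch, with illustrative tool names, states, and masking mechanism:

```python
# Sketch: keep tool definitions stable (preserving the KV-cache prefix)
# and mask which tools are callable per step, instead of editing the list.
# Tool names and the state->mask mapping are illustrative.

TOOLS = [  # serialized once at the front of context; never reordered or edited mid-run
    {"name": "browser_open"},
    {"name": "shell_exec"},
    {"name": "file_write"},
]

def allowed_tools(state):
    """Return names from the *same* tool list every step; only the mask changes."""
    mask = {
        "browsing": {"browser_open"},
        "editing": {"shell_exec", "file_write"},
    }[state]
    return [t["name"] for t in TOOLS if t["name"] in mask]

step_tools = allowed_tools("browsing")
```

Because the tool definitions at the front of the context never change, the KV-cache prefix stays valid across iterations; only which tools the model may call is constrained at each step.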
Anthropic (September 2025)
Anthropic followed with their guide on effective context engineering for agents. Their core principle: find “the smallest possible set of high-signal tokens that maximise the likelihood of desired outcomes.” The guide covered system instructions, tool definitions, MCP resources, retrieved documents, and conversation history.
Why it matters: These two publications became the de facto reference material for the field. The patterns they described — progressive disclosure, compression, routing — were adopted across platforms within months.
The Formal Definition
What context engineering actually means
Definition
Context engineering is the practice of deciding what information an AI model sees, when it sees it, and how it is structured — at runtime. It covers everything that enters the context window: system instructions, user prompts, conversation history, retrieved documents, tool definitions, few-shot examples, memory stores, and metadata.
The connection: Prompt engineering tells the model how to talk. Context engineering controls what it sees when it talks. The distinction matters because performance gains in 2026 come from dynamic context selection, compression, and memory management — not from clever prompt wording.
Prompt vs Context
Prompt Engineering
Manual, one-off instruction craft. Focuses on the user prompt. Static per interaction. Doesn’t scale to multi-turn agent systems.
Context Engineering
Automated, systematic infrastructure. Manages all eight context components. Dynamic at runtime. Designed for production agent pipelines.
Why LLMs Need Context Engineering
Finite attention budgets and the lost-in-the-middle problem
Finite Attention
LLMs have a finite attention budget. Every token in the context window competes for attention. As context grows, precision drops, reasoning weakens, and the model starts missing information it should catch. Research calls this the “lost in the middle” problem — models show U-shaped performance curves, favoring content at the beginning and end while struggling with information in the middle.
Real-World Degradation
Despite advertised context windows of 128K to 2M+ tokens, real-world performance degrades 30–40% before hitting the technical limit. Systematic context management can prevent 30% of this information loss. The paradox: more context often means worse answers, because irrelevant tokens dilute the model’s attention on what actually matters.
Example: A refund-policy question that dumps 50 pages of documents from 2018 to 2026 into the context will confuse the model with contradictory policies. Adding more documents makes the response worse, not better. This is a context problem, not a prompt problem.
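That failure mode can be addressed before the model is ever called, by filtering stale documents out of the candidate set. A minimal sketch, with hypothetical document fields, IDs, and dates:

```python
# Sketch: keep only the policies in force at the effective date, newest
# first, and cap the count. Field names and dates are invented.
from datetime import date

def select_policy_docs(docs, effective=date(2026, 1, 1), max_docs=3):
    """Drop policies not in force at `effective`; return newest first."""
    current = [d for d in docs
               if d["valid_from"] <= effective
               and (d["valid_to"] is None or d["valid_to"] >= effective)]
    current.sort(key=lambda d: d["valid_from"], reverse=True)
    return current[:max_docs]

docs = [
    {"id": "refund-2018", "valid_from": date(2018, 1, 1), "valid_to": date(2021, 12, 31)},
    {"id": "refund-2022", "valid_from": date(2022, 1, 1), "valid_to": date(2025, 6, 30)},
    {"id": "refund-2025", "valid_from": date(2025, 7, 1), "valid_to": None},
]
selected = select_policy_docs(docs)
ids = [d["id"] for d in selected]  # only the policy currently in force survives
```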
The Five Core Pillars
The building blocks of context engineering
Pillar 1 — Retrieval
Optimizing document selection through chunking strategy, embedding choice, and reranking. Evolved from fixed RAG pipelines to agent-controlled retrieval loops (Agentic RAG) that can reformulate queries and iterate until confident.
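An agent-controlled retrieval loop of the kind described can be sketched as follows; `search`, `grade`, and `reformulate` are placeholders for a real retriever, relevance judge, and query rewriter:

```python
# Sketch of Agentic RAG: retrieve, judge relevance, reformulate the
# query, and repeat until confident or out of attempts. The 0.8
# confidence threshold is arbitrary.

def agentic_retrieve(question, search, grade, reformulate, max_rounds=3):
    query = question
    for _ in range(max_rounds):
        docs = search(query)
        if grade(question, docs) >= 0.8:  # confident enough to answer
            return docs
        query = reformulate(question, docs)  # try a different phrasing
    return docs  # best effort after max_rounds

# Demo with trivial fakes: the retriever echoes the query, the grader
# only accepts documents mentioning "billing".
docs_found = agentic_retrieve(
    "refund",
    search=lambda q: [q],
    grade=lambda q, d: 1.0 if "billing" in d[0] else 0.0,
    reformulate=lambda q, d: q + " billing",
)
```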
Pillar 2 — Memory
Persisting information across conversations through short-term buffers (working memory for immediate reasoning) and long-term stores (vector-based archival memory). Enables agents to accumulate knowledge across sessions.
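A two-tier memory can be sketched as a bounded working buffer plus an append-only archive. Real systems recall by embedding similarity; the keyword overlap here is a stand-in:

```python
# Sketch of two-tier memory: a fixed-size working buffer of recent turns
# plus an append-only long-term store searched by naive word overlap.
from collections import deque

class Memory:
    def __init__(self, buffer_size=5):
        self.working = deque(maxlen=buffer_size)  # recent turns, kept raw
        self.archive = []                         # everything, across sessions

    def add(self, turn):
        self.working.append(turn)
        self.archive.append(turn)

    def recall(self, query, k=2):
        """Return the k archived turns sharing the most words with the query."""
        q = set(query.lower().split())
        scored = sorted(self.archive,
                        key=lambda t: len(q & set(t.lower().split())),
                        reverse=True)
        return scored[:k]

# Demo with invented turns: the old preference is evicted from the
# working buffer but still recoverable from the archive.
mem = Memory(buffer_size=2)
for turn in ["user prefers the color blue",
             "weather in Berlin is rainy",
             "order 4512 has shipped"]:
    mem.add(turn)
recalled = mem.recall("favorite color", k=1)
```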
Pillar 3 — State Management
Tracking agent workflow progress with explicit state machines. Prevents agents from losing track of multi-step tasks and enables recovery from failures without restarting entire workflows.
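An explicit workflow state machine can be as simple as an enumerated transition table; the states and transitions below are illustrative:

```python
# Sketch of an explicit state machine: legal transitions are enumerated,
# so an agent can't silently skip or lose a step. States are invented.

TRANSITIONS = {
    "start":    {"retrieve"},
    "retrieve": {"draft", "retrieve"},  # retrieval may loop
    "draft":    {"review"},
    "review":   {"done", "draft"},      # reviewer can bounce back to drafting
}

def advance(state, next_state):
    """Move to next_state, rejecting anything the table doesn't allow."""
    if next_state not in TRANSITIONS.get(state, set()):
        raise ValueError(f"illegal transition {state} -> {next_state}")
    return next_state

state = advance("start", "retrieve")
```

Because the current state is explicit and recorded, a crashed run can resume from its last state rather than restarting the whole workflow.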
Pillar 4 — Context Compression
Fitting useful information into fixed windows through summarization and selective inclusion. Sliding window hybrids keep recent turns raw while compressing older context. Companies report 60–80% cost reduction.
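The sliding-window hybrid can be sketched in a few lines: recent turns stay raw, older turns collapse into a summary. Here `summarize` stands in for an LLM summarization call:

```python
# Sketch of sliding-window compression: keep the last `keep_raw` turns
# verbatim, replace everything older with a single summary entry.

def compress_history(turns, keep_raw=4, summarize=None):
    if len(turns) <= keep_raw:
        return turns
    old, recent = turns[:-keep_raw], turns[-keep_raw:]
    # Placeholder summarizer; a real one would call an LLM.
    summary = (summarize or (lambda ts: f"[summary of {len(ts)} earlier turns]"))(old)
    return [summary] + recent

compressed = compress_history([f"turn {i}" for i in range(7)], keep_raw=4)
```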
Pillar 5 — Information Routing
Directing appropriate context to different model calls in multi-agent systems. A billing question doesn’t need the onboarding knowledge base. Routing classifies the query and selects the right context source before anything enters the window.
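A routing layer can be sketched as a classifier plus a route table. The keyword rules below stand in for a real intent classifier, and the source names are hypothetical:

```python
# Sketch of information routing: classify the query, then load only the
# matching context source instead of everything at once.

ROUTES = {
    "billing":    ["billing_kb"],
    "onboarding": ["onboarding_kb"],
    "other":      ["general_faq"],
}

def route(query):
    q = query.lower()
    if any(w in q for w in ("refund", "charge", "invoice")):
        return "billing"
    if any(w in q for w in ("setup", "install", "getting started")):
        return "onboarding"
    return "other"

def context_sources(query):
    """Only the routed source enters the context window."""
    return ROUTES[route(query)]

sources = context_sources("Why was I charged twice?")
```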
Key insight: These five pillars are not alternatives — they layer together. Progressive disclosure defines what can enter the window, routing and compression manage what stays during execution, and retrieval brings in external knowledge on demand.
Research Validation
Academic evidence for context engineering’s impact
Stanford / SambaNova / UC Berkeley
Joint research demonstrated that context editing delivered a 10.6% performance improvement on agentic tasks with 86.9% lower latency compared to fine-tuning. This means you can often get better results by engineering the context than by retraining the model itself.
Why it matters: Fine-tuning is expensive, slow, and requires specialized infrastructure. Context engineering is fast, iterative, and can be deployed immediately. The research shows it’s also more effective for many agentic use cases.
Enterprise Impact
Companies implementing effective context management report 35–60% accuracy improvements in enterprise AI systems. A fintech startup reduced document analysis costs from $30,600 to $4,100 monthly (87% reduction) through token budgeting — extracting relevant sections via RAG, compressing context, and caching system prompts.
Cost Economics
At GPT-4o pricing of $2.50 per million input tokens, a 128K context window costs $0.32 per request. Running 10,000 requests daily means $96,000/month. Context engineering’s cost reduction techniques (caching, compression, selective inclusion) are not optional at scale — they’re survival.
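The arithmetic in that paragraph is worth making explicit:

```python
# Cost arithmetic from the paragraph above: a full 128K window at
# $2.50 per million input tokens, 10,000 requests/day, 30-day month.
PRICE_PER_M_INPUT = 2.50    # USD per million input tokens
WINDOW = 128_000            # tokens per request
REQUESTS_PER_DAY = 10_000

cost_per_request = WINDOW / 1_000_000 * PRICE_PER_M_INPUT   # $0.32
cost_per_month = cost_per_request * REQUESTS_PER_DAY * 30   # $96,000
```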
The State of the Art in 2026
From niche concern to core discipline
Industry Adoption
Context engineering has gone from a niche concern to the core discipline of AI engineering in under a year. The patterns described by Manus and Anthropic in mid-2025 have been adopted across every major platform. Agent Skills (markdown files with YAML frontmatter) were released by Anthropic in December 2025 and adopted by OpenAI, Google, GitHub, and Cursor within weeks.
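For readers who haven't seen one, a skill file of the kind described is a markdown body under a YAML frontmatter header. The field values below are invented for illustration:

```markdown
---
name: billing-refunds
description: How to answer refund and chargeback questions using the current billing policy.
---

# Billing refunds

When the user asks about refunds, load only the policy currently in
force and cite its effective date in the answer.
```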
Key insight: Context engineering is not a replacement for prompt engineering — it’s the next evolution. Both remain relevant, but the shift is from manual, one-off prompt crafting to systematic, scalable infrastructure for consistent AI performance.
What’s Next
The field is converging on layered architectures where progressive disclosure and tool management define what can enter the context window, routing and compression manage what stays during execution, retrieval brings in external knowledge on demand, and evaluation measures whether any of it is working. MCP (Model Context Protocol), now governed by the Agentic AI Foundation under the Linux Foundation, has become the standard for connecting agents to external tools.
The Relationship to Harness Engineering
Context engineering is one pillar of the broader discipline of harness engineering — the design of complete systems (constraints, feedback loops, documentation, linting) that make AI agents reliable. Context engineering controls what the model sees; harness engineering controls the entire environment the agent operates in.