Ch 14 — The AI Landscape Today
MoE internals, test-time compute, agent architectures, RAG pipelines, and the MCP protocol
Under the Hood
A. Modern Model Architectures

S1. Mixture of Experts (MoE): sparse routing, expert selection, DeepSeek/Llama 4/Mistral
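The sparse routing idea behind S1 fits in a few lines: a toy top-k gate (here top-2) scores every expert per token, then runs only the k selected experts and mixes their outputs by renormalized gate weights. This is a minimal NumPy sketch with random weights standing in for a trained router and experts; the name `moe_route` and all dimensions are illustrative, not any particular model's API.

```python
import numpy as np

def moe_route(x, gate_w, expert_ws, k=2):
    """Sparse MoE layer: route each token through its top-k experts.

    x:         (tokens, d_model) activations
    gate_w:    (d_model, n_experts) router weights
    expert_ws: list of (d_model, d_model) expert weight matrices
    """
    logits = x @ gate_w                             # router score per expert
    topk = np.argsort(logits, axis=-1)[:, -k:]      # indices of the k best experts
    sel = np.take_along_axis(logits, topk, axis=-1)
    sel = np.exp(sel - sel.max(-1, keepdims=True))  # softmax over selected only
    weights = sel / sel.sum(-1, keepdims=True)
    out = np.zeros_like(x)
    for t in range(x.shape[0]):                     # each token runs just k experts
        for j in range(k):
            e = topk[t, j]
            out[t] += weights[t, j] * (x[t] @ expert_ws[e])
    return out, topk

rng = np.random.default_rng(0)
d, n_exp = 8, 4
x = rng.standard_normal((3, d))
out, chosen = moe_route(x, rng.standard_normal((d, n_exp)),
                        [rng.standard_normal((d, d)) for _ in range(n_exp)])
```

The key property: compute per token scales with k, not with the total expert count, which is how MoE models grow parameters without growing per-token FLOPs.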
S2. Multimodal Architecture: vision encoders, cross-attention, early vs. late fusion
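The early-vs-late fusion distinction from S2 is easiest to see in shapes. A sketch under toy assumptions (random projections instead of trained weights, single-head attention): late fusion lets text queries cross-attend over vision-encoder outputs, while early fusion simply concatenates patch and text tokens into one sequence for a shared transformer.

```python
import numpy as np

def cross_attention(text_h, image_h, wq, wk, wv):
    """Late fusion: text tokens query the vision encoder's outputs."""
    q, k, v = text_h @ wq, image_h @ wk, image_h @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.exp(scores - scores.max(-1, keepdims=True))
    attn = scores / scores.sum(-1, keepdims=True)     # rows sum to 1
    return attn @ v                                   # one fused vector per text token

rng = np.random.default_rng(1)
d = 16
text_h = rng.standard_normal((5, d))    # 5 text-token states
image_h = rng.standard_normal((9, d))   # 9 patch embeddings from a vision encoder
fused = cross_attention(text_h, image_h,
                        *(rng.standard_normal((d, d)) for _ in range(3)))

# Early fusion, by contrast, merges modalities before any attention:
early = np.concatenate([image_h, text_h], axis=0)     # (14, d) joint sequence
```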
B. Test-Time Compute & Reasoning

S3. Chain-of-Thought & Reasoning Tokens: how o1/o3/R1 think before answering
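Mechanically, the "think before answering" of S3 means the model emits reasoning tokens that are separated from the user-visible answer. A small helper shows the split; the `<think>...</think>` delimiter matches DeepSeek-R1's public output format, while other reasoning models use different (often hidden) delimiters.

```python
def split_reasoning(output: str, marker: str = "</think>"):
    """Split a reasoning model's raw output into its chain-of-thought
    and the final, user-visible answer."""
    if marker in output:
        reasoning, answer = output.split(marker, 1)
        return reasoning.replace("<think>", "").strip(), answer.strip()
    return "", output.strip()

raw = "<think>17 * 24 = 17*20 + 17*4 = 340 + 68 = 408</think>The answer is 408."
cot, answer = split_reasoning(raw)
```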
S4. Compute-Optimal Inference: scaling laws for inference, adaptive compute
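One concrete way to spend extra test-time compute, relevant to S4, is best-of-N with a majority vote (self-consistency): draw several independent reasoning samples and return the most common final answer. In this sketch `sample_fn` is a stand-in for one model call; the scripted sample list is made up for determinism.

```python
from collections import Counter
from itertools import cycle

def self_consistency(sample_fn, n_samples):
    """Trade inference compute for accuracy: sample N answers,
    return the majority answer."""
    answers = [sample_fn() for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]

# Deterministic stand-in for a stochastic model whose samples disagree.
samples = cycle(["408", "406", "408", "408", "410"])
best = self_consistency(lambda: next(samples), n_samples=5)
```

Adaptive-compute schemes go one step further and choose N (or the reasoning budget) per query instead of fixing it in advance.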
C. Agent Architectures

S5. ReAct & Agent Loops: the Reason-Act-Observe pattern, tool calling, memory
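The Reason-Act-Observe pattern of S5 reduces to a short loop: the model emits a thought plus a tool call, the runtime executes the tool and feeds the observation back, and the loop ends when the model emits a final answer. Everything here is a hypothetical sketch: the JSON step format, the `calculator` tool, and the scripted `llm` all stand in for a real chat-model integration.

```python
import json

# Hypothetical tool registry; a restricted eval plays the calculator.
TOOLS = {"calculator": lambda expr: str(eval(expr, {"__builtins__": {}}))}

def react_loop(llm, task, max_steps=5):
    """Reason-Act-Observe: alternate model steps and tool observations
    until the model returns a final answer."""
    transcript = [task]
    for _ in range(max_steps):
        step = json.loads(llm(transcript))      # model sees the full transcript
        if "final" in step:
            return step["final"]
        observation = TOOLS[step["action"]](step["input"])
        transcript.append(f"Observation: {observation}")   # acts as working memory
    return None

# Scripted stand-in for the model: one tool call, then a final answer.
script = iter([
    '{"thought": "need arithmetic", "action": "calculator", "input": "17 * 24"}',
    '{"final": "17 * 24 = 408"}',
])
result = react_loop(lambda transcript: next(script), "What is 17 * 24?")
```

The `max_steps` cap matters in practice: without it, a confused model can loop on tool calls indefinitely.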
S6. MCP Protocol Internals: JSON-RPC, tools/resources/prompts, transport
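Concretely for S6: every MCP message is a JSON-RPC 2.0 frame, and over the stdio transport those frames are newline-delimited JSON. The `tools/call` method name and request shape follow the MCP specification; the `get_weather` tool and its arguments are made up for illustration.

```python
import json

# A tools/call request as an MCP client would frame it (JSON-RPC 2.0).
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "get_weather", "arguments": {"city": "Oslo"}},
}

# stdio transport: one JSON object per line on the server's stdin.
wire = json.dumps(request) + "\n"
decoded = json.loads(wire)
```

`tools/list`, `resources/read`, and `prompts/get` follow the same framing; only `method` and `params` change.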
D. RAG & Context Engineering

S7. RAG Pipeline Architecture: embedding, chunking, retrieval, reranking, generation
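The S7 stages compose into a retrieve-then-generate flow. A toy sketch: bag-of-words overlap stands in for an embedding model and vector index, and prompt assembly stands in for the LLM call; a real pipeline would swap in an embedder, an ANN index, a cross-encoder reranker, and a generator at the marked points.

```python
from collections import Counter

def embed(text):
    # Stand-in embedding: bag-of-words counts (real systems use a model).
    return Counter(text.lower().split())

def similarity(a, b):
    return sum((a & b).values())        # token overlap as a toy score

def rag_answer(query, docs, top_k=2):
    q = embed(query)
    # Retrieve (and, in a real system, rerank): score chunks, keep top_k.
    ranked = sorted(docs, key=lambda d: similarity(q, embed(d)), reverse=True)
    context = ranked[:top_k]
    # Generate: stitch retrieved context into the prompt (LLM call omitted).
    return "\n".join(["Context:"] + context + ["Question: " + query])

docs = [
    "HNSW builds a layered graph for approximate nearest neighbor search.",
    "The capital of France is Paris.",
    "Rerankers rescore retrieved chunks with a cross-encoder.",
]
prompt = rag_answer("What is the capital of France?", docs, top_k=1)
```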
S8. Vector Databases & Embeddings: ANN search, HNSW, cosine similarity, hybrid search
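For S8, the scoring function is the easy part: an exact cosine-similarity scan over normalized vectors. Vector databases exist to avoid this O(n) scan, using ANN structures such as HNSW, but they approximate exactly this computation. A NumPy sketch with made-up dimensions:

```python
import numpy as np

def cosine_top_k(query, index, k=3):
    """Exact nearest-neighbor search by cosine similarity
    (the brute-force baseline that ANN indexes approximate)."""
    index_n = index / np.linalg.norm(index, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    scores = index_n @ q                     # cosine of query vs. every row
    top = np.argsort(scores)[::-1][:k]       # highest-similarity ids first
    return top, scores[top]

rng = np.random.default_rng(2)
index = rng.standard_normal((100, 32))               # 100 stored embeddings
query = index[7] + 0.01 * rng.standard_normal(32)    # near-duplicate of row 7
ids, scores = cosine_top_k(query, index, k=3)
```

Hybrid search combines these dense scores with a lexical score such as BM25 before ranking.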
E. Inference & Deployment

S9. Inference Optimizations: KV cache, continuous batching, speculative decoding, quantization
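The KV cache named in S9 is the simplest of these optimizations to show: during decoding, each new token's key and value are appended to a cache, so attention only ever projects the newest token instead of recomputing K and V for the whole prefix. A single-head NumPy sketch with random stand-in projections:

```python
import numpy as np

def attend(q, K, V):
    """One query attending over all cached keys/values (single head)."""
    s = (K @ q) / np.sqrt(q.shape[-1])
    w = np.exp(s - s.max())
    w /= w.sum()
    return w @ V

rng = np.random.default_rng(3)
d, steps = 16, 6
K_cache = np.empty((0, d))
V_cache = np.empty((0, d))
outputs = []
for _ in range(steps):
    # Per decode step: project only the newest token (random stand-ins
    # here), append to the cache, attend over everything cached so far.
    k_new, v_new, q = rng.standard_normal((3, d))
    K_cache = np.vstack([K_cache, k_new])    # append, never recompute
    V_cache = np.vstack([V_cache, v_new])
    outputs.append(attend(q, K_cache, V_cache))
```

This turns per-step attention cost from quadratic-in-prefix recomputation into a single append plus one matrix-vector product, at the price of the cache's memory footprint, which is what continuous batching and quantization then manage.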
S10. The Full Picture: connecting all 14 chapters — the complete AI stack