Ch 2 — Your First Chain Under the Hood
What actually happens when you call chain.invoke() — every object, every transformation
A. The Input Dictionary
What .invoke() receives

- Input dict: {"topic": "quantum computing"} is passed to the chain.
- RunnableSequence: .invoke() starts the pipeline.
- prompt.invoke(): the first Runnable in the sequence.
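The threading described above can be sketched with a toy class (MiniSequence and the lambdas are illustrative stand-ins, not LangChain's actual classes): each step's output becomes the next step's input.

```python
# Minimal sketch of how a RunnableSequence threads one step's output
# into the next step's input. Illustrative only -- not LangChain's code.

class MiniSequence:
    def __init__(self, *steps):
        self.steps = steps

    def invoke(self, value):
        # Pass the value through each step in order.
        for step in self.steps:
            value = step(value)
        return value

# Stand-ins for the prompt, model, and parser steps:
prompt = lambda d: f"Explain {d['topic']} in 1 sentence"
model = lambda p: {"content": f"Answer about: {p}"}
parser = lambda m: m["content"]

chain = MiniSequence(prompt, model, parser)
result = chain.invoke({"topic": "quantum computing"})
print(result)  # "Answer about: Explain quantum computing in 1 sentence"
```

The real RunnableSequence does the same hand-off, plus callback and tracing bookkeeping at every step.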
↓ The prompt formats the dict into Message objects.
B. Inside ChatPromptTemplate.invoke()
dict → ChatPromptValue

- Variable substitution: {topic} → "quantum computing".
- Creates a SystemMessage: "You are a helpful assistant."
- Plus a HumanMessage: "Explain quantum computing in 1 sentence".
- Wraps both in a ChatPromptValue, the wrapper holding the message list.
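The substitution-and-wrap step can be sketched like this (Message, MiniPromptValue, and prompt_invoke are hypothetical stand-ins, not langchain_core's classes):

```python
# Sketch of what ChatPromptTemplate.invoke() does: substitute variables,
# build typed messages, wrap them in a prompt-value object.
from dataclasses import dataclass

@dataclass
class Message:
    role: str      # "system" or "human"
    content: str

@dataclass
class MiniPromptValue:
    messages: list  # the wrapper holding the message list

    def to_messages(self):
        return self.messages

def prompt_invoke(inputs):
    # Variable substitution: {topic} -> inputs["topic"]
    return MiniPromptValue([
        Message("system", "You are a helpful assistant."),
        Message("human", "Explain {topic} in 1 sentence".format(**inputs)),
    ])

value = prompt_invoke({"topic": "quantum computing"})
print(value.to_messages()[1].content)  # "Explain quantum computing in 1 sentence"
```

The real ChatPromptValue exposes the same idea via .to_messages(), so downstream models never see the raw input dict.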
↓ The ChatPromptValue flows into model.invoke().
C. Inside ChatOpenAI.invoke()
ChatPromptValue → HTTP request → AIMessage

- Serialize: messages → the OpenAI API JSON format.
- HTTP request: POST /v1/chat/completions.
- Parse response: JSON → AIMessage with metadata.
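The serialize → POST → parse round trip can be sketched as below. The JSON shape follows the OpenAI chat completions API, but fake_post and the dict standing in for AIMessage are illustrative, so the example runs offline:

```python
# Sketch of the round trip inside a chat model's .invoke().
import json

def serialize(messages):
    # Messages -> OpenAI API JSON format
    return {
        "model": "gpt-4o-mini",
        "messages": [{"role": r, "content": c} for r, c in messages],
    }

def fake_post(path, body):
    # Stand-in for POST /v1/chat/completions (no network here).
    return json.dumps({
        "choices": [{"message": {"role": "assistant",
                                 "content": "Quantum computing uses qubits..."},
                     "finish_reason": "stop"}],
        "usage": {"total_tokens": 42},
    })

def parse_response(raw):
    # JSON -> a dict playing the role of AIMessage, metadata attached.
    data = json.loads(raw)
    choice = data["choices"][0]
    return {"content": choice["message"]["content"],
            "response_metadata": {"finish_reason": choice["finish_reason"],
                                  "token_usage": data["usage"]}}

body = serialize([("system", "You are a helpful assistant."),
                  ("user", "Explain quantum computing in 1 sentence")])
message = parse_response(fake_post("/v1/chat/completions", body))
print(message["content"])  # "Quantum computing uses qubits..."
```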
↓ The AIMessage object in detail.
D. The AIMessage Object
What the model actually returns

- .content: the response text string.
- .response_metadata: model name, token usage, finish reason.
- .tool_calls: an empty list (no tools in this chain).
- .id: a unique run ID for tracing.
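The four fields above can be pictured as a plain dataclass. The field names match what the section describes; MiniAIMessage itself is a stand-in, not langchain_core's AIMessage:

```python
# Sketch of the AIMessage fields described above.
from dataclasses import dataclass, field

@dataclass
class MiniAIMessage:
    content: str                                           # the response text string
    response_metadata: dict = field(default_factory=dict)  # model, tokens, finish reason
    tool_calls: list = field(default_factory=list)         # empty: no tools in this chain
    id: str = ""                                           # unique run ID for tracing

msg = MiniAIMessage(
    content="Quantum computing uses qubits...",
    response_metadata={"model_name": "gpt-4o-mini", "finish_reason": "stop"},
    id="run-123",
)
print(msg.tool_calls)  # [] -- no tools bound in this chain
```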
↓ The AIMessage flows into parser.invoke().
E. Inside StrOutputParser.invoke()
AIMessage → plain string

- Input: the full AIMessage object, content + metadata.
- Extract the string: return message.content.
- Output: a plain string, "Quantum computing uses qubits..."
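The parser's whole job fits in one line, as this sketch shows (stand-in classes again, not langchain_core's):

```python
# Sketch of StrOutputParser: take the full AIMessage, return only .content.
from dataclasses import dataclass

@dataclass
class MiniAIMessage:
    content: str
    response_metadata: dict

class MiniStrOutputParser:
    def invoke(self, message):
        # Drop the metadata; return only the text.
        return message.content

msg = MiniAIMessage(
    content="Quantum computing uses qubits...",
    response_metadata={"finish_reason": "stop"},
)
text = MiniStrOutputParser().invoke(msg)
print(text)  # "Quantum computing uses qubits..."
```

This is why the chain's final return value is a string rather than a message object: the parser is the last Runnable, so its output type is the chain's output type.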
↓ Now let's see how .stream() works differently.
F. Streaming: .stream() vs .invoke()
Token-by-token output

- prompt.invoke(): runs fully, all at once (prompt formatting is not streamed).
- model.stream(): yields AIMessageChunk objects one by one.
- parser.stream(): passes each chunk through as a string.
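The chunk-by-chunk flow can be sketched with plain generators (the hard-coded token list and dict chunks are illustrative stand-ins for real AIMessageChunk streaming):

```python
# Sketch of .stream(): the prompt formats once, the model yields chunks
# one by one, and the parser passes each chunk through as text.

def model_stream(prompt_text):
    # Yields one AIMessageChunk-like piece per token.
    for token in ["Quantum ", "computing ", "uses ", "qubits..."]:
        yield {"content": token}

def parser_stream(chunks):
    # Passes each chunk through, extracting the text.
    for chunk in chunks:
        yield chunk["content"]

# The prompt step runs fully, all at once (not streamed):
prompt_text = "Explain quantum computing in 1 sentence"

pieces = list(parser_stream(model_stream(prompt_text)))
print("".join(pieces))  # "Quantum computing uses qubits..."
```

Because each stage is a generator feeding the next, the first token can reach the caller before the model has finished generating the rest.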
↓ Callbacks fire at every step for tracing.
G. Callback Events
What fires during chain execution

- on_chain_start: fires when the chain begins.
- on_llm_start: fires when the model call begins.
- on_llm_new_token: fires once per token (if streaming).
- on_chain_end: fires with the final output.
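The event order above can be sketched with a toy handler and driver (MiniHandler and run_chain are hypothetical stand-ins, not langchain_core's BaseCallbackHandler, though the hook names match the events listed):

```python
# Sketch of the callback lifecycle: a handler records each event as a
# toy chain run fires its hooks in order.

class MiniHandler:
    def __init__(self):
        self.events = []

    def on_chain_start(self, inputs):
        self.events.append("on_chain_start")

    def on_llm_start(self, prompt_text):
        self.events.append("on_llm_start")

    def on_llm_new_token(self, token):
        self.events.append("on_llm_new_token")

    def on_chain_end(self, output):
        self.events.append("on_chain_end")

def run_chain(inputs, handler):
    handler.on_chain_start(inputs)
    prompt_text = f"Explain {inputs['topic']} in 1 sentence"
    handler.on_llm_start(prompt_text)
    tokens = ["Quantum ", "computing ", "uses ", "qubits..."]
    for t in tokens:                    # fires per token (when streaming)
        handler.on_llm_new_token(t)
    output = "".join(tokens)
    handler.on_chain_end(output)
    return output

handler = MiniHandler()
run_chain({"topic": "quantum computing"}, handler)
print(handler.events[0], handler.events[-1])
```

Tracing tools like LangSmith hook into exactly this event stream, which is why every run gets timed, nested, and recorded without changing the chain's code.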