Ch 10: RAG Solutions Landscape, Under the Hood

Internals of frameworks, vector DBs, platforms, and integration patterns


A. LangChain Architecture Internals

LCEL, Runnables, and the component model

Input (dict or string) → [Runnable] → LCEL Chain (pipe operator |) → [invoke] → Output (structured result)
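The pipe-based composition outlined above can be illustrated with a toy sketch. The `Step` class is a hypothetical stand-in, not LangChain's actual `Runnable` implementation; it only shows how overloading `|` can build a chain whose `invoke` passes each step's output to the next. The step names and the fake model are assumptions made for illustration.

```python
from typing import Any, Callable


class Step:
    """Toy stand-in for a runnable component (hypothetical, not LangChain's class)."""

    def __init__(self, fn: Callable[[Any], Any]) -> None:
        self.fn = fn

    def invoke(self, value: Any) -> Any:
        # Run this step on a single input.
        return self.fn(value)

    def __or__(self, other: "Step") -> "Step":
        # `a | b` returns a new step that runs a, then feeds its output to b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Three illustrative steps: format a prompt, fake a model call, parse the result.
make_prompt = Step(lambda d: f"Answer briefly: {d['question']}")
fake_model = Step(lambda prompt: prompt.upper())  # placeholder for an LLM call
parse = Step(lambda text: {"answer": text})

chain = make_prompt | fake_model | parse
result = chain.invoke({"question": "what is rag?"})
```

Because every step exposes the same `invoke` method, the composed chain is itself a step and can be piped into further steps.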

LlamaIndex vs LangChain: index-centric vs chain-centric architecture

B. Vector Database Architecture Comparison

How Pinecone, Qdrant, Weaviate, and pgvector differ under the hood

Pinecone: managed cloud service (serverless or pod-based indexes)
Qdrant: Rust + HNSW
Weaviate: Go + modules
pgvector: PostgreSQL extension

Benchmarks: QPS, latency, recall at different scales (1M, 10M, 100M vectors)
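Recall is the least self-explanatory of these benchmark metrics, so a small sketch may help. It compares the ids an approximate index returns against an exact brute-force search and reports the overlap as recall@k. The data is random and the "approximate" result is simulated by hand; nothing here measures a real database.

```python
import math
import random


def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def exact_top_k(query, vectors, k):
    # Brute-force search: score every vector, keep the k most similar ids.
    ranked = sorted(range(len(vectors)), key=lambda i: cosine(query, vectors[i]), reverse=True)
    return ranked[:k]


def recall_at_k(approx_ids, exact_ids):
    # Fraction of the true top-k that the approximate index also returned.
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


random.seed(0)
vectors = [[random.gauss(0, 1) for _ in range(8)] for _ in range(200)]
query = [random.gauss(0, 1) for _ in range(8)]

truth = exact_top_k(query, vectors, k=10)
# Simulate an approximate index that missed two of the true neighbours.
approx = truth[:8] + [i for i in range(200) if i not in truth][:2]
recall = recall_at_k(approx, truth)
```

In a real benchmark the `approx` list would come from the vector database under test, and recall would be averaged over many queries alongside QPS and latency.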

C. Embedding Provider Internals

API design, batching, rate limits, and cost optimization

Batch API (efficient embedding) → [cache] → Local Cache (avoid re-embedding) → [serve] → Self-Host (TEI / vLLM)
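The batching and caching steps above can be sketched together. `CachedEmbedder` and `fake_embed_batch` are hypothetical names; the fake function stands in for whatever batch endpoint a provider or self-hosted server exposes. The sketch sends only uncached texts to the provider, in fixed-size batches, and serves repeats from a local dictionary keyed by a content hash.

```python
import hashlib
from typing import Callable, Dict, List

Vector = List[float]


class CachedEmbedder:
    """Wraps a batch embedding function with an in-memory cache (illustrative)."""

    def __init__(self, embed_batch: Callable[[List[str]], List[Vector]], batch_size: int = 2) -> None:
        self.embed_batch = embed_batch
        self.batch_size = batch_size
        self.cache: Dict[str, Vector] = {}

    @staticmethod
    def _key(text: str) -> str:
        # Hash the content so identical texts share one cache entry.
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

    def embed(self, texts: List[str]) -> List[Vector]:
        # Collect unique texts that are not cached yet, preserving order.
        missing = list(dict.fromkeys(t for t in texts if self._key(t) not in self.cache))
        # Send cache misses to the provider in fixed-size batches.
        for start in range(0, len(missing), self.batch_size):
            batch = missing[start:start + self.batch_size]
            for text, vec in zip(batch, self.embed_batch(batch)):
                self.cache[self._key(text)] = vec
        return [self.cache[self._key(t)] for t in texts]


calls: List[List[str]] = []


def fake_embed_batch(batch: List[str]) -> List[Vector]:
    # Hypothetical provider call; records each batch so usage can be inspected.
    calls.append(list(batch))
    return [[float(len(t))] for t in batch]


embedder = CachedEmbedder(fake_embed_batch, batch_size=2)
first = embedder.embed(["a", "bb", "ccc", "a"])
second = embedder.embed(["bb", "dddd"])
```

After both calls, `calls` holds three batches even though six texts were requested, because the duplicate `"a"` and the already embedded `"bb"` never reach the provider. A production cache would typically also include the model name in the key so that vectors from different models are not mixed.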

Cost analysis: API vs self-hosted embedding at different scales

D. Platform Integration Patterns

How Bedrock, Azure AI Search, and Vertex AI wire RAG together

Ingest (S3 / Blob / GCS) → [chunk] → Auto Pipeline (managed chunking) → [query] → Generate (with citations)
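The ingest, chunk, query, and generate stages above can be mimicked in a few lines to show what a managed pipeline does conceptually. This is a toy sketch, not any platform's API: the bucket path is made up, retrieval uses word overlap instead of vector search, and `generate` only echoes the retrieved text in place of an LLM call. The point is how each chunk keeps its source so the final answer can cite it.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Chunk:
    source: str
    index: int
    text: str


def chunk_document(source: str, text: str, size: int = 5) -> List[Chunk]:
    # Fixed-size word windows; managed services may use other strategies.
    words = text.split()
    return [
        Chunk(source, i // size, " ".join(words[i:i + size]))
        for i in range(0, len(words), size)
    ]


def retrieve(query: str, chunks: List[Chunk], k: int = 1) -> List[Chunk]:
    # Rank chunks by the number of query words they share (stand-in for vector search).
    terms = set(query.lower().split())
    return sorted(chunks, key=lambda c: len(terms & set(c.text.lower().split())), reverse=True)[:k]


def generate(query: str, hits: List[Chunk]) -> Dict[str, object]:
    # Placeholder for the LLM call: echo the retrieved text and attach citations.
    return {
        "answer": " ".join(c.text for c in hits),
        "citations": [f"{c.source}#chunk-{c.index}" for c in hits],
    }


# Hypothetical object-store path and document text.
docs = {"s3://bucket/handbook.txt": "Refunds are issued within five days. Shipping takes two weeks on average."}
chunks = [c for src, text in docs.items() for c in chunk_document(src, text)]
question = "how long does shipping take"
response = generate(question, retrieve(question, chunks))
```

Carrying `source` and `index` on every chunk from ingestion onward is what makes citations possible at generation time.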

Enterprise: IAM, encryption, VPC, compliance (SOC2, HIPAA, GDPR)

E. Evaluation & Observability Internals

How Ragas, LangSmith, and tracing work under the hood

Ragas Metrics (LLM-as-judge) → [trace] → LangSmith (trace every step) → [alert] → Debug (find failures)
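The idea of tracing every step and then locating failures can be sketched with a plain decorator. This is a generic illustration, not LangSmith's SDK: `traced`, `TRACE`, and the two pipeline functions are hypothetical names. Each call records one span with the step name, its duration, and any error, so failed steps can be filtered out afterwards.

```python
import time
from functools import wraps
from typing import Any, Callable, Dict, List

TRACE: List[Dict[str, Any]] = []


def traced(name: str) -> Callable:
    """Record one span per call: step name, duration, and any error (illustrative)."""

    def decorator(fn: Callable) -> Callable:
        @wraps(fn)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            span: Dict[str, Any] = {"step": name, "error": None}
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            except Exception as exc:
                span["error"] = repr(exc)
                raise
            finally:
                # Runs on success and failure, so every call leaves a span.
                span["seconds"] = time.perf_counter() - start
                TRACE.append(span)

        return wrapper

    return decorator


@traced("retrieve")
def retrieve(query: str) -> List[str]:
    return ["doc-1", "doc-2"]


@traced("generate")
def generate(query: str, docs: List[str]) -> str:
    if not query:
        raise ValueError("empty query")
    return f"answer from {len(docs)} docs"


answer = generate("what is rag?", retrieve("what is rag?"))
try:
    generate("", [])
except ValueError:
    pass

failed_steps = [span["step"] for span in TRACE if span["error"]]
```

Hosted tracing tools add persistence, nesting of spans, and alerting on top of this basic record-per-step pattern.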

F. Migration & Integration Patterns

Switching components, avoiding lock-in, and building for change

Abstraction Layer (swap components) → [migrate] → Re-embed (new model migration) → [test] → Validate (A/B test quality)
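The abstraction-layer and re-embedding steps above can be sketched with a small interface. `VectorStore`, `InMemoryStore`, `reembed`, and the two toy "models" are all hypothetical; the sketch shows application code depending only on the interface, and a migration that rebuilds a second index from the source text so the old and new indexes can be compared side by side.

```python
from typing import Callable, Dict, List, Protocol

Vector = List[float]


class VectorStore(Protocol):
    """Interface the application depends on, instead of a specific vendor client."""

    def upsert(self, doc_id: str, vector: Vector) -> None: ...

    def ids(self) -> List[str]: ...


class InMemoryStore:
    """Minimal backend satisfying the VectorStore interface."""

    def __init__(self) -> None:
        self.rows: Dict[str, Vector] = {}

    def upsert(self, doc_id: str, vector: Vector) -> None:
        self.rows[doc_id] = vector

    def ids(self) -> List[str]:
        return list(self.rows)


def reembed(texts: Dict[str, str], embed: Callable[[str], Vector], target: VectorStore) -> None:
    # Rebuild the index from source text; vectors from different models
    # generally live in different spaces and should not share an index.
    for doc_id, text in texts.items():
        target.upsert(doc_id, embed(text))


def old_model(text: str) -> Vector:
    # Hypothetical 1-dimensional embedding model.
    return [float(len(text))]


def new_model(text: str) -> Vector:
    # Hypothetical 2-dimensional replacement model.
    return [float(len(text)), float(len(text.split()))]


texts = {"a": "hello world", "b": "retrieval augmented generation"}
old_index, new_index = InMemoryStore(), InMemoryStore()
reembed(texts, old_model, old_index)
reembed(texts, new_model, new_index)
```

Keeping both indexes alive during the migration is what allows an A/B comparison of retrieval quality before traffic is switched to the new model.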