What Is RAG & Why It Matters
The knowledge cutoff problem, hallucination, and the retrieve-then-generate pattern.
Document Loading & Preprocessing
PDFs, web pages, databases, and APIs: loaders, metadata extraction, and normalization.
Chunking Strategies
Fixed-size, recursive, semantic chunking, overlap, and parent-child strategies.
Embeddings: Text to Vectors
OpenAI, Cohere, BGE, E5, Matryoshka representations, and the MTEB benchmark.
Vector Stores & Indexing
Pinecone, Weaviate, Qdrant, Chroma, and pgvector; HNSW, IVF, and hybrid indexes.