Ch 5 — Vector Stores & Indexing — Under the Hood

HNSW internals, IVF, quantization, metadata indexes, and storage engines
A. HNSW Graph Construction: the dominant ANN index algorithm

1. New vector: insert point p and assign it a layer.
2. Layer selection: the layer is drawn from an exponentially decaying distribution; a greedy search then descends to that layer.
3. Find neighbors: explore ef_construction candidate neighbors at each layer.
4. Link edges: connect p to up to M neighbors.

HNSW search: enter at the top layer, greedily descend, then expand the search at layer 0.
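The construction and search steps above can be sketched in a few lines of Python. This is a minimal sketch, not a full HNSW implementation: `assign_layer` and `greedy_step` are hypothetical helper names, and the graph here is a flat adjacency dict rather than a real multi-layer structure.

```python
import math
import random

def assign_layer(m_l: float = 1 / math.log(16), rng=random) -> int:
    # Exponentially decaying layer distribution: most points land on
    # layer 0; a rare few become long-range hubs on upper layers.
    return int(-math.log(1.0 - rng.random()) * m_l)

def greedy_step(graph, dist, query, entry):
    # Greedy descent: hop to the closest neighbor until no neighbor
    # is closer to the query than the current node.
    current = entry
    while True:
        nxt = min(graph[current], key=lambda n: dist(n, query), default=None)
        if nxt is None or dist(nxt, query) >= dist(current, query):
            return current
        current = nxt
```

Real HNSW runs one greedy pass per upper layer (entering each layer at the node where the previous layer stopped), then widens the beam to ef candidates at layer 0.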
B. IVF & Product Quantization: cluster-based indexing for billion scale

1. K-means training: learn nlist centroids from a sample of the data.
2. Inverted lists: assign each vector to its nearest centroid's list.
3. Probe clusters: at query time, search only the nprobe nearest cells.

Product Quantization: split each vector into sub-vectors and quantize each sub-vector to a codebook ID.
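A minimal sketch of both ideas, assuming the centroids and PQ codebooks have already been trained; the function names are illustrative, not any library's API.

```python
import math

def build_inverted_lists(vectors, centroids):
    # IVF: assign every vector ID to the list of its nearest centroid.
    lists = {i: [] for i in range(len(centroids))}
    for vid, vec in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda i: math.dist(vec, centroids[i]))
        lists[nearest].append(vid)
    return lists

def probe(query, centroids, lists, vectors, nprobe=2):
    # Query time: visit only the nprobe nearest cells, not all vectors.
    cells = sorted(range(len(centroids)), key=lambda i: math.dist(query, centroids[i]))[:nprobe]
    candidates = [vid for c in cells for vid in lists[c]]
    return min(candidates, key=lambda vid: math.dist(query, vectors[vid]))

def pq_encode(vector, codebooks):
    # PQ: split the vector into len(codebooks) sub-vectors and replace
    # each sub-vector with the ID of its nearest codeword.
    d = len(vector) // len(codebooks)
    return [
        min(range(len(book)), key=lambda j: math.dist(vector[i * d:(i + 1) * d], book[j]))
        for i, book in enumerate(codebooks)
    ]
```

With nprobe small relative to nlist, most inverted lists are never touched, which is where the billion-scale speedup comes from; PQ then shrinks what remains in each list.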
C. Quantization Techniques: reducing memory footprint while preserving recall

1. Scalar quantization: float32 → int8, 4x smaller.
2. Binary quantization: float32 → 1 bit per dimension, 32x smaller.
3. Rescoring: re-rank the top candidates with the original full-precision vectors.

Memory math: 1M vectors × 1536 dims = 6.1 GB (float32) → 1.5 GB (int8) → 192 MB (binary).
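The two quantizers and the memory arithmetic above, as a runnable sketch. The int8 mapping shown is simple min/max scaling, an assumption for illustration; production systems often use per-dimension or percentile-based ranges.

```python
def scalar_quantize(vec, lo, hi):
    # float32 -> int8: map the range [lo, hi] onto [-128, 127] (4x smaller).
    scale = (hi - lo) / 255.0
    return [round((x - lo) / scale) - 128 for x in vec]

def binary_quantize(vec):
    # float32 -> 1 bit per dimension: keep only the sign (32x smaller).
    return [1 if x > 0 else 0 for x in vec]

# Memory math from the section: 1M vectors x 1536 dims
n, d = 1_000_000, 1536
float32_gb = n * d * 4 / 1e9   # 4 bytes/dim  -> ~6.1 GB
int8_gb    = n * d * 1 / 1e9   # 1 byte/dim   -> ~1.5 GB
binary_mb  = n * d / 8 / 1e6   # 1 bit/dim    -> 192 MB
```

Because both quantizers are lossy, the rescoring step matters: retrieve a few times more candidates than needed with the compressed vectors, then re-rank them with the stored float32 originals.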
D. Metadata Indexing & Filtering: how pre-filtering actually works under the hood

1. Payload index: an inverted index over metadata fields.
2. Filter bitmap: the filter conditions resolve, via intersection, to a bitmap of allowed vector IDs.
3. ANN + filter: the graph search consults the bitmap and skips filtered-out nodes during traversal.
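A pre-filtering sketch under these assumptions, with Python sets standing in for compressed bitmaps; the helper names are hypothetical.

```python
def build_payload_index(payloads):
    # Inverted index: (field, value) -> set of vector IDs holding it.
    index = {}
    for vid, payload in enumerate(payloads):
        for field, value in payload.items():
            index.setdefault((field, value), set()).add(vid)
    return index

def filter_bitmap(index, conditions):
    # Intersect the per-condition ID sets into one allow-list.
    sets = [index.get(cond, set()) for cond in conditions]
    return set.intersection(*sets) if sets else set()
```

The key point of pre-filtering is that the allow-list is checked inside the ANN traversal, so excluded nodes are skipped as the graph is walked, rather than results being discarded after the fact (post-filtering), which can starve the result set.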
Storage engines: Qdrant (RocksDB + mmap), Weaviate (custom LSM), Pinecone (proprietary).
E. Write Path & Segment Architecture: how upserts reach the index

1. Upsert API: accepts a batch of vectors.
2. Write-ahead log: the batch is appended to the WAL first, for durability.
3. Segments: flushed data becomes immutable chunks.
4. Compaction: background merges combine small segments.
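A toy write path illustrating the same ordering (WAL append, then flush to immutable segments, then compaction). `TinyWritePath` is a made-up class for illustration, not any vendor's API.

```python
import json

class TinyWritePath:
    def __init__(self, wal_path, flush_every=2):
        self.wal_path = wal_path
        self.flush_every = flush_every
        self.buffer = []
        self.segments = []  # each segment is an immutable (frozen) batch

    def upsert(self, vid, vector):
        # Durability first: append to the WAL before acknowledging.
        with open(self.wal_path, "a") as wal:
            wal.write(json.dumps({"id": vid, "vec": list(vector)}) + "\n")
        self.buffer.append((vid, tuple(vector)))
        # Flush the in-memory buffer into an immutable segment.
        if len(self.buffer) >= self.flush_every:
            self.segments.append(tuple(self.buffer))
            self.buffer = []

    def compact(self):
        # Background merge: fold segments together, last write wins.
        merged = {}
        for seg in self.segments:
            for vid, vec in seg:
                merged[vid] = vec
        self.segments = [tuple(merged.items())]
```

Immutable segments are what make concurrent reads cheap: queries search a frozen snapshot while new writes accumulate in the buffer, and compaction resolves duplicates and deletes in the background.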
F. Tuning Parameters & Benchmarks: the knobs that control recall vs. speed

1. HNSW parameters: M, ef_construction, and ef trade recall against memory, build time, and query speed.
2. Recall vs. QPS: measure on ANN benchmarks before committing to a configuration.
3. Production config: start from recommended defaults, then tune against your own data.
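As a starting point, commonly cited values look like the following. These are illustrative assumptions, not any engine's official defaults, and should be tuned against your own recall/QPS measurements.

```python
# Illustrative HNSW starting values; tune per workload and measure recall.
hnsw_params = {
    "M": 16,                 # edges per node: higher = better recall, more memory
    "ef_construction": 200,  # build-time beam width: higher = better graph, slower build
    "ef": 100,               # query-time beam width: raise until the recall target is met
}
```

The usual workflow is to fix M and ef_construction at build time, then sweep ef at query time, plotting recall against QPS until the curve meets the service-level target.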