HNSW (Hierarchical Navigable Small World)
Definition
HNSW is the most widely used approximate nearest neighbor (ANN) algorithm. It builds a multi-layer graph over the stored vectors that enables logarithmic-complexity search, and it is the default index in most vector databases.
Why It Matters
HNSW is the algorithm behind nearly every vector database you’ll use. Pinecone, Weaviate, Qdrant, Chroma, and pgvector all offer HNSW indexes. When your RAG system returns results in 50ms instead of 5 seconds, HNSW is why.
The algorithm’s genius is in its graph structure. Instead of comparing your query against every vector, HNSW navigates through a graph where similar vectors are connected. You start at a coarse level and zoom in, finding the neighborhood of relevant vectors without exhaustive search.
For AI engineers, understanding HNSW parameters helps you optimize the latency-recall tradeoff. Need faster search? Tune efSearch down. Need better recall? Tune it up. The algorithm gives you a dial to turn.
Implementation Basics
How HNSW works:
Layer structure: HNSW builds multiple layers. Layer 0 contains all vectors. Each higher layer contains a random subset (typically 1/M of the layer below). Top layers have few vectors and sparse connections; bottom layers are dense and well-connected.
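A rough sketch of how that layer assignment works, following the paper's geometric distribution (the function name `random_level` is illustrative, not a library API):

```python
import math
import random

def random_level(M: int = 16) -> int:
    """Draw a node's top layer from HNSW's geometric distribution.

    With normalization factor m_L = 1/ln(M), each layer holds
    roughly 1/M of the nodes in the layer below it.
    """
    m_L = 1.0 / math.log(M)
    u = 1.0 - random.random()  # uniform in (0, 1], avoids log(0)
    return int(-math.log(u) * m_L)
```

With M=16, roughly 94% of nodes land only on layer 0, about 6% reach layer 1, about 0.4% reach layer 2, and so on; that rapid thinning is what keeps the top layers small and fast to traverse.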
Search process (sketched in code after this list):
- Start at the top layer, find the closest vector to your query
- Use that vector as the starting point for the next layer down
- At each layer, greedily navigate toward closer neighbors
- At layer 0, perform a more thorough local search
- Return the best candidates found
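A minimal Python sketch of that descent, assuming a hypothetical `graph[layer][node]` adjacency structure and NumPy embeddings; real implementations expand a beam of efSearch candidates at layer 0 rather than stopping at the first local minimum:

```python
import numpy as np

def greedy_search(query, entry, graph, vectors, top_layer):
    """Greedy HNSW-style descent from the top layer down to layer 0.

    graph[layer][node] -> list of neighbor ids (hypothetical structure)
    vectors[node]      -> the stored embedding for that node
    """
    def dist(node):
        return np.linalg.norm(vectors[node] - query)

    current = entry
    for layer in range(top_layer, -1, -1):
        improved = True
        while improved:  # hop to a closer neighbor until stuck
            improved = False
            for neighbor in graph[layer].get(current, []):
                if dist(neighbor) < dist(current):
                    current = neighbor
                    improved = True
        # the node we land on becomes the entry point one layer down
    return current
```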
Why it’s fast: Top layers let you “teleport” to the right region quickly. Bottom layers provide fine-grained search. You traverse O(log n) vectors instead of O(n).
Key parameters:
M (connections per node): How many edges each vector gets. Typical values: 16-64. Higher M = better recall, more memory. Start with 16.
efConstruction (build-time search depth): How carefully the graph is built. Typical values: 100-200. Higher = better index quality, slower build.
efSearch (query-time search depth): How thoroughly to search at query time. Typical values: 40-100. Higher = better recall, slower queries. This is your main tuning knob at query time (the sketch below shows where each parameter plugs in).
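For a concrete mapping of these parameters onto code, here is a minimal sketch using hnswlib, one popular open-source HNSW implementation (`data` is random stand-in embeddings):

```python
import hnswlib
import numpy as np

dim, num_elements = 384, 10_000
data = np.random.rand(num_elements, dim).astype(np.float32)  # stand-in embeddings

# Build-time parameters: M and ef_construction
index = hnswlib.Index(space="cosine", dim=dim)
index.init_index(max_elements=num_elements, M=16, ef_construction=100)
index.add_items(data, np.arange(num_elements))

# Query-time parameter: ef (efSearch)
index.set_ef(40)
labels, distances = index.knn_query(data[:5], k=10)
```

Note that `set_ef` can be changed at any time without rebuilding the index, which is why efSearch is the cheap knob to turn.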
Practical tuning:
- Start with defaults (M=16, efConstruction=100, efSearch=40)
- Measure recall: compare ANN results to exact search on sample queries (a minimal recall check follows this list)
- If recall is low, increase efSearch
- If build time is slow, decrease efConstruction
- If memory is tight, decrease M (but recall may suffer)
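A minimal recall check against brute-force search, reusing the hnswlib-style index and `data` from the sketch above (names like `queries` are placeholders):

```python
import numpy as np

def recall_at_k(index, data, queries, k=10):
    """Fraction of exact top-k neighbors that the ANN index also returns."""
    ann_labels, _ = index.knn_query(queries, k=k)
    hits = 0
    for q, ann in zip(queries, ann_labels):
        # Exact search: cosine similarity against every stored vector
        sims = data @ q / (np.linalg.norm(data, axis=1) * np.linalg.norm(q))
        exact = np.argsort(-sims)[:k]
        hits += len(set(ann) & set(exact))
    return hits / (len(queries) * k)
```

If the number comes back below your target (say 0.95), raise ef with `index.set_ef` and re-measure.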
When HNSW isn’t right:
- Very large datasets (> 100M vectors): Consider IVF+PQ for memory (see the sketch after this list)
- Frequent updates: inserts are incremental, but deletes usually mean tombstones or a slow rebuild
- Write-heavy workloads: HNSW is optimized for read-heavy patterns
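If you do outgrow HNSW's memory footprint, a compressed IVF+PQ index is the usual fallback. A minimal FAISS sketch, with all parameter values illustrative:

```python
import faiss
import numpy as np

dim = 384
xb = np.random.rand(100_000, dim).astype(np.float32)  # stand-in corpus

nlist = 256       # number of coarse IVF clusters
m, nbits = 48, 8  # PQ: 48 subquantizers x 8 bits = 48 bytes per vector

quantizer = faiss.IndexFlatL2(dim)
index = faiss.IndexIVFPQ(quantizer, dim, nlist, m, nbits)
index.train(xb)   # learn cluster centroids and PQ codebooks
index.add(xb)

index.nprobe = 16  # clusters scanned per query: the recall/speed knob
distances, labels = index.search(xb[:5], 10)
```

The tradeoff: PQ compression cuts memory by an order of magnitude or more, at some cost in recall compared to an uncompressed HNSW graph.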
For most RAG applications, HNSW with default parameters just works. It’s the “correct default” for vector search.
Source
HNSW combines navigable small-world graphs with a hierarchical structure, achieving logarithmic search complexity while maintaining high recall.
https://arxiv.org/abs/1603.09320