Dense Retrieval
Definition
Dense retrieval uses learned embeddings (dense vectors) to represent queries and documents, finding relevant results through vector similarity rather than keyword matching.
Why It Matters
Dense retrieval is the “semantic” in semantic search. It’s what enables your RAG system to find documents about “automobile maintenance” when the user asks about “car repairs.” The embedding model learns that these concepts are related and places them near each other in vector space.
Traditional keyword search (sparse retrieval) requires exact or near-exact word overlap. Dense retrieval breaks this limitation by comparing meaning rather than words. This is why modern search and retrieval systems have shifted toward dense approaches.
For AI engineers, dense retrieval is fundamental. Whether you’re building a customer support bot, a code search tool, or a documentation assistant, dense retrieval powers the ability to find semantically relevant content.
Implementation Basics
Dense retrieval systems have two core components:
1. Bi-Encoder Architecture: A neural network (typically a transformer) converts text into fixed-size vectors. Documents are encoded once at index time; queries are encoded at search time. The bi-encoder approach enables fast retrieval because you’re just comparing pre-computed vectors.
2. Vector Similarity Search: Query and document embeddings are compared using cosine similarity or dot product. Documents with the highest similarity scores are returned as results.
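The second step can be sketched in a few lines of plain Python. This is a toy illustration, not a production retriever: the three-dimensional vectors and document names below are made up, and real embedding models produce vectors with hundreds or thousands of dimensions.

```python
import math

def cosine(a, b):
    # cosine similarity = dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, doc_vecs, k=2):
    # Rank pre-computed document vectors by similarity to the query vector.
    scored = sorted(doc_vecs.items(),
                    key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

# Toy 3-dimensional "embeddings" (illustrative values only).
docs = {
    "car_repair": [0.9, 0.1, 0.0],
    "auto_maint": [0.8, 0.2, 0.1],
    "cooking":    [0.0, 0.1, 0.9],
}
query = [0.85, 0.15, 0.05]  # e.g. an embedding of "automobile maintenance"
print(retrieve(query, docs))  # → ['car_repair', 'auto_maint']
```

Note that because documents are encoded ahead of time, query-time work reduces to this similarity ranking.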
Key implementation choices:
Embedding models:
- OpenAI text-embedding-3-small/large: High quality, hosted API
- Cohere embed-v3: Strong multilingual support
- BGE, E5, GTE: Open-source alternatives for self-hosting
- Sentence-Transformers: Popular open-source library for running models like these locally
Vector databases: Store and search embeddings at scale. Options include Pinecone, Weaviate, Qdrant, Chroma, pgvector.
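The interface a vector database exposes is essentially "upsert embeddings, then search by vector." A minimal in-memory stand-in (exact brute-force search; the class and method names here are hypothetical, and real systems use approximate indexes such as HNSW to scale) might look like:

```python
class TinyVectorIndex:
    # Exact-search stand-in for a vector database, for illustration only.
    def __init__(self):
        self._vecs = {}  # doc_id -> embedding

    def upsert(self, doc_id, vec):
        # Index time: store the pre-computed document embedding.
        self._vecs[doc_id] = vec

    def search(self, query_vec, k=3):
        # Query time: dot-product scoring, best first.
        scored = [(d, sum(q * v for q, v in zip(query_vec, vec)))
                  for d, vec in self._vecs.items()]
        return sorted(scored, key=lambda t: t[1], reverse=True)[:k]

idx = TinyVectorIndex()
idx.upsert("a", [1.0, 0.0])
idx.upsert("b", [0.0, 1.0])
print(idx.search([0.9, 0.1], k=1))  # the query vector is closest to "a"
```

Hosted and open-source vector databases add persistence, metadata filtering, and approximate nearest-neighbor search on top of this basic shape.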
Chunking strategy: Documents must be split into chunks that fit the embedding model’s context window (often 512 tokens for open-source models, though some hosted APIs accept 8k or more) and contain coherent semantic units.
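A minimal sliding-window chunker, assuming word-count limits as a proxy for tokens (real pipelines count model tokens with a tokenizer and often split on sentence or paragraph boundaries instead):

```python
def chunk_words(text, max_words=100, overlap=20):
    # Split text into overlapping word-count windows; the overlap keeps
    # context that would otherwise be cut at a chunk boundary.
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break
    return chunks

doc = " ".join(f"w{i}" for i in range(250))  # synthetic 250-word document
print(len(chunk_words(doc, max_words=100, overlap=20)))  # → 3
```

Each chunk is then embedded and indexed individually, so chunk boundaries directly shape what the retriever can find.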
Tradeoffs: Dense retrieval requires computational resources for embedding generation and vector storage. Retrieval quality also depends on the embedding model’s training data: it performs best on language patterns similar to those it saw during training, so for very specialized domains you may need to fine-tune the embedding model.
The combination of dense retrieval for semantic matching and sparse retrieval for keyword precision (hybrid search) often produces the best results.
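One common way to merge dense and sparse result lists is Reciprocal Rank Fusion (RRF), which combines rankings without having to calibrate their incomparable score scales. A sketch, with made-up document IDs:

```python
def rrf(rankings, k=60):
    # Reciprocal Rank Fusion: each result list contributes 1/(k + rank)
    # per document; documents ranked well by both lists score highest.
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

dense_hits  = ["doc3", "doc1", "doc7"]   # semantic ranking
sparse_hits = ["doc1", "doc9", "doc3"]   # BM25/keyword ranking
print(rrf([dense_hits, sparse_hits]))    # → ['doc1', 'doc3', 'doc9', 'doc7']
```

Here "doc1" wins because it appears near the top of both lists, which is exactly the behavior hybrid search is after.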
Source
Dense Passage Retrieval (DPR) demonstrated that dense representations learned from question-answer pairs significantly outperform BM25 for open-domain question answering.
https://arxiv.org/abs/2004.04906