Hybrid Search
Definition
Hybrid search combines dense vector retrieval (semantic similarity) with sparse retrieval (keyword matching like BM25) to capture both conceptual meaning and exact term matches.
Why It Matters
Pure vector search finds semantically similar content but can miss exact matches such as API names, proper nouns, or acronyms. Pure keyword search finds exact terms but misses paraphrases and related concepts. Hybrid search combines both strengths: it surfaces documents that are conceptually relevant and also contain the important keywords.
How It Works
- Dense Retrieval: Embed the query and find semantically similar documents
- Sparse Retrieval: Use BM25 or TF-IDF to find keyword matches
- Fusion: Combine results using techniques like:
  - Reciprocal Rank Fusion (RRF)
  - Weighted scoring
  - Learned reranking
Several vector databases, including Weaviate, Pinecone, and Qdrant, support hybrid search natively.
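To make the fusion step concrete, here is a minimal Python sketch of two of the techniques listed above: Reciprocal Rank Fusion and weighted scoring. It is an illustration under stated assumptions, not the implementation used by any particular database. The function names, the document IDs, and the sample scores are hypothetical. The constant k=60 is a commonly used RRF default, and min-max normalization is one of several reasonable ways to put dense and sparse scores on a comparable scale before blending them.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of document IDs with RRF.

    Each document scores 1 / (k + rank) per list it appears in
    (rank starts at 1). k=60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def weighted_fusion(dense_scores, sparse_scores, alpha=0.5):
    """Blend min-max normalized scores: alpha * dense + (1 - alpha) * sparse.

    A document missing from one retriever contributes 0 for that retriever.
    """
    def normalize(scores):
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = hi - lo
        # If every score is identical, treat them all as equally strong.
        return {d: (s - lo) / span if span else 1.0 for d, s in scores.items()}

    dense = normalize(dense_scores)
    sparse = normalize(sparse_scores)
    combined = {
        d: alpha * dense.get(d, 0.0) + (1 - alpha) * sparse.get(d, 0.0)
        for d in set(dense) | set(sparse)
    }
    return sorted(combined, key=lambda d: (-combined[d], d))


# Hypothetical retriever outputs for a single query.
dense_ranking = ["doc_a", "doc_b", "doc_c"]    # from vector search
sparse_ranking = ["doc_c", "doc_a", "doc_d"]   # from BM25

dense_scores = {"doc_a": 0.9, "doc_b": 0.8, "doc_c": 0.3}   # e.g. cosine similarity
sparse_scores = {"doc_c": 12.0, "doc_a": 7.0, "doc_d": 2.0}  # e.g. BM25 scores

rrf_result = reciprocal_rank_fusion([dense_ranking, sparse_ranking])
weighted_result = weighted_fusion(dense_scores, sparse_scores, alpha=0.5)
```

RRF only needs rank positions, so it avoids the problem that cosine similarities and BM25 scores live on different scales. Weighted scoring needs a normalization step, but the alpha parameter gives direct control over how much the semantic or the keyword signal dominates.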
When to Use
Hybrid search is recommended for most production RAG systems, queries that mix concepts with specific terms, technical documentation (where exact API names matter), and any domain with important proper nouns or acronyms. Start with hybrid and simplify to pure semantic search only if it performs equally well on your own queries.