RAG

Hybrid Search

Definition

Hybrid search combines dense vector retrieval (semantic similarity) with sparse retrieval (keyword matching, such as BM25) so that results reflect both conceptual meaning and exact term matches.

Why It Matters

Pure vector search finds semantically similar content but can miss exact matches, such as product codes, error strings, or rare identifiers. Pure keyword search finds exact terms but misses paraphrases and other semantic relationships. Hybrid search combines both strengths: it surfaces documents that are conceptually relevant and that also contain the important keywords.

How It Works

  1. Dense Retrieval: Embed the query and find semantically similar documents
  2. Sparse Retrieval: Use BM25 or TF-IDF to find keyword matches
  3. Fusion: Combine results using techniques like:
    • Reciprocal Rank Fusion (RRF)
    • Weighted scoring
    • Learned reranking
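
As a rough illustration of the fusion step, here is a minimal sketch of Reciprocal Rank Fusion. It assumes each retriever returns an ordered list of document IDs, best first. The function name, the sample IDs, and the smoothing constant k = 60 (a commonly used default) are illustrative assumptions, not part of any specific database's API.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document receives 1 / (k + rank) from every list it appears in,
    where rank starts at 1. Documents missing from a list get nothing from it.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from the two retrievers (best match first).
dense_hits = ["doc_a", "doc_b", "doc_c"]
sparse_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)  # ['doc_a', 'doc_c', 'doc_b', 'doc_d']
```

Because RRF uses only rank positions, it needs no score normalization, which is convenient when dense similarities and BM25 scores live on very different scales.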

Many vector databases, including Weaviate, Pinecone, and Qdrant, support hybrid search natively.
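
Where fusion has to be done in application code instead, the weighted-scoring option can be sketched as follows. This assumes each retriever returns a mapping from document ID to raw score, that min-max normalization is an acceptable way to make the two scales comparable, and that a single weight `alpha` balances dense against sparse. All names and sample scores are hypothetical.

```python
def min_max_normalize(scores):
    """Rescale raw scores to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal, so treat every document as equally strong.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def weighted_fusion(dense_scores, sparse_scores, alpha=0.5):
    """Combine normalized scores as alpha * dense + (1 - alpha) * sparse.

    A document missing from one retriever contributes 0 for that retriever.
    Returns (doc_id, fused_score) pairs, best first.
    """
    dense = min_max_normalize(dense_scores)
    sparse = min_max_normalize(sparse_scores)
    fused = {
        doc_id: alpha * dense.get(doc_id, 0.0) + (1 - alpha) * sparse.get(doc_id, 0.0)
        for doc_id in set(dense) | set(sparse)
    }
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical raw scores: cosine similarities and BM25 scores.
dense_scores = {"doc_a": 0.9, "doc_b": 0.7, "doc_c": 0.5}
sparse_scores = {"doc_c": 12.0, "doc_a": 8.0, "doc_d": 4.0}

ranked = weighted_fusion(dense_scores, sparse_scores, alpha=0.5)
print([doc_id for doc_id, _ in ranked])
```

Raising `alpha` favors semantic matches and lowering it favors keyword matches. A suitable value depends on the corpus and is usually chosen by evaluating on representative queries.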

When to Use

Hybrid search is recommended for most production RAG systems, for queries that mix concepts with specific terms, for technical documentation (where exact API names matter), and for any domain with important proper nouns or acronyms. Start with hybrid and drop the keyword component only if evaluation on your own queries shows that pure semantic search works equally well.