What is Parent Document Retriever?

Definition

Parent Document Retriever is a RAG pattern that embeds small chunks for precise retrieval but returns their larger parent documents for generation, balancing retrieval precision with context richness.

Why It Matters

Small chunks retrieve precisely but lack context. Large chunks provide context but retrieve imprecisely. Parent Document Retriever solves this trade-off: use small chunks to find the right location, then expand to the parent document for the full context needed for high-quality generation.

How It Works

1. Indexing: Split each document into small chunks and embed the chunks
2. Store Mapping: Keep a link from each chunk to its parent document
3. Retrieval: A similarity search over the chunk embeddings finds the relevant small chunks
4. Expansion: Follow the links and return the parent documents containing those chunks, without duplicates
5. Generation: The LLM uses the fuller context to produce better answers
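
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not any library's implementation: the class name `ParentDocumentRetriever`, the helper functions, and the sample documents are all hypothetical, and `embed` is a toy bag-of-words stand-in for a real embedding model. Generation is left out; the returned parent texts are what would be passed to the LLM.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep alphanumeric runs only.
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    # Toy stand-in for an embedding model (assumption): word-count vector.
    return Counter(tokenize(text))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def split_into_chunks(text, chunk_size):
    # Fixed-size word windows; real systems often split on sentences or sections.
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


class ParentDocumentRetriever:
    def __init__(self, chunk_size=8):
        self.chunk_size = chunk_size
        self.parents = {}  # parent_id -> full parent text
        self.index = []    # (chunk vector, parent_id): the chunk-to-parent mapping

    def add_document(self, parent_id, text):
        # Indexing + store mapping: embed small chunks, remember their parent.
        self.parents[parent_id] = text
        for chunk in split_into_chunks(text, self.chunk_size):
            self.index.append((embed(chunk), parent_id))

    def retrieve(self, query, top_k_chunks=3):
        # Retrieval: rank small chunks by similarity to the query.
        q = embed(query)
        ranked = sorted(self.index, key=lambda e: cosine(q, e[0]), reverse=True)
        # Expansion: map the best chunks to their parents, skipping duplicates
        # and chunks with no overlap at all.
        parent_ids = []
        for vec, pid in ranked[:top_k_chunks]:
            if cosine(q, vec) > 0 and pid not in parent_ids:
                parent_ids.append(pid)
        return [self.parents[pid] for pid in parent_ids]


docs = {
    "billing": "Invoices are issued on the first day of each month. Refunds are "
               "processed within five business days after approval by the billing team.",
    "security": "Passwords must be rotated every ninety days. Two factor authentication "
                "is required for all administrator accounts in the system.",
}

retriever = ParentDocumentRetriever(chunk_size=8)
for parent_id, text in docs.items():
    retriever.add_document(parent_id, text)

results = retriever.retrieve("How long do refunds take?")
print(results[0])
```

The query matches only one small chunk (the one mentioning refunds), but the whole billing document is returned, so the invoicing and approval details surrounding that chunk are available for generation.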

When to Use

Use Parent Document Retriever when small chunks lose important context, answers need surrounding information, you’re working with structured documents (sections, chapters), or generation quality suffers from fragmented context. Returning parent documents instead of chunks increases the number of tokens sent to the LLM, so balance the added context against context window limits and cost.
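
One simple way to keep the token cost in check is to cap how many parent documents are passed on. The sketch below is an assumption-laden illustration: `fit_to_budget` is a hypothetical helper, and it approximates tokens with a whitespace word count, whereas a real system would use the tokenizer of the target model.

```python
def fit_to_budget(parent_docs, max_tokens):
    # Keep parents in ranked order until the next one would exceed the budget.
    selected, used = [], 0
    for doc in parent_docs:
        cost = len(doc.split())  # crude token estimate (assumption)
        if used + cost > max_tokens:
            break
        selected.append(doc)
        used += cost
    return selected


ranked_parents = ["alpha beta gamma delta", "one two three", "x y z w v"]
kept = fit_to_budget(ranked_parents, max_tokens=8)
print(len(kept))
```

Stopping at the first parent that does not fit preserves the retrieval ranking; skipping it and trying smaller, lower-ranked parents is an alternative design choice.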
