
Vector Database

Definition

A vector database is a specialized database optimized for storing and searching high-dimensional embedding vectors, using algorithms like HNSW to find similar items in milliseconds across millions of vectors.

Why It Matters

Traditional databases excel at exact matches: finding the user whose email equals "x@y.com" is instant. But semantic similarity ("find documents about pricing strategy") requires comparing the query against every stored item. With millions of vectors, brute-force search is too slow.

Vector databases solve this with approximate nearest neighbor (ANN) algorithms. They trade perfect accuracy for dramatic speed, finding the top 10 most similar items in milliseconds instead of minutes. The "approximate" part matters less than you'd think: a well-tuned ANN index typically retrieves 95-99% of the true nearest neighbors.
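The gap ANN closes is easiest to see against the brute-force baseline. The sketch below (assuming NumPy; the corpus is synthetic random data) implements exact cosine-similarity search, the O(n × d)-per-query computation that ANN indexes exist to avoid:

```python
import numpy as np

def exact_top_k(query, vectors, k=10):
    """Brute-force cosine-similarity search: the exact baseline that
    ANN indexes approximate. Cost is O(n * d) per query."""
    # Normalize so a dot product equals cosine similarity
    q = query / np.linalg.norm(query)
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    scores = v @ q
    top = np.argsort(-scores)[:k]  # indices of the k highest scores
    return top, scores[top]

# Toy corpus: 1000 random 64-dim "embeddings"
rng = np.random.default_rng(0)
corpus = rng.normal(size=(1000, 64))

# Querying with a stored vector should rank that vector first (sim ~1.0)
idx, sims = exact_top_k(corpus[42], corpus, k=3)
```

At a million vectors this scan becomes the bottleneck, which is exactly the point where an ANN index (HNSW, IVF, etc.) pays for itself.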

For AI engineers building RAG systems, a vector database is essential infrastructure. It’s where your embeddings live and how your retrieval pipeline finds relevant context.

Implementation Basics

Popular Options

  • Pinecone: Managed service, easy to start, scales well
  • Weaviate: Open-source with modules for ML models
  • Chroma: Lightweight, embedded option great for development
  • pgvector: PostgreSQL extension, use your existing database
  • Qdrant: Open-source with good performance characteristics

Key Concepts

  • Index: The data structure (usually HNSW) that enables fast search
  • Metadata filtering: Narrow search to specific categories before vector similarity
  • Namespaces/collections: Logical separation of different data types
  • Dimensionality: Must match your embedding model (e.g., 1536 for OpenAI's text-embedding-3-small)
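Metadata filtering is worth a concrete look. The sketch below (plain NumPy, toy data; a real vector database pushes the filter into the index itself rather than scanning) narrows the candidate set by metadata first, then ranks only the survivors by cosine similarity:

```python
import numpy as np

# A toy "collection": vectors plus metadata stored side by side
rng = np.random.default_rng(1)
vectors = rng.normal(size=(500, 8))
metadata = [{"category": "pricing" if i % 5 == 0 else "other"}
            for i in range(500)]

def filtered_search(query, vectors, metadata, where, k=5):
    # 1. Metadata filter narrows the candidate set
    candidates = [i for i, m in enumerate(metadata)
                  if all(m.get(field) == value for field, value in where.items())]
    # 2. Cosine similarity is computed only over the candidates
    sub = vectors[candidates]
    sub = sub / np.linalg.norm(sub, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    order = np.argsort(-(sub @ q))[:k]
    return [candidates[i] for i in order]

hits = filtered_search(vectors[0], vectors, metadata,
                       where={"category": "pricing"})
```

Filtering before the similarity step is why storing metadata alongside each vector matters: without it, you would have to over-fetch results and discard mismatches after the fact.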

Architecture Decisions

For prototypes, Chroma or pgvector keeps things simple. For production at scale, dedicated vector databases (Pinecone, Qdrant, Weaviate) handle the operational complexity better. The choice depends on scale, existing infrastructure, and whether you need hybrid search (combining vector + keyword).

Practical Tips

Always store metadata alongside vectors. You'll need it for filtering and displaying results. Plan for reindexing when you change embedding models. Monitor query latency as data grows; you may need to adjust index parameters.
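One way to act on the monitoring tip is to track recall@k: periodically compare the approximate index's answers against a brute-force search on a sample of queries. The sketch below (NumPy only; the "approximate index" is simulated by searching half the corpus, an assumption purely for illustration) shows the measurement itself:

```python
import numpy as np

def top_k(query, vectors, k):
    """Exact cosine-similarity top-k, used as the ground truth."""
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    scores = v @ (query / np.linalg.norm(query))
    return set(np.argsort(-scores)[:k])

def recall_at_k(exact, approx):
    # Fraction of the true neighbors the approximate index returned
    return len(exact & approx) / len(exact)

rng = np.random.default_rng(2)
vectors = rng.normal(size=(2000, 32))
query = rng.normal(size=32)

exact = top_k(query, vectors, k=10)

# Simulated approximate index: search a random 50% sample of the corpus
sample = rng.choice(2000, size=1000, replace=False)
local = top_k(query, vectors[sample], k=10)
approx = {int(sample[i]) for i in local}

r = recall_at_k(exact, approx)
```

If recall drifts down as the collection grows, that is the signal to revisit index parameters (for HNSW, typically the search-time candidate-list size).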

Source

HNSW (Hierarchical Navigable Small World) graphs enable efficient approximate nearest neighbor search with logarithmic complexity.

https://arxiv.org/abs/1603.09320