Hybrid Search

Process retrieval published

Also known as: Hybrid Retrieval, Sparse-Dense Retrieval

Definition

A retrieval approach that combines multiple search strategies, typically keyword-based (sparse, e.g. BM25) and embedding-based (dense vector) search, to get the benefits of both. Keyword search excels at exact matches and rare terms; vector search excels at semantic similarity. Hybrid search merges the two result lists, often using reciprocal rank fusion (RRF) or a learned combination.
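
To make the fusion step concrete, here is a minimal sketch of reciprocal rank fusion in Python. It assumes each retriever returns a best-first list of document IDs; the function name, the sample IDs, and the choice of k=60 (a commonly used default) are illustrative assumptions, not any specific product's implementation.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several best-first lists of document IDs with RRF.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    where rank starts at 1. k is a smoothing constant (assumed default: 60).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a keyword index and a vector index.
bm25_hits = ["d3", "d1", "d7"]
vector_hits = ["d1", "d5", "d3"]

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)  # ['d1', 'd3', 'd5', 'd7']
```

Because RRF uses only ranks, it needs no score normalization, which is convenient when BM25 scores and vector similarities live on different scales.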

What this is NOT

Not just vector search (hybrid specifically combines multiple strategies)
Not an ensemble of multiple vector indexes (hybrid combines sparse and dense)
Not reranking (reranking is a second stage; hybrid is parallel retrieval)
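
The distinction between hybrid retrieval and reranking can be sketched as two separate stages. The example below is a toy illustration under stated assumptions: the two retrievers and the scoring function are hypothetical stubs returning fixed values, standing in for a BM25 index, a vector index, and a reranking model.

```python
def keyword_retriever(query):
    # Hypothetical stub standing in for a BM25/keyword index.
    return ["doc_a", "doc_b"]


def vector_retriever(query):
    # Hypothetical stub standing in for a vector index.
    return ["doc_b", "doc_c"]


def rerank_score(query, doc_id):
    # Hypothetical stub standing in for a reranking model's relevance score.
    return {"doc_a": 0.2, "doc_b": 0.9, "doc_c": 0.6}[doc_id]


def hybrid_then_rerank(query, top_k=3):
    # Stage 1 (hybrid): both retrievers answer the same query independently,
    # and their results are merged into one de-duplicated candidate pool.
    merged = keyword_retriever(query) + vector_retriever(query)
    candidates = list(dict.fromkeys(merged))
    # Stage 2 (reranking): a separate scorer reorders the candidate pool.
    ranked = sorted(candidates, key=lambda d: rerank_score(query, d), reverse=True)
    return ranked[:top_k]


result = hybrid_then_rerank("example query")
print(result)  # ['doc_b', 'doc_c', 'doc_a']
```

Stage 1 alone is the hybrid part; stage 2 is optional and operates only on what stage 1 returned.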

Alternative Interpretations

Different communities use this term differently:

llm-practitioners

Running both BM25/keyword search and vector similarity search, then combining results using a fusion algorithm. Several vector databases, including Weaviate, Pinecone, and Qdrant, document native hybrid search support.

Sources: Weaviate hybrid search documentation, Pinecone hybrid search, Qdrant hybrid retrieval

information-retrieval

Combining sparse retrieval (inverted index, TF-IDF, BM25) with dense retrieval (learned embeddings) to improve recall and precision across diverse query types.

Sources: Dense Passage Retrieval paper, Hybrid retrieval benchmarks

Examples

Weaviate query with alpha=0.5 balancing BM25 and vector scores
Elasticsearch combining kNN search with full-text search
Pinecone with sparse-dense vectors in a single query
Fusing Elasticsearch and Qdrant results with RRF
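
The alpha-style balancing in the first example can be sketched as a weighted sum of normalized scores. This is an illustrative sketch, not Weaviate's actual implementation: the min-max normalization, the function names, and the sample scores are assumptions. It follows the convention that alpha=1 means pure vector search and alpha=0 means pure keyword search.

```python
def min_max_normalize(scores):
    """Rescale a {doc_id: score} mapping to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every document as equally relevant.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def weighted_fusion(keyword_scores, vector_scores, alpha=0.5):
    """Combine normalized scores: alpha weights vector, (1 - alpha) keyword."""
    kw = min_max_normalize(keyword_scores)
    vec = min_max_normalize(vector_scores)
    fused = {
        doc_id: alpha * vec.get(doc_id, 0.0) + (1 - alpha) * kw.get(doc_id, 0.0)
        for doc_id in set(kw) | set(vec)
    }
    # Highest fused score first; ties are broken by document ID.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical raw scores on different scales.
keyword_scores = {"d1": 10.0, "d2": 2.0, "d3": 6.0}
vector_scores = {"d2": 0.75, "d3": 1.0, "d4": 0.5}

ranked = weighted_fusion(keyword_scores, vector_scores, alpha=0.5)
print(ranked)  # [('d3', 0.75), ('d1', 0.5), ('d2', 0.25), ('d4', 0.0)]
```

Unlike rank-based fusion, this approach depends on the raw scores, so the normalization choice affects the final order.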

Counterexamples

Things that might seem like Hybrid Search but are not:

Pure vector similarity search
Pure BM25/keyword search
Querying multiple vector indexes (that's an ensemble, not hybrid)

Relations

overlapsWith vector-search (Vector search is one component of hybrid)
overlapsWith retrieval-augmented-generation (Hybrid search improves RAG retrieval)
overlapsWith retriever (Hybrid retrievers implement this pattern)
overlapsWith reranking (Hybrid results are often reranked)

Implementations

Tools and frameworks that implement this concept:

Elasticsearch primary
Haystack secondary
Milvus primary
pgvector primary
Pinecone primary
Qdrant primary
Weaviate primary