Reranking

Process retrieval published

Also known as: Reranker, Cross-Encoder Reranking, Second-Stage Ranking

Definition

A second-stage ranking process that takes initial retrieval results and reorders them using a more sophisticated (and more expensive) relevance model. Rerankers typically use cross-encoder models that process the query and each candidate document together, which usually yields better relevance judgments than embedding similarity alone. The trade-off: because the model runs once per query-document pair, reranking is slower than first-stage retrieval, so it is applied only to a short candidate list.
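The difference between scoring independently computed vectors and scoring the query and document jointly can be sketched with toy stand-ins. Nothing below is a real model: `embed`, `embedding_similarity`, and `joint_score` are hypothetical names, the bag-of-words "embedding" is a deliberate simplification, and the bigram matcher only imitates the fact that a cross-encoder sees both texts at once.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector, built from one text in isolation."""
    return Counter(text.lower().split())


def embedding_similarity(query_vec: Counter, doc_vec: Counter) -> float:
    """Cosine similarity between two independently computed vectors."""
    dot = sum(query_vec[t] * doc_vec[t] for t in query_vec)
    norm = math.sqrt(sum(v * v for v in query_vec.values())) * math.sqrt(
        sum(v * v for v in doc_vec.values())
    )
    return dot / norm if norm else 0.0


def joint_score(query: str, doc: str) -> float:
    """Toy stand-in for a cross-encoder (assumption, not a real model).

    It receives the query and document together, so it can reward query
    word pairs that appear adjacent and in the same order in the document.
    """
    q = query.lower().split()
    d = doc.lower().split()
    doc_bigrams = set(zip(d, d[1:]))
    query_bigrams = list(zip(q, q[1:]))
    if not query_bigrams:
        return 0.0
    return sum(b in doc_bigrams for b in query_bigrams) / len(query_bigrams)


query = "dog bites man"
docs = ["man bites dog", "dog bites man"]

query_vec = embed(query)
for doc in docs:
    # Both documents get the same embedding similarity (same words),
    # but only the second one matches the query's word order.
    print(doc, embedding_similarity(query_vec, embed(doc)), joint_score(query, doc))
```

In this toy setup the two documents are indistinguishable to the vector comparison, while the joint scorer separates them. Real embedding models do capture some word order, so the gap in practice is a matter of degree rather than kind.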

What this is NOT

  • Not the first-stage retrieval (reranking operates on already-retrieved results)
  • Not filtering (reranking reorders; filtering removes)
  • Not the same as embedding similarity (rerankers typically score the query and document together in one model pass, rather than comparing vectors computed separately)
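The reorder-versus-filter distinction above can be made concrete with a small sketch. The function names, the score table, and the 0.3 threshold are all hypothetical and chosen only for illustration.

```python
from typing import Callable


def rerank(docs: list[str], score: Callable[[str], float]) -> list[str]:
    """Reorder: the same documents come back, highest score first."""
    return sorted(docs, key=score, reverse=True)


def filter_docs(docs: list[str], keep: Callable[[str], bool]) -> list[str]:
    """Filter: the original order is kept, but documents may be dropped."""
    return [d for d in docs if keep(d)]


docs = ["intro to ranking", "reranking with cross-encoders", "changelog"]

# Hypothetical relevance scores, standing in for a reranker's output.
scores = {
    "intro to ranking": 0.4,
    "reranking with cross-encoders": 0.9,
    "changelog": 0.1,
}

reranked = rerank(docs, score=lambda d: scores[d])
filtered = filter_docs(docs, keep=lambda d: scores[d] >= 0.3)

print(reranked)  # all three documents, in a new order
print(filtered)  # two documents, in their original order
```

Pipelines often truncate to a top-k after reranking; that cut-off is a separate step applied to the reordered list, not part of the reordering itself.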

Alternative Interpretations

Different communities use this term differently:

llm-practitioners

Using a cross-encoder model (Cohere Rerank, bge-reranker, etc.) to rescore and reorder documents retrieved by a fast first-stage retriever. Typically applied to top-k results (e.g., retrieve 50, rerank to top 5).
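A minimal sketch of the retrieve-then-rerank pattern described above, assuming two pluggable scoring functions. `retrieve_then_rerank`, `token_overlap`, and `phrase_aware` are hypothetical names; the toy scorers stand in for a real first-stage retriever and a real cross-encoder, neither of which is called here.

```python
from typing import Callable, Sequence

# A scorer maps (query, document) to a relevance score; higher is better.
Scorer = Callable[[str, str], float]


def retrieve_then_rerank(
    query: str,
    corpus: Sequence[str],
    cheap_score: Scorer,
    expensive_score: Scorer,
    retrieve_k: int = 50,
    final_k: int = 5,
) -> list[str]:
    """Score the whole corpus cheaply, then rescore only the top candidates."""
    # Stage 1: fast scorer over every document, keep the top retrieve_k.
    candidates = sorted(
        corpus, key=lambda d: cheap_score(query, d), reverse=True
    )[:retrieve_k]
    # Stage 2: expensive scorer over the candidates only, keep the top final_k.
    reranked = sorted(
        candidates, key=lambda d: expensive_score(query, d), reverse=True
    )
    return reranked[:final_k]


def token_overlap(query: str, doc: str) -> float:
    """Toy first-stage scorer: number of distinct shared tokens."""
    return float(len(set(query.lower().split()) & set(doc.lower().split())))


def phrase_aware(query: str, doc: str) -> float:
    """Toy second-stage scorer: token overlap plus a bonus for an exact phrase match."""
    bonus = 10.0 if query.lower() in doc.lower() else 0.0
    return token_overlap(query, doc) + bonus


corpus = [
    "reset your password from the login page",
    "password reset email not arriving",
    "how to reset a forgotten password",
    "billing and invoices",
]

top = retrieve_then_rerank(
    "password reset", corpus, token_overlap, phrase_aware, retrieve_k=3, final_k=1
)
print(top)
```

The expensive scorer is only ever called on `retrieve_k` documents, which is what keeps the second stage affordable as the corpus grows. With a hosted or local reranker, `expensive_score` would wrap that model's scoring call instead of the toy function.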

Sources: Cohere Rerank documentation, bge-reranker, cross-encoder models, LlamaIndex reranking documentation

information-retrieval

The second stage of a two-stage retrieval pipeline where a lightweight first stage (BM25, bi-encoder) retrieves candidates, and a heavyweight second stage (cross-encoder, learned ranker) reranks them.

Sources: Learning to rank literature, MS MARCO retrieval benchmarks

Examples

  • Retrieve 100 documents with vector search, rerank to top 10 with Cohere Rerank
  • Using bge-reranker-large to improve legal document retrieval
  • Cross-encoder reranking for question-answering over documentation

Counterexamples

Things that might seem like Reranking but are not:

  • Vector similarity ranking alone (that's first-stage, not reranking)
  • Filtering by metadata (that's filtering, not ranking)
  • LLM-based relevance scoring (it can be used to rerank, but the term usually refers to evaluating retrieval quality)

Relations

Implementations

Tools and frameworks that implement this concept: