Reranker (Cross-Encoder)
A reranker is a cross-encoder model that scores each query/document pair jointly: the two texts are concatenated into a single input, and the model outputs one relevance score for the pair. It is used as a second pass after retrieval: fetch the top 100 candidates with a cheap method, then rerank them with the cross-encoder and keep the top 10.
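The second pass described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `rerank`, `PairScorer`, and `toy_overlap_scorer` are hypothetical names introduced here, and the toy scorer is only a stand-in for a real cross-encoder so the example runs without downloading a model.

```python
from typing import Callable, List, Sequence, Tuple

# A pair scorer receives (query, document) pairs and returns one score per pair.
# This is the shape of interface a cross-encoder exposes (assumption for this sketch).
PairScorer = Callable[[Sequence[Tuple[str, str]]], Sequence[float]]


def rerank(
    query: str,
    candidates: Sequence[str],
    score_pairs: PairScorer,
    top_k: int = 10,
) -> List[Tuple[str, float]]:
    """Score every (query, candidate) pair and return the top_k, best first."""
    if not candidates:
        return []
    pairs = [(query, doc) for doc in candidates]
    scores = score_pairs(pairs)
    ranked = sorted(zip(candidates, scores), key=lambda item: item[1], reverse=True)
    return [(doc, float(score)) for doc, score in ranked[:top_k]]


def toy_overlap_scorer(pairs: Sequence[Tuple[str, str]]) -> List[float]:
    """Hypothetical stand-in for a cross-encoder.

    Returns the fraction of query tokens that also appear in the document.
    A real reranker would run a transformer over the concatenated pair instead.
    """
    scores = []
    for query, doc in pairs:
        q_tokens = set(query.lower().split())
        d_tokens = set(doc.lower().split())
        scores.append(len(q_tokens & d_tokens) / max(len(q_tokens), 1))
    return scores


if __name__ == "__main__":
    # In practice these would be the ~100 candidates returned by first-stage retrieval.
    retrieved = [
        "cats sleep a lot",
        "how to rerank search results",
        "rerank results with a cross-encoder",
    ]
    for doc, score in rerank("rerank search results", retrieved, toy_overlap_scorer, top_k=2):
        print(f"{score:.2f}  {doc}")
```

To use an actual model, the scorer can be swapped for one backed by a cross-encoder library. With sentence-transformers, for example, `CrossEncoder(model_name).predict(pairs)` takes a list of pairs and returns one score per pair, so it should fit the `PairScorer` shape assumed here.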
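Cross-encoders typically outperform bi-encoder dense retrieval on relevance, often by +5 to +15 NDCG@10 points, because the query and document tokens attend to each other inside the model rather than being embedded separately. The cost is compute: nothing can be precomputed per document, so every candidate needs its own forward pass at query time. Scoring 100 pairs means 100 forward passes, which is far slower than the first-stage index lookup and is why reranking is applied only to a short candidate list.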
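For readers unfamiliar with the metric, NDCG@10 can be computed as in the sketch below. It assumes the linear-gain form of DCG (some evaluations use an exponential gain of 2^rel − 1 instead) and, as a simplification, derives the ideal ordering from the graded labels of the ranked list itself. The function names are illustrative.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the first k results (linear gain)."""
    # rank is 0-based, so the discount is log2(rank + 2): 1 for the top result.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """DCG of the given ordering divided by the DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    # Graded relevance labels in the order each system ranked the documents.
    first_stage_order = [0, 1, 0, 2]
    reranked_order = [2, 1, 0, 0]
    print(ndcg_at_k(first_stage_order), ndcg_at_k(reranked_order))
```

Scores fall between 0 and 1. A gain quoted as "+5 to +15" is conventionally read as points on a 0 to 100 scale, that is, a difference of 0.05 to 0.15 in this function's output.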
Common rerankers: BGE-reranker-v2-m3 and Jina-Reranker-v2 (open weights), and Cohere Rerank (hosted API). For local RAG, the BGE m3 family is the standard pick.
Reviewed by Fredoline Eruo. See our editorial policy.