I found that LangChain documents a pattern where you create a custom wrapper that copies similarity scores into document metadata, but this feels wrong to me. Why should transient similarity scores become permanent metadata? And what is the actual difference between calling invoke() on a retriever versus similarity_search_with_score() on the vector store?
I’m using LangChain 0.2. I’d also like to know how to get scores when working with BM25Retriever and EnsembleRetriever from the langchain.retrievers module.
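To make the pattern concrete, here is roughly what that wrapper looks like, sketched with a plain-Python stand-in for the vector store (the class and helper names below are illustrative, not LangChain's actual API):

```python
# Sketch of the score-in-metadata wrapper pattern. FakeVectorStore and
# retrieve_with_scores are illustrative stand-ins, not LangChain classes.

class FakeVectorStore:
    """Stand-in: returns (doc, score) pairs like similarity_search_with_score()."""
    def similarity_search_with_score(self, query, k=4):
        docs = [
            {"page_content": "first result", "metadata": {}},
            {"page_content": "second result", "metadata": {}},
        ]
        # Fixed descending scores instead of a real similarity computation.
        return [(doc, 0.9 - 0.1 * i) for i, doc in enumerate(docs)][:k]

def retrieve_with_scores(store, query, k=4):
    """The wrapper pattern: copy each score into its document's metadata."""
    results = []
    for doc, score in store.similarity_search_with_score(query, k=k):
        doc["metadata"]["score"] = score  # the score becomes "permanent" metadata
        results.append(doc)
    return results

docs = retrieve_with_scores(FakeVectorStore(), "any query")
print(docs[0]["metadata"]["score"])  # 0.9
```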
Here’s the deal: invoke() works with every retriever type in LangChain, but it returns plain Document objects without scores. similarity_search_with_score() is specific to vector stores and returns (document, score) tuples. For a FAISS setup, just call db.similarity_search_with_score(user_query, k=4) directly; skipping the retriever interface gives you the raw scores. (Note that FAISS returns distances by default, so lower means more similar.)

BM25Retriever and EnsembleRetriever don’t expose similarity scores through their standard interface: BM25 uses a different scoring scheme entirely, and EnsembleRetriever fuses rankings from multiple retrievers, so there is no single comparable score to return. You’d have to dig into the underlying components or write custom scoring.

The metadata approach exists for apps that need to persist scores for later processing, but I agree it’s clunky for quick similarity checks. Just use the direct vector store methods when you need scores right away.
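To illustrate the interface difference with a minimal, self-contained sketch: FakeFAISS below is a stand-in for a real FAISS vector store (which would need an embedding model and the faiss package), and the fixed distance values are made up for the example.

```python
# Contrast of the two interfaces. FakeFAISS is an illustrative stand-in,
# not langchain_community.vectorstores.FAISS.

class FakeFAISS:
    def __init__(self, docs):
        self.docs = docs

    def similarity_search(self, query, k=4):
        # What a retriever's invoke() boils down to: documents only, no scores.
        return self.docs[:k]

    def similarity_search_with_score(self, query, k=4):
        # Real FAISS returns distances (lower = closer); fixed values here.
        fake_distances = [0.12, 0.34, 0.56, 0.78]
        return list(zip(self.docs[:k], fake_distances))

db = FakeFAISS(["doc-a", "doc-b", "doc-c", "doc-d"])

plain = db.similarity_search("query", k=4)              # documents only
scored = db.similarity_search_with_score("query", k=4)  # (doc, score) tuples

print(plain[0])   # doc-a
print(scored[0])  # ('doc-a', 0.12)
```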
You can also set search_type="similarity_score_threshold" when creating the retriever, e.g. db.as_retriever(search_type="similarity_score_threshold", search_kwargs={"score_threshold": 0.5, "k": 4}). That filters results by score, but you still can’t see the actual values in the output. If you really need those scores, try accessing the vector store directly with retriever.vectorstore.similarity_search_with_score().
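For what it’s worth, the threshold filtering is conceptually just this (a plain-Python sketch of the behavior with hypothetical input data, not LangChain’s actual implementation):

```python
# Sketch of what search_type="similarity_score_threshold" does internally:
# keep only hits whose relevance score meets the threshold, then truncate to k.
# Note the retriever hands back only the documents - the scores are dropped.

def similarity_score_threshold_search(scored_results, score_threshold, k):
    kept = [(doc, s) for doc, s in scored_results if s >= score_threshold]
    return [doc for doc, _ in kept[:k]]

# Hypothetical (document, relevance score) pairs, higher = more relevant.
hits = [("doc-a", 0.91), ("doc-b", 0.62), ("doc-c", 0.41), ("doc-d", 0.12)]

print(similarity_score_threshold_search(hits, score_threshold=0.5, k=4))
# -> ['doc-a', 'doc-b']
```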