How to control and filter search agent results in LangChain Python

I’m still learning LangChain and trying to build an application that leverages search agents for web information retrieval. The main issue I’m facing is that the search results often contain irrelevant or unhelpful sources when I need specific references.

I’ve been using search agents but they don’t seem to give me the control I need over what gets searched and returned. I want to be able to guide or filter the search process to get more targeted results.

I’ve looked into custom tools but haven’t found a solution that works for my needs. What’s the best approach to implement better search result management in Python? Are there specific parameters or methods I should be using to improve the relevance of the returned information?

Had the same problem building a research tool last year. What worked for me was ditching the default search settings and using a two-stage filter instead.

First, I preprocessed queries by adding domain-specific keywords and using query expansion, which cut down irrelevant results big time. Then I added a post-processing filter with embeddings to check relevance against my criteria, with a similarity threshold to dump the junk. I also tweaked the search tool config to limit sources per query and ran multiple targeted searches with better keywords. Rough sketches of both stages are below.

This gave me way more control without losing flexibility. Have you tried adjusting the max_results parameter or adding custom validation before the agent finishes processing?
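
Here's roughly what the first stage looked like. Just a sketch, not my exact code: it assumes the Tavily tool from langchain_community (any search tool that lets you cap results works), and `DOMAIN_KEYWORDS`, the query variants, and the `max_results` value are placeholders you'd tune for your own domain.

```python
# Stage 1 (sketch): expand the raw query with domain keywords and cap how
# many sources the search tool returns per query.
# Assumes TavilySearchResults from langchain_community (needs TAVILY_API_KEY);
# DOMAIN_KEYWORDS and max_results are placeholder values to tune.
from langchain_community.tools.tavily_search import TavilySearchResults

DOMAIN_KEYWORDS = ["documentation", "peer-reviewed", "site:arxiv.org"]  # placeholders

def expand_query(query: str) -> list[str]:
    """Turn one raw query into a few targeted variants."""
    return [query] + [f"{query} {kw}" for kw in DOMAIN_KEYWORDS]

# max_results limits how many sources come back for each variant
search_tool = TavilySearchResults(max_results=3)

def run_targeted_searches(query: str) -> list[dict]:
    """Run every expanded variant and collect the raw result dicts."""
    results: list[dict] = []
    for variant in expand_query(query):
        results.extend(search_tool.invoke(variant))
    return results
```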
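
And the second stage, the embedding-based post-filter. Again a sketch under assumptions: it uses OpenAIEmbeddings and expects results shaped like Tavily's list-of-dicts output (with a "content" field); the 0.75 threshold and the criteria string are made up, so tune them against your own data.

```python
# Stage 2 (sketch): score each result snippet against your relevance criteria
# with embeddings and drop anything below a similarity threshold.
# Assumes OpenAIEmbeddings (needs OPENAI_API_KEY) and Tavily-style result dicts.
import numpy as np
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()

def cosine_similarity(a: list[float], b: list[float]) -> float:
    a_vec, b_vec = np.array(a), np.array(b)
    return float(np.dot(a_vec, b_vec) / (np.linalg.norm(a_vec) * np.linalg.norm(b_vec)))

def filter_results(results: list[dict], criteria: str, threshold: float = 0.75) -> list[dict]:
    """Keep only results whose snippet is similar enough to the criteria text."""
    criteria_vec = embeddings.embed_query(criteria)
    snippets = [r.get("content", "") for r in results]
    vectors = embeddings.embed_documents(snippets)
    return [
        result for result, vec in zip(results, vectors)
        if cosine_similarity(vec, criteria_vec) >= threshold
    ]

# Usage (hypothetical criteria string): hand the filtered list to your agent
# instead of the raw search output.
# relevant = filter_results(
#     run_targeted_searches("vector database indexing"),
#     criteria="technical references on vector database index structures",
# )
```

The nice part of doing the filter outside the agent is that you can log the similarity scores and adjust the threshold or keywords without touching the agent itself.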