Rethinking Agentic Search with Pi-Serini: Is Lexical Retrieval Sufficient?
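This study examines how effective a lexical retriever is inside a large language model (LLM) agent loop and introduces Pi-Serini, a search agent equipped with tools for retrieving, browsing, and reading documents. Pairing a well-configured BM25 with frontier LLMs (such as gpt-5.5), experiments on BrowseComp-Plus show that the approach supports deep research, achieving 83.1% answer accuracy and 94.7% surfaced evidence recall and outperforming search agents that use dense retrievers. Ablations show that BM25 tuning improves answer accuracy by 18.0% and surfaced evidence recall by 11.1% over the default setting, and that increasing retrieval depth further improves surfaced evidence recall by 25.3% over shallow retrieval. The source code is publicly available.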
Does a lexical retriever suffice as large language models (LLMs) become more capable in an agentic loop? This question naturally arises when building deep research systems. We revisit it by pairing BM25 with frontier LLMs that have better reasoning and tool-use abilities. To support researchers asking the same question, we introduce Pi-Serini, a search agent equipped with three tools for retrieving, browsing, and reading documents. Our results show that, on BrowseComp-Plus, a well-configured lexical retriever with sufficient retrieval depth can support effective deep research when paired with more capable LLMs. Specifically, Pi-Serini with gpt-5.5 achieves 83.1% answer accuracy and 94.7% surfaced evidence recall, outperforming released search agents that use dense retrievers. Controlled ablations further show that BM25 tuning improves answer accuracy by 18.0% and surfaced evidence recall by 11.1% over the default BM25 setting, while increasing retrieval depth further improves surfaced evidence recall by 25.3% over the shallow-retrieval setting. Source code is available at https://github.com/justram/pi-serini.
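To make the two ablated knobs concrete, the following is a minimal, self-contained sketch of BM25 ranking with tunable `k1` and `b` and a configurable retrieval depth, plus a simple recall measure. It is not the Pi-Serini implementation: the function names, the whitespace tokenizer, the parameter values, and the definition of surfaced evidence recall used here (fraction of gold evidence documents that appear among the surfaced documents) are all assumptions made for illustration.

```python
import math
from collections import Counter


def bm25_search(query, docs, k1=0.9, b=0.4, depth=10):
    """Rank docs (id -> text) with BM25 and return the top `depth` ids.

    k1 and b are the usual BM25 term-saturation and length-normalization
    parameters; the defaults here are illustrative, not the paper's tuned values.
    """
    # Hypothetical tokenizer: lowercase and split on whitespace.
    tokenized = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized.values()) / n_docs

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized.values():
        df.update(set(tokens))

    scores = {}
    for doc_id, tokens in tokenized.items():
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[doc_id] = score

    # Sort by descending score, breaking ties by id for determinism.
    ranked = sorted(scores, key=lambda d: (-scores[d], d))
    return ranked[:depth]


def surfaced_evidence_recall(surfaced_ids, evidence_ids):
    """Fraction of gold evidence ids present among the surfaced ids (assumed definition)."""
    evidence = set(evidence_ids)
    if not evidence:
        return 0.0
    return len(evidence & set(surfaced_ids)) / len(evidence)


# Toy corpus and gold evidence set, invented for this example.
docs = {
    "d1": "bm25 lexical retrieval",
    "d2": "dense retrieval with embeddings",
    "d3": "cooking pasta recipes",
}
evidence = ["d1", "d2"]

shallow = bm25_search("lexical bm25", docs, depth=1)
deep = bm25_search("lexical bm25", docs, depth=2)
print(surfaced_evidence_recall(shallow, evidence))
print(surfaced_evidence_recall(deep, evidence))
```

In this toy setup, widening the depth from 1 to 2 surfaces the second evidence document, which mirrors in miniature why retrieval depth can matter for evidence recall independently of the ranking function itself.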