DeepRefine: Agent Knowledge Base Refinement via Reinforcement Learning
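DeepRefine is a general reasoning model built on a large language model. Through multi-turn interaction with a knowledge base, it refines the defects found there (such as missing evidence, low-confidence claims, or ambiguous references), making the knowledge base better suited to open-domain, knowledge-intensive downstream tasks. The model localizes defects through abductive diagnosis and executes targeted actions to update the knowledge base incrementally. To optimize the refinement policy without gold references, the work introduces a "Gain-Beyond-Draft" reward and trains the model end-to-end with reinforcement learning. Extensive experiments show consistent downstream performance gains over multiple strong baselines.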
Agent-compiled knowledge bases provide persistent external knowledge for large language model (LLM) agents in open-ended, knowledge-intensive downstream tasks. Yet their quality is systematically limited by incompleteness, incorrectness, and redundancy, which manifest as missing evidence or cross-document links, low-confidence or imprecise claims, and ambiguous references or unresolved coreference. Such defects compound under iterative use, degrading retrieval fidelity and downstream task performance. We present DeepRefine, a general LLM-based reasoning model for agent-compiled knowledge refinement that uses user queries to improve the quality of any pre-constructed knowledge base, making it more suitable for downstream tasks. DeepRefine performs multi-turn interactions with the knowledge base, conducts abductive diagnosis over the interaction history, localizes likely defects, and executes targeted refinement actions that update the knowledge base incrementally. To optimize DeepRefine's refinement policy without gold references, we introduce a Gain-Beyond-Draft (GBD) reward and train the reasoning process end-to-end via reinforcement learning. Extensive experiments demonstrate consistent downstream gains over strong baselines.
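The refinement loop described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not specify the knowledge base structure, the action space, or the GBD formula. `KnowledgeBase`, `refine`, `add_missing_evidence`, and `gain_beyond_draft` are hypothetical names, the keyword search stands in for a real retriever, and the reward is one plausible reading of the name "Gain-Beyond-Draft" (downstream score with the refined knowledge base minus the score with the unrefined draft), not the paper's definition.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeBase:
    """Toy knowledge base mapping a claim id to its text (hypothetical structure)."""
    claims: dict[str, str] = field(default_factory=dict)

    def search(self, query: str) -> list[str]:
        # Naive keyword match standing in for a real retriever.
        terms = query.lower().split()
        return [cid for cid, text in self.claims.items()
                if any(term in text.lower() for term in terms)]


def refine(kb, query, propose_action, max_turns=3):
    """Multi-turn loop: query the KB, diagnose from the history, apply one update per turn."""
    history = []
    for _ in range(max_turns):
        history.append((query, kb.search(query)))
        # propose_action stands in for the LLM's abductive diagnosis (assumption).
        action = propose_action(history, kb)
        if action is None:  # no defect diagnosed, so stop early
            break
        claim_id, new_text = action
        kb.claims[claim_id] = new_text  # incremental update
    return kb


def add_missing_evidence(history, kb):
    """Toy policy: if the last retrieval came back empty, add a placeholder claim."""
    query, retrieved = history[-1]
    if retrieved:
        return None
    return ("c_new", f"Evidence about {query}")


def gain_beyond_draft(score_refined: float, score_draft: float) -> float:
    """Assumed reward shape: downstream gain of the refined KB over the draft KB."""
    return score_refined - score_draft


kb = KnowledgeBase({"c1": "Paris is the capital of France"})
refine(kb, "Berlin", add_missing_evidence)
```

In this toy run the first turn retrieves nothing for "Berlin", so the policy adds one claim; the second turn retrieves that claim and the loop stops. A reward of this shape needs only a downstream task score measured before and after refinement, which is consistent with training without gold references for the refined knowledge base itself.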