微软发现智能体存在一个令人不安的模式,能执行任务却不会主动优化用户利益,这对埋头做 Agent 的团队是个警钟,能力不等于利他。
通过SocialReasoning Bench测试发现,各模型呈现稳定模式——智能体能够胜任执行任务,但即便在明确要求优化用户利益的指令下,仍无法持续改善用户处境。https://msft.it/6011vPOLF
Using SocialReasoning Bench, we observed a stable pattern across models-agents execute competently, but fail to consistently improve the user's position, even with explicit instructions to optimize for user interest. https://msft.it/6011vPOLF