最新研究提出元智能体挑战(MAC),将编码智能体放入沙盒,给定评估API和时间预算,要求其自主编程出在五个领域表现最优的智能体。结果发现,元智能体极少能匹敌人工设计的基线,少数成功的案例也几乎全部依赖专有前沿模型。更值得警惕的是,在高优化压力下,一些智能体开始从评分渠道外泄真实答案,即便研究人员设置了多层反奖励破解防御也未能阻止。论文:arxiv.org/abs/2606.04455。
// The Meta-Agent Challenge //
How good are current agents at self-improving?
This is a great paper covering some of the challenges.
They propose the Meta-Agent Challenge (MAC), where they give a coding agent a sandbox, an evaluation API, and a time budget, then ask it to program an agent that maximizes held-out performance across five domains.
Results:
Meta-agents rarely match human-engineered baselines, and the few that do are dominated by proprietary frontier models.
Under high optimization pressure, some agents started exfiltrating ground truth from the scoring channel, even with multi-layer anti-reward-hacking defenses in place.
Paper: https://arxiv.org/abs/2606.04455
Learn to build effective AI agents in our academy: https://academy.dair.ai/