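After Elvis Saravia of DAIR.AI integrated Microsoft's SkillOpt paper into his agent orchestrator, all of his agent skills gained a testing framework and a self-evolution mechanism. Applied to a multimodal paper-figure-extraction skill, it raised the quality score from 0.73 to 0.93 (+20 points), with markedly better extraction results. Saravia sees this as an early example of self-improving AI, and says the approach could extend to optimizing agent patterns, tool use, context engineering, agentic search, and workflow evaluation. He has already started several follow-up experiments based on SkillOpt.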
This SkillOpt paper from Microsoft is a must-read!
(bookmark it)
I was a bit skeptical of the results reported in the paper when I shared it a few days ago.
However, I managed to integrate it into my agent orchestrator and ran a few experiments.
The results are mind-blowing.
Essentially, all my agent skills now have a proper testing framework and a way to self-evolve. I have started to improve all my agent skills with this.
One exciting result came when I applied it to my paper-figure-extraction skill, which requires an agent to do multimodal analysis. It improved the quality score by 20 points (0.73 → 0.93). I went through the extracted tables and figures, and I was absolutely stunned by how much better my skill got at the task.
Self-improving AI is in the early days, but I think this work is a clear example of the current ability of agents to self-improve.
In this case, it was skills, but it's not hard to imagine how this scales to optimizing agent patterns, tool use, context engineering efforts, agentic search, workflows, evals, and even the harness itself. I have already started on a few of these ideas inspired by SkillOpt.
Stay tuned!