Anthropic internal data: AI capabilities are accelerating, with the task-length doubling period shrinking to 4 months

Chubby♨️@kimmonismus

2026-06-05 06:06 · 10 days ago

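AI Summary

Anthropic internal data shows that the length of tasks AI models can complete autonomously is growing at an accelerating pace: Opus 3 (March 2024) about 4 minutes, Sonnet 3.7 (March 2025) about 90 minutes, Opus 4.6 (March 2026) 12 hours, with the doubling period shrinking from 7 months to 4 months. Claude Mythos Preview can work continuously for at least 16 hours in METR testing. Engineers' quarterly code output is 8 times the 2021–2025 average, Claude-written code makes up more than 80% of the codebase, and a single AI once fixed 800+ API errors in one pass (equivalent to four years of human labor). The success rate on the hardest open-ended tasks rose from a low point to 76% within 6 months. Anthropic stresses that even if model capabilities were frozen, a 100-person company could do the work of 1,000 people by using agents; actual progress has outpaced its own exponential assumptions, and although recursive self-improvement has not yet been achieved, it may arrive sooner than expected.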

I believe the majority still doesn't understand the momentous threshold humanity is facing.

Anthropic itself states quite clearly that even if all development were frozen, it would still expect massive societal changes:

"Even if model capabilities were frozen at today's level, we would expect major changes to occur in the world. (…) And we are still early in the diffusion of today's models into the wider economy, where a 100-person company can increasingly do the work of a 1,000-person one, because each employee will sit atop a pyramid of agents."

But stagnation is out of the question. Anthropic itself says that development has exceeded its own internal assumptions. Take that statement seriously for a second. Anthropic models progress internally and already assumes exponential development, yet even that trajectory lags behind actual development, which is faster still.

"It's happening faster than we thought, and the implications deserve greater attention."

and

"The rate at which AI models improve is accelerating. The length of tasks that they can reliably complete on their own has been doubling roughly every four months, up from an earlier trend of doubling every seven months. In March 2024, Claude Opus 3 could complete software tasks that take humans about four minutes to complete. A year later, Claude Sonnet 3.7 managed tasks that took about an hour and a half. A year after that, Claude Opus 4.6 managed 12-hour tasks. If this trend holds, tasks that take a skilled person days could come into range this year."
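For anyone who wants to check the arithmetic behind that quote, here is a rough back-of-the-envelope sketch. It assumes the three models are exactly 12 months apart and that growth between two points is a clean exponential; the function names are made up for illustration and nothing here comes from Anthropic.

```python
import math

def doubling_time_months(start_minutes, end_minutes, elapsed_months):
    """Implied doubling time, assuming steady exponential growth between two points."""
    doublings = math.log2(end_minutes / start_minutes)
    return elapsed_months / doublings

def projected_minutes(current_minutes, doubling_months, months_ahead):
    """Task length after months_ahead if the doubling time stays constant (an assumption)."""
    return current_minutes * 2 ** (months_ahead / doubling_months)

# Figures as quoted in the post; the exact 12-month spacing is an assumption.
opus_3 = 4             # minutes, March 2024
sonnet_3_7 = 90        # minutes, a year later
opus_4_6 = 12 * 60     # minutes, a year after that

first_year = doubling_time_months(opus_3, sonnet_3_7, 12)
second_year = doubling_time_months(sonnet_3_7, opus_4_6, 12)

print(f"Opus 3 -> Sonnet 3.7: doubling every {first_year:.1f} months")
print(f"Sonnet 3.7 -> Opus 4.6: doubling every {second_year:.1f} months")

# Hypothetical projection: 8 months ahead at a constant 4-month doubling time.
hours_in_8_months = projected_minutes(opus_4_6, 4, 8) / 60
print(f"Projected task length in 8 months: {hours_in_8_months:.0f} hours")
```

Going from 90 minutes to 12 hours in a year is three doublings, which matches the four-month figure. The seven-month figure cannot be recovered from these three points alone; it presumably refers to a longer earlier trend. At a constant four-month doubling time, 12 hours becomes 48 hours after eight months, which is the "days" range the quote mentions.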

So again: there can be no question of standing still.

The models are not only getting better, they can also work autonomously for longer. Certainly, numerous breakthroughs are still needed, and the context window is still a problem. But the most likely direction is that the models themselves will find the solutions to the underlying problems. This opens up unforeseen possibilities, and Demis Hassabis's statement that the golden age of science is not a dream, not a utopia, but a purposeful reality is now confirmed.

And finally, it's not just Anthropic but also OpenAI that sees this development, considers it feasible, and is moving forward.

Most people don't know what's coming. But one thing is certain: it's coming even faster than expected. And it will be even bigger.

Mythos was just the beginning.

Chubby♨️: Holy moly, Anthropic is getting very serious about recursive self-improvement! One word: acceleration. Insane blog article. Tl;dr: • We are close to an AI capabl...
