Salesforce claims AI agents cut a 231-day migration to 13 days with fewer incidents
Key Points
Salesforce has shifted its entire software development process to AI agents powered by Anthropic's Claude Code, operating without token limits, marking a fundamental change in how the company builds software.
Salesforce reports striking results: developers merge 79 percent more pull requests while incidents fell five percent, and one API migration was completed in 13 days against an original estimate of 231 person-days.
Rather than writing code manually, developers have taken on a new role as orchestrators of specialized agent teams, coordinating multiple AI agents to tackle complex development tasks collaboratively.
Few topics spark as much debate right now as the "agentic shift" in coding. Instead of writing code line by line, developers orchestrate software creation through AI agents.
Salesforce is now putting its own numbers behind that shift. In a post by Srinivas Tallapragada, Salesforce's head of engineering, the company says it has moved its entire development organization to agentic workflows. It rolled out Anthropic's Claude Code across the whole company as the main AI agent and gave every developer unlimited tokens to use it.
For April 2026, Salesforce reports a sharp efficiency jump compared to the same month last year. Completed work items per developer rose 50.8 percent. Merged pull requests per developer climbed 79 percent.
An ML-based "Effective Output Score" designed to measure the actual value of shipped code improved by 151.3 percent. None of these numbers can be independently verified.
More output, fewer incidents
Tallapragada answers the obvious question, whether quality suffers at this pace, by pointing to the company's own monitoring platform, Engineering 360. Despite the surge in pull requests, incidents dropped five percent. Safety guardrails and quality standards are baked into the agentic workflow, he says.
"When agentic tools get applied properly, quality doesn't suffer from speed. It benefits from it," Tallapragada writes. Salesforce doesn't back this claim with external audits or independent measurements.
Engineers are now building their own agentic workflows rather than just using off-the-shelf tools, according to Tallapragada. So-called Claude Code skills, reusable capabilities that encode team context, naming conventions, and workflow patterns, have become a new kind of engineering artifact. Salesforce also built curated libraries called "AI Expert Suite" and "Salesforce Foundation Plugins" that serve as a shared foundation for all developers.
Sub-agents and agent teams, specialized AI agents that handle parallel workstreams within a larger task, are changing how complex work gets broken down. Developers no longer bounce between five systems. They describe the desired outcome, and coordinated agents handle the individual steps.
API migration in 13 days instead of 231
As a concrete example, Tallapragada points to migrating 33 API endpoints to a new cloud-native architecture. The traditional approach would have taken about 231 person-days, the company estimates. Using a rule-based framework built on Claude with Markdown files and reference implementations, the migration was done in 13 days, roughly 18 times faster.
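The "18 times" figure can be checked with quick arithmetic. The sketch below uses only the two numbers Salesforce reports. Like the company's comparison, it treats the 231 person-day estimate and the 13 days as directly comparable, which is an assumption: the post does not say how many people worked during those 13 days. The variable names are illustrative.

```python
# Figures as reported by Salesforce; not independently verified.
estimated_person_days = 231  # company's estimate for the traditional approach
actual_days = 13             # reported duration with the agentic framework

# Ratio of estimate to reported duration (assumes the units are comparable).
speedup = estimated_person_days / actual_days

# Share of the estimated effort that was avoided.
time_saved_pct = (1 - actual_days / estimated_person_days) * 100

print(f"speedup: {speedup:.1f}x")            # speedup: 17.8x
print(f"time saved: {time_saved_pct:.1f}%")  # time saved: 94.4%
```

Rounded, 17.8 becomes the "18 times" quoted above.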
Each round of PR feedback was fed back into the rule set, so accuracy kept improving. Autonomous LLM loops of building, fixing, and validating ran without manual intervention. Migrations were parallelized across isolated environments. The result: five pull requests, with the largest single PR delivering 21 endpoints with full test coverage.
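Salesforce has not published the framework itself, so the following is only a minimal sketch of the general pattern described here: a build, validate, and fix loop whose rule set grows with review feedback. Every name in it (`RuleSet`, `migrate`, `fake_build`, `fake_validate`, the endpoint paths, the rule strings) is hypothetical, and the "agent" is a stub rather than a real Claude call. Parallel execution across isolated environments is left out.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class RuleSet:
    """Accumulated migration rules (hypothetical stand-in for Markdown rule files)."""
    rules: list[str] = field(default_factory=list)

    def add_feedback(self, note: str) -> None:
        # Store each piece of review feedback only once.
        if note not in self.rules:
            self.rules.append(note)


def migrate(endpoint: str, rules: RuleSet,
            build: Callable[[str, RuleSet, list[str]], dict],
            validate: Callable[[dict], list[str]],
            max_rounds: int = 5) -> tuple[dict, int]:
    """Repeat build -> validate, passing errors to the next build, until clean."""
    errors: list[str] = []
    for round_no in range(1, max_rounds + 1):
        candidate = build(endpoint, rules, errors)
        errors = validate(candidate)
        if not errors:
            return candidate, round_no
    raise RuntimeError(f"{endpoint}: still failing after {max_rounds} rounds: {errors}")


# --- Stubs so the sketch runs without any model or network access ---
REQUIRED = {"auth header", "pagination"}  # hypothetical validation checks


def fake_build(endpoint: str, rules: RuleSet, errors: list[str]) -> dict:
    # Stand-in for the agent: "applies" all known rules plus last round's errors.
    return {"endpoint": endpoint, "applied": set(rules.rules) | set(errors)}


def fake_validate(candidate: dict) -> list[str]:
    # Report every required check the candidate does not satisfy.
    return sorted(REQUIRED - candidate["applied"])


rules = RuleSet()
_, first_rounds = migrate("/v1/accounts", rules, fake_build, fake_validate)

# Simulated PR feedback is folded back into the rule set.
for note in ("auth header", "pagination"):
    rules.add_feedback(note)

_, second_rounds = migrate("/v1/orders", rules, fake_build, fake_validate)
print(first_rounds, second_rounds)  # 2 1
```

In this toy run the first endpoint needs two rounds, while the second passes on the first attempt because the feedback is already in the rule set. That is the compounding effect the post describes, shown here under heavily simplified assumptions.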
"The most important skill today is knowing how to structure problems for an agentic system, when to delegate versus stay in the loop, and how to build reusable patterns your team can compound on," Tallapragada writes.
Security, junior talent, and team structure remain unsolved
Tallapragada is upfront about a range of unsolved problems, calling them "genuinely hard." Context management in long agentic sessions is a skill engineers still need to learn. The quality of CLAUDE.md files—persistent context configs that align Claude with a codebase—varies widely between teams and has a big impact on output quality. Security needs a rethink too. When agents act on systems rather than just making suggestions, the blast radius of a misconfigured tool gets much larger.
Then there's the talent pipeline question. "When agents handle more of the execution layer, how do junior engineers grow into senior engineers if AI is absorbing much of the entry-level work? What is the role of a designer or product manager in this new world?" Tallapragada writes. Salesforce is experimenting with one-person or three-person units instead of traditional Scrum teams. It doesn't have clear answers yet.
Productivity leap or tech debt on autopilot?
A sharply different take came a few days ago from well-known programmer and hacker George Hotz. Using AI agents in software development will be one of the industry's most expensive mistakes, he argues.
LLMs are "sophisticated statistical models" that "mimic the distribution of programming" but can never truly program, Hotz says. Large organizations are especially at risk because weaker developers can't spot faulty output.
Even Andrej Karpathy, who now counts himself among agentic coding's supporters, has flagged quality problems. Agent-generated code is "not like super amazing code necessarily all the time," he said, calling it "bloaty, there's a lot of copy paste, there's awkward abstractions that are brittle, and like, it works, but it's just really gross." Unlike Hotz, though, Karpathy is still sold on the new approach and recently joined Anthropic.
A broader debate about the rising costs of AI relative to its benefits is heating up too, alongside questions about what the models actually deliver in day-to-day work.