Cohere发布了开源权重模型Command A+,其在AI分析智能指数上的得分与Claude 4.5 Haiku持平。该模型核心优势为极低的幻觉率,在相关榜单上以86%领先,体现出模型“知其不知”的可靠性。在速度方面,其API输出速度超过GPT-5.4 nano等多款模型,但仍略逊于Gemini 3.1 Flash-Lite。模型在科学推理与代码生成等高难度任务上表现稍弱,但具备视觉推理能力,性能位于Claude 4.5 Haiku与GPT-5.4 nano之间。
Cohere launches open weights model Command A+ that achieves 37 on the Artificial Analysis Intelligence Index
The release of Command A+ places @Cohere in line with Claude 4.5 Haiku on the Intelligence Index, and just above NVIDIA Nemotron 3 Super and Gemini 3.1 Flash-Lite.
Key Takeaways:
➤ Command A+ ranks first on AA-Omniscience Non-Hallucination at 86%, ~3 percentage points ahead of the next-best model. Its AA-Omniscience Accuracy is 9%, so the headline AA-Omniscience score lands at -4, demonstrating a similar archetype to Claude 4.5 Haiku, where the model knows its limits
➤ On Cohere's API, Command A+ (~281 output tokens per second) is faster than several comparable open-weights and small to mid-sized proprietary models (e.g., GPT-5.4 nano, Claude 4.5 Haiku, and Grok 4.3), but still slower than Gemini 3.1 Flash-Lite Preview, which outputs 304 tokens per second
➤ Command A+ trails its peer set on scientific reasoning (HLE ~11%, GPQA Diamond ~76%) and on coding (Terminal-Bench Hard ~25%, SciCode ~38%), consistent with gaps on the hardest science and agentic coding benchmarks
➤ It supports visual reasoning and scores 63% on MMMU-Pro (between Claude 4.5 Haiku at 59% and GPT-5.4 nano (xhigh) at 65%)