Ant Group open-sources Ling 2.6 1T, a model that balances cost efficiency and intelligence

Artificial Analysis (@ArtificialAnlys)

2026-05-01 02:47

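AI summary

Ant Group's InclusionAI lab has released Ling 2.6 1T, an open-weights, non-reasoning model. The model has 1 trillion parameters and scores 34 on the Artificial Analysis Intelligence Index, 15 points higher than its predecessor Ling-1T, which puts its intelligence close to comparable models such as DeepSeek V3.2. It performs solidly on scientific reasoning and knowledge tasks, scoring 75% on GPQA. The model is token-efficient, needing only about 16M output tokens to run the index, and it is cost-effective: running the full index through the first-party API costs about $95. Its factual reliability is weaker, however: it scores -51 on the AA-Omniscience benchmark, mainly because of a 92% hallucination rate. The model weights are publicly available on Hugging Face.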

Ant Group has just released Ling 2.6 1T, an open weights, non-reasoning model with high cost efficiency and a reasonable intelligence tradeoff. Ling 2.6 1T scores 34 on the Artificial Analysis Intelligence Index, a 15-point jump from Ling-1T.

Ling 2.6 1T is the latest model from Ant Group's @TheInclusionAI lab. Ant Group recently released Ling 2.6 Flash, a 104B total parameter non-reasoning model. Ling 2.6 1T's weights have been publicly released on Hugging Face.

Key takeaways:

➤ Comparable intelligence to similarly sized non-reasoning models: At 1T total parameters, Ling 2.6 1T sits near DeepSeek V3.2 (non-reasoning, 32) and Kimi K2.5 (non-reasoning, 37) in intelligence. This is a marked improvement from Ling-1T, which scores 19 on the Intelligence Index. However, there remains a ~10-point gap to frontier non-reasoning open weights models such as GLM-5.1 (non-reasoning, 44) and Kimi K2.6 (non-reasoning, 43).

➤ Strong performance in scientific reasoning and knowledge: Ling 2.6 1T scores 75% on GPQA and 8% on Humanity's Last Exam (HLE), indicating solid performance on graduate-level reasoning and knowledge recall tasks. This is comparable to DeepSeek V3.2 (non-reasoning), which achieves 75% on GPQA and 11% on HLE.

➤ Efficient token usage: Ling 2.6 1T uses ~16M output tokens to run the Artificial Analysis Intelligence Index, making it slightly more efficient than MiMo V2 Flash (non-reasoning, ~17M) and significantly more efficient than Kimi K2.6 (non-reasoning, ~27M) and GLM-5.1 (non-reasoning, ~75M).

➤ Strong cost-to-intelligence positioning: At $0.30 per million input tokens and $2.50 per million output tokens on InclusionAI's first-party API, Ling 2.6 1T costs only ~$95 to run the full Artificial Analysis Intelligence Index. This positions it competitively for large-scale workloads relative to models in a similar intelligence tier.

➤ Relatively weak factual reliability: Ling 2.6 1T scores -51 on AA-Omniscience, our benchmark for factual accuracy and hallucination. This is primarily driven by a high hallucination rate (92%), which is similar to GPT-5.5 (non-reasoning, 91%). However, its 21% accuracy is broadly in line with comparable non-reasoning models.
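
As a rough sanity check on how the AA-Omniscience figures fit together, here is a minimal sketch. It assumes (this is not stated in the post) that the hallucination rate is the share of non-correct responses that are wrong answers rather than abstentions, and that the index is the percentage of correct answers minus the percentage of incorrect ones. The function name `omniscience_index` is hypothetical.

```python
def omniscience_index(accuracy: float, hallucination_rate: float) -> float:
    """Approximate index from accuracy and hallucination rate (fractions in [0, 1]).

    Assumption: hallucination_rate = incorrect / (all responses that are not correct),
    and index = 100 * (correct share - incorrect share).
    """
    not_correct = 1.0 - accuracy                # everything that was not answered correctly
    incorrect = hallucination_rate * not_correct  # share of all questions answered wrongly
    return 100.0 * (accuracy - incorrect)


# Rounded figures quoted in the post: 21% accuracy, 92% hallucination rate.
ling_index = omniscience_index(accuracy=0.21, hallucination_rate=0.92)
print(f"{ling_index:.1f}")
```

Under these assumptions the rounded inputs give a value within about a point of the reported -51, which suggests the score is dominated by wrong answers given in place of abstentions rather than by low accuracy.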

Additional model details:

➤ Size: 1T total parameters

➤ Pricing: $0.30 / $2.50 per 1M input/output tokens (via Novita API)

➤ License: Weights not yet released

➤ Availability: First-party API through InclusionAI
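
For readers who want to see how the quoted prices relate to the ~$95 total, the sketch below splits the bill using only figures from the post (~16M output tokens, $0.30 / $2.50 per 1M tokens). The implied input volume is an inference, not a reported number: it assumes the entire remainder is uncached input billed at list price.

```python
# Figures quoted in the post.
INPUT_PRICE_PER_M = 0.30    # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 2.50   # USD per 1M output tokens
OUTPUT_TOKENS_M = 16.0      # ~16M output tokens to run the index
TOTAL_COST = 95.0           # ~USD 95 to run the full index

# Output side of the bill: 16M tokens at $2.50 per 1M.
output_cost = OUTPUT_TOKENS_M * OUTPUT_PRICE_PER_M

# Assumption: whatever is left is input cost at the list price.
remaining_cost = TOTAL_COST - output_cost
implied_input_tokens_m = remaining_cost / INPUT_PRICE_PER_M

print(f"output cost: ${output_cost:.2f}")
print(f"remaining (assumed input) cost: ${remaining_cost:.2f}")
print(f"implied input tokens: ~{implied_input_tokens_m:.0f}M")
```

On these rounded figures, output tokens account for roughly $40 of the total, so more than half of the cost would come from input tokens if the assumption holds.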
