One week since the launch of GPT-5.5, and it’s already our strongest model launch yet. API revenue is growing more than 2x faster than any prior release, while Codex doubled revenue in under seven days as enterprise demand for agentic coding tools keeps climbing.
Grok 4.3
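Summary: This release shows increased cost efficiency to run the Artificial Analysis Intelligence Index, with Grok 4.3 holding a solid position on the Pareto frontier of intelligence versus cost. With input token prices cut by 37.5% and output token prices cut by 58.3%, running the Intelligence Index evaluation costs $395, about 20% less overall than Grok 4.20 0309 v2.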
xAI's Grok 4.3 scores 53 on the Artificial Analysis Intelligence Index with ~40% lower input and ~60% lower output pricing vs Grok 4.20, making it one of the most cost-efficient models at its intelligence tier. Biggest gain: a 321-point Elo jump on real-world agentic tasks (GDPval-AA), though it still trails GPT-5.5 by a wide margin.
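Summary: xAI's Grok 4.3 scores 53 on the Artificial Analysis Intelligence Index, with input cost about 40% lower and output cost about 60% lower than Grok 4.20, making it notably cost-efficient. Its biggest highlight is a 321-point Elo jump to 1500 on real-world agentic tasks (GDPval-AA), surpassing models such as Gemini 3.1 Pro Preview and Muse Spark but still trailing GPT-5.5 by a wide margin. The model is strong on instruction following and customer support tasks, while on the Omniscience benchmark its accuracy rose but so did its hallucination rate. Overall, Grok 4.3 reaches a higher Intelligence Index score at lower cost, making it one of the more cost-efficient models in its intelligence tier.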
Grok 4.3 is now available on the API 👀
Grok
The new Grok comes in below the latest Chinese open weights models; Grok 4 was at the frontier when released. (& Artificial Analysis: please stop using GDPval-AA, which is not a useful test of anything except a model’s ability to impress Gemini as a judge)
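Summary: xAI released Grok 4.3, which scores 53 on the Artificial Analysis Intelligence Index, ahead of Grok 4.20, Muse Spark, and other models. The core improvement is cost efficiency: input and output prices are about 40% and 60% lower than the previous generation, and the cost of running the benchmark suite has dropped. It improves markedly on real-world agentic tasks such as GDPval-AA and is strong on instruction following and customer support. The post notes, however, that it still trails the latest Chinese open-weights models, and criticizes GDPval-AA as a test of limited value.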
OpenRouter has added another anonymous new model, Owl Alpha! 1M context and strong tool-calling capabilities! Guess whose model it is, haha 😂
The new Grok-4.3 from @xai is live on OpenRouter! Grok-4.3 releases at a lower price than Grok-4.2, while seeing a large jump in agentic performance: a 321 point increase to 1500 ELO on @ArtificialAnlys GDPval-AA, surpassing other top models despite the lower price.
xAI has launched Grok 4.3, achieving 53 on the Artificial Analysis Intelligence Index with improved agentic performance, ~40% lower input price, and ~60% lower output price than Grok 4.20. The release of Grok 4.3 places @xAI just above Muse Spark and Claude Sonnet 4.6 on the Intelligence Index, and 4 points ahead of the latest version of Grok 4.20. Grok 4.3 improves its Artificial Analysis Intelligence Index score while reducing cost to run the benchmark suite.
Key Takeaways:
➤ Grok 4.3 improves on cost-per-intelligence relative to Grok 4.20 0309 v2: it scores higher on the Intelligence Index while costing less to run the full benchmark suite. Grok 4.3 costs $395 to run the Artificial Analysis Intelligence Index, around 20% lower than Grok 4.20 0309 v2, despite using more output tokens. This makes it one of the lower-cost models at its intelligence level.
➤ Large increase in real-world agentic task performance: the largest single benchmark improvement is on GDPval-AA, where Grok 4.3 scores an Elo of 1500, up 321 points from Grok 4.20 0309 v2’s score of 1179, surpassing Gemini 3.1 Pro Preview, Muse Spark, GPT-5.4 mini (xhigh), and Kimi K2.5. Grok 4.3 narrows the gap to the leading model on GDPval-AA, but still trails GPT-5.5 (xhigh) by 276 Elo points, with an expected win rate of ~17% against GPT-5.5 (xhigh) under the standard Elo formula.
➤ Grok 4.3 performs strongly on instruction following and agentic customer support tasks. It gains 5 points on 𝜏²-Bench Telecom to reach 98%, in line with GLM-5.1. Grok 4.3 maintains an 81% IFBench score from Grok 4.20 0309 v2.
➤ Gains 8 points on AA-Omniscience Accuracy, but at the cost of an 8-point drop in AA-Omniscience Non-Hallucination Rate, so Grok 4.20 0309 v2 still leads AA-Omniscience Non-Hallucination Rate, followed by MiMo-V2.5-Pro, in line with Grok 4.3.
Congratulations to @xAI and @elonmusk on the impressive release!
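For readers who want to check the ~17% figure, here is a minimal sketch of the standard Elo expected-score formula applied to the 276-point gap quoted above. The function name and the derived GPT-5.5 (xhigh) rating are illustrative assumptions; the post only states Grok 4.3's rating (1500) and the gap.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score (win probability) of A against B under the standard Elo formula."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


grok_4_3 = 1500                    # GDPval-AA Elo stated in the post
gpt_5_5_xhigh = grok_4_3 + 276     # assumption: derived from the stated 276-point gap

win_rate = elo_expected_score(grok_4_3, gpt_5_5_xhigh)
print(f"{win_rate:.1%}")  # 17.0%
```

Equal ratings give an expected score of 0.5, and every 400-point deficit divides the odds by ten, which is why a 276-point gap lands at roughly one win in six.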
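Summary: xAI launched Grok 4.3, which scores 53 on the Artificial Analysis Intelligence Index, ahead of Muse Spark and other models and 4 points above its predecessor. The model cuts cost substantially while holding its intelligence level, with input and output prices about 40% and 60% lower. It stands out on real-world agentic tasks: its GDPval-AA score rises sharply to 1500 Elo, surpassing Gemini 3.1 Pro Preview and several other models, though it still trails GPT-5.5 (xhigh). It is strong on instruction following and customer support tasks, but its AA-Omniscience Non-Hallucination Rate drops slightly.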
Ant Group has just released Ling 2.6 1T, an open weights, non-reasoning model with high cost efficiency and a reasonable intelligence tradeoff. Ling 2.6 1T scores 34 on the Artificial Analysis Intelligence Index, a 15-point jump from Ling-1T. Ling 2.6 1T is the latest model from Ant Group’s @TheInclusionAI lab. Ant Group recently released Ling 2.6 Flash, a 104B total parameter non-reasoning model. Ling 2.6 1T’s weights have been publicly released on Hugging Face.
Key takeaways:
➤ Comparable intelligence to similarly sized non-reasoning models: At 1T total parameters, Ling 2.6 1T sits near DeepSeek V3.2 (non-reasoning, 32) and Kimi K2.5 (non-reasoning, 37) in intelligence. This is a marked improvement from Ling-1T, which scores 19 on the Intelligence Index. However, there remains a ~10-point gap to frontier non-reasoning open weights models such as GLM-5.1 (non-reasoning, 44) and Kimi K2.6 (non-reasoning, 43).
➤ Strong performance in scientific reasoning and knowledge: Ling 2.6 1T scores 75% on GPQA and 8% on Humanity’s Last Exam (HLE), indicating solid performance on graduate-level reasoning and knowledge recall tasks. This is comparable to DeepSeek V3.2 (non-reasoning), which achieves 75% on GPQA and 11% on HLE.
➤ Efficient token usage: Ling 2.6 1T uses ~16M output tokens to run the Artificial Analysis Intelligence Index, making it more efficient than MiMo V2 Flash (non-reasoning, ~17M), and significantly more efficient than GLM-5.1 (non-reasoning, ~75M) and Kimi K2.6 (non-reasoning, ~27M).
➤ Strong cost-to-intelligence positioning: At $0.30 per million input tokens and $2.50 per million output tokens on InclusionAI’s first-party API, Ling 2.6 1T costs only ~$95 to run the full Artificial Analysis Intelligence Index. This positions it competitively for large-scale workloads relative to models in a similar intelligence tier.
➤ Relatively weak factual reliability: Ling 2.6 1T scores -51 on AA-Omniscience, our benchmark for factual accuracy and hallucination. This is primarily driven by a high hallucination rate (92%), which is similar to GPT-5.5 (non-reasoning, 91%). However, its 21% accuracy is broadly in line with comparable non-reasoning models.
Additional model details:
➤ Size: 1T total parameters
➤ Pricing: $0.30 / $2.50 per 1M input/output tokens (via Novita API)
➤ Weights: publicly released on Hugging Face
➤ Availability: First-party API through InclusionAI
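The pricing and token figures above can be combined into a rough back-of-envelope check. This is only a sketch: the helper name is hypothetical, and the post gives the output-token count and the ~$95 total but not the input-token count, so the remainder is attributed to input and other tokens as an assumption.

```python
def token_cost_usd(tokens: int, price_per_million_usd: float) -> float:
    """Cost in USD for a number of tokens at a given price per 1M tokens."""
    return tokens / 1_000_000 * price_per_million_usd


OUTPUT_TOKENS = 16_000_000       # ~16M output tokens, as stated in the post
OUTPUT_PRICE = 2.50              # USD per 1M output tokens, as stated in the post
TOTAL_RUN_COST = 95.0            # ~$95 for the full index, as stated in the post

output_cost = token_cost_usd(OUTPUT_TOKENS, OUTPUT_PRICE)
remainder = TOTAL_RUN_COST - output_cost  # assumption: mostly input-token spend

print(output_cost, remainder)  # 40.0 55.0
```

Under these figures, output tokens account for roughly $40 of the ~$95 run, which is where the low output-token count feeds into the cost positioning.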
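Summary: Ant Group's InclusionAI lab released Ling 2.6 1T, an open-weights non-reasoning model. It has 1 trillion parameters and scores 34 on the Artificial Analysis Intelligence Index, 15 points above its predecessor Ling-1T and close to peers such as DeepSeek V3.2. It is solid on scientific reasoning and knowledge tasks, with 75% on GPQA. The model is efficient to run, needing only about 16M output tokens for the index, and is cost-effective: running the full index through the official API costs about $95. Its factual reliability is weaker, with a score of -51 on AA-Omniscience, mainly because of a 92% hallucination rate. The model weights are public on Hugging Face.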
Ecosystem-first approach continued! Ling-2.6-1T officially landed on @huggingface and the official inference is now live via @novita_labs. Experience the efficiency of Ling-2.6-1T for yourself, front and center on HF model card page! 🔥
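Summary: The AntLingAGI team announced that Ling-2.6-1T is officially open-sourced, available on Hugging Face with official inference via Novita Labs. The model uses a mixture-of-experts architecture with 1T total parameters and 63B active parameters, and is optimized for "token efficiency" to meet real production needs. Specifically: low token overhead, keeping strong intelligence without long reasoning traces; reliable multi-step execution, with better control over instructions, tools, context, and workflows; and production-ready deployment, covering tasks from code generation to bug fixing with broad agent framework compatibility. The team aims to create value for developers by making it easier to test, deploy, customize, and build.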
Last week, we made Gemini Embedding 2, our first natively multimodal embedding model, available to the general public. Since then, developers have used it to build video analysis tools, visual shopping assistants, and more. But you might be wondering... what is an embedding model? 🤔 Let’s break it down! 1. What is it? Think of an embedding model as a "universal translator." It takes text, images, video, and audio data and turns them into a long string of numbers, like a unique digital fingerprint. 2. How does it work? Historically, search has been text only. Now, instead of just matching data by keyword, Gemini Embedding 2 maps multiple modalities in the same space based on meaning. It "feels" the connection between a video of a soccer goal and the words "game-winning shot" without needing tags. For example, "ocean" and "waves" are placed close together, but "ocean" and "toaster" are miles apart. 3. How can you use it? Developers have been using it to incorporate smarter search functionality into their builds. This means creating tools where you can snap a photo of a product and type "find this in yellow," or search through thousands of hours of video by describing what happens in a scene. 4. Ready to try it out for yourself? You can start using it today via the Gemini API or the Gemini Enterprise Agent Platform.
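To make the "close together" versus "miles apart" idea concrete, here is a minimal sketch of cosine similarity, a common way to compare embedding vectors. The three-dimensional vectors below are hand-picked, hypothetical values for illustration only; they are not output from Gemini Embedding 2, whose real embeddings have many more dimensions and come from the API.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: near 1.0 means similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings (assumption: invented for this example).
embeddings = {
    "ocean": [0.9, 0.8, 0.1],
    "waves": [0.8, 0.9, 0.2],
    "toaster": [0.1, 0.0, 0.9],
}

sim_waves = cosine_similarity(embeddings["ocean"], embeddings["waves"])
sim_toaster = cosine_similarity(embeddings["ocean"], embeddings["toaster"])

print(f"ocean vs waves:   {sim_waves:.2f}")
print(f"ocean vs toaster: {sim_toaster:.2f}")
```

A semantic search then amounts to embedding the query and returning the stored items whose vectors score highest against it, regardless of whether they started as text, images, video, or audio.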
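Summary: Last week Google made Gemini Embedding 2, its first natively multimodal embedding model, generally available. The model acts like a "universal translator" that turns text, image, video, and audio data into unique numeric vectors. Its key advance is that it no longer relies on keyword matching; it maps data from different modalities into the same space based on meaning, capturing deeper connections between pieces of content. Developers have used it to build video analysis tools, visual shopping assistants, and other apps that support smart search from a photo or a scene description. The model is available via the Gemini API or the Gemini Enterprise Agent Platform.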
APPLE 🍎: “AFM Plus 150B Instruct” Apple Foundation Model has been spotted in the internal AFM Playground app. This app is being used internally by Apple employees to test Apple Foundation models. WWDC26 will be hot 🔥
ANTHROPIC 🚨: Anthropic started testing a new "claude-jupiter-v1-p" model with red teams. Who is next? 👀
𝗦𝗲𝗻𝘀𝗲𝗡𝗼𝘃𝗮 𝗨𝟭 𝗟𝗶𝘁𝗲 𝗦𝗲𝗿𝗶𝗲𝘀: 𝗦𝗺𝗮𝗹𝗹 𝗦𝗰𝗮𝗹𝗲, 𝗕𝗶𝗴 𝗖𝗮𝗽𝗮𝗯𝗶𝗹𝗶𝘁𝘆 A new generation of natively unified multimodal models, delivering commercial-grade performance at a compact 8B / A3B scale: • 𝗖𝗼𝗺𝗽𝗹𝗲𝘅 𝗶𝗻𝗳𝗼𝗴𝗿𝗮𝗽𝗵𝗶𝗰 𝗴𝗲𝗻𝗲𝗿𝗮𝘁𝗶𝗼𝗻 with strong semantic integrity and pixel level precision • 𝗛𝗶𝗴𝗵 𝗹𝗮𝘆𝗼𝘂𝘁 𝗰𝗼𝗻𝘀𝗶𝘀𝘁𝗲𝗻𝗰𝘆 with 𝗮𝗰𝗰𝘂𝗿𝗮𝘁𝗲 𝗮𝗻𝗱 𝗿𝗲𝗹𝗶𝗮𝗯𝗹𝗲 𝘁𝗲𝘅𝘁 𝗿𝗲𝗻𝗱𝗲𝗿𝗶𝗻𝗴 • 𝗜𝗻𝗱𝘂𝘀𝘁𝗿𝘆-𝗳𝗶𝗿𝘀𝘁 𝗰𝗼𝗻𝘁𝗶𝗻𝘂𝗼𝘂𝘀 𝗶𝗺𝗮𝗴𝗲–𝘁𝗲𝘅𝘁 𝗴𝗲𝗻𝗲𝗿𝗮𝘁𝗶𝗼𝗻, enabling unified reasoning and consistent visual style Now fully open-sourced: 𝗚𝗶𝘁𝗛𝘂𝗯: https://github.com/OpenSenseNova/SenseNova-U1 𝗛𝘂𝗴𝗴𝗶𝗻𝗴 𝗙𝗮𝗰𝗲: https://huggingface.co/collections/sensenova/sensenova-u1 𝗦𝗲𝗻𝘀𝗲𝗡𝗼𝘃𝗮 𝗨1 𝗦𝗸𝗶𝗹𝗹𝘀: https://github.com/OpenSenseNova/SenseNova-Skills 𝗗𝗶𝘀𝗰𝗼𝗿𝗱: https://discord.gg/cxkwXWjp @huggingface @github
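Summary: The SenseNova U1 Lite Series is a new generation of natively unified multimodal models that delivers commercial-grade performance at a compact 8B / A3B scale. Core capabilities include complex infographic generation with strong semantic integrity and pixel-level precision; high layout consistency with accurate, reliable text rendering; and industry-first continuous image-text generation that supports unified reasoning and a consistent visual style. The models are fully open-sourced, with code and resources available on GitHub, Hugging Face, and other platforms.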
New stealth model: Owl Alpha! Owl is a high-performance foundation model designed for agentic workloads. Powerful tool use capabilities and a 1M context window, ready for use in all your favorite productivity apps. Try it now and share feedback to improve the model!
Qwen3.6-Plus is now available on @togethercompute. Ship it.
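Sam Altman just announced that OpenAI will roll out GPT-5.5-Cyber, a frontier model built specifically for cybersecurity, to "critical cyber defenders" in the next few days. He said OpenAI will work with the entire industry ecosystem and the government to set up trusted access, with the goal of helping secure companies and infrastructure as quickly as possible.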
OpenAI built the GPT-5.5-Cyber model because Anthropic built Mythos. white-hat vs. black-hat energy.
we're starting rollout of GPT-5.5-Cyber, a frontier cybersecurity model, to critical cyber defenders in the next few days. we will work with the entire ecosystem and the government to figure out trusted access for cyber; we want to rapidly help secure companies/infrastructure.
ERNIE 5.1 Preview just went live 🚀 With a lighter, more efficient architecture, it delivers strong performance at its scale. And this is just the start — more ERNIE model updates to come at Baidu Create 2026.
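Summary: Baidu's ERNIE 5.1 Preview is now live. The model uses a lighter, more efficient architecture: total parameters are compressed to about 1/3 of the previous generation and active parameters to about 1/2, while pre-training consumes only about 6% of the cost of comparable models, giving it leading base performance at its scale. On @arena's Text Arena leaderboard, ERNIE 5.1 Preview ranks #13 globally and #1 among Chinese labs. It places in the global top ten in several categories and ranks first in Legal & Government. Baidu says more ERNIE model updates will come at Baidu Create 2026.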
GPT-5.5 party on 5/5:
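[Quoting @sama]: GPT-5.5 is going to have a party for itself. it chose 5/5 at 5:55 pm for the date and time. if you'd like to come, let us know here: https://luma.com/5.5 Codex will help the team pick attendees from the replies. 5.5 had some good ideas/requests for the party, and we will carry them out.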
It was very much a pleasant surprise to see all the cool demos made by combining Ling-2.6-1T with capable and well-received harnesses like @opencode. Thanks to @novita_labs for another great launch together~ 👏
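Summary: Ling-2.6-1T is officially open-sourced by @AntLingAGI. The model has 1T total parameters and 63B active parameters, is built for real production use, and is token-efficient, making it easy for developers to test, deploy, and customize. Scaling up from Ling-2.6-flash to 1T marks a step from fast inference to stronger reasoning. The main post highlights the cool demos built by combining the model with tools such as @opencode, reflecting its compatibility with existing tooling and its practicality, and thanks @novita_labs for the joint launch.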
Thanks Adina~ Token efficiency is the key characteristic leading to the next stage. We need to burn tokens wisely and efficiently in order to make the whole industry sustainable. 🤗🤗
What's the secret sauce behind the flagship instruct model built for fast execution & high efficiency at scale? Reliable infra with the proper optimizations, from the #SGLang friends at @lmsysorg. We thought yesterday's 100B had already maxed things out; today's 1T shows there is room to push even harder~ 🥳 Onto the next optimization~ 🫡
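Summary: The SGLang team (part of LMSYS Org) says the key to the flagship instruct model's fast, efficient execution at scale is reliable infrastructure with targeted optimizations. The team announced day-0 support for Ling-2.6-1T, the trillion-parameter model from AntLingAGI. The model uses a fast-thinking approach that keeps quality while costing roughly 4x less than comparable models, and reaches SOTA on the AIME26 and SWE-bench benchmarks. It is designed for advanced coding, complex reasoning, and large-scale agentic workflows, combining trillion-parameter capability with instant-model latency. The team is continuing to optimize for further performance gains.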
Last week, we introduced Ling-2.6-1T. Today, Ling-2.6-1T is officially an open model~ 🤗 1T total parameters · 63B active parameters We bring value to developers by making it easier to test, deploy, customize, and build. It is optimized for "token efficiency" to meet real production needs: • Lower token overhead: strong intelligence without long reasoning traces • Reliable multi-step execution: better instruction, tool, context, and workflow control • Production-ready deployment: from code generation to bug fixing, with broad agent framework compatibility A sneak peek into the agentic capability in @opencode
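Summary: AntLingAGI officially open-sourced its trillion-parameter flagship model, Ling-2.6-1T. The model has 1T total parameters and 63B active parameters, and its core design principle is "token efficiency": reaching top-tier intelligence with very low token overhead. Optimized through a "fast thinking" mechanism, it offers reliable multi-step execution and performs well on instruction following, tool use, and context control. It is tuned for real production needs, easy to deploy, compatible with a wide range of agent frameworks, and suited to tasks from code generation to bug fixing.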
MISTRAL 🚨: Mistral AI released Mistral Medium 3.5, a 128B dense open weights model with a 256k context window and configurable reasoning effort. Mistral Medium 3.5 is now available on Mistral Vibe and Le Chat.
IBM has released three new non-reasoning Granite 4.1 models (30B, 8B, 3B) as open weights under Apache 2.0. All three are notably token-efficient relative to peer non-reasoning models, with the 8B standing out for its token efficiency relative to intelligence.
@IBM has released three new instruct models in the Granite 4.1 family: Granite 4.1 30B (15 on the Intelligence Index), Granite 4.1 8B (12), and Granite 4.1 3B (9). The release continues IBM's focus on small, efficient, and open models for enterprise and edge deployment, alongside the existing Granite 4.0 Nano family (1B and 350M variants released in October 2025). The Intelligence Index is the Artificial Analysis synthesis metric incorporating 10 evaluations covering agentic tasks, coding, and scientific reasoning.
Key benchmarking results:
➤ All three Granite 4.1 models score 61 on the Artificial Analysis Openness Index, standing out among peer open weights non-reasoning models. This is driven by full open weights under Apache 2.0 plus partial disclosures across pre-training data, post-training data, and training methodology. Granite 4.1 sits well above peers like Qwen3.5 (39), Gemma 4 (39) and GLM-4.7-Flash (44), and represents a meaningful improvement over the Granite 4.0 family (56), driven by stronger methodology disclosure. Olmo 3.1 and K2 Think V2 (both 89) remain leaders as the most ‘open’ models.
➤ Granite 4.1 8B uses just 4M output tokens to run the Intelligence Index. This is ~20x fewer than Qwen3.5 9B (78M tokens), ~3x fewer than Ministral 3 8B (13M), and ~2x fewer than Gemma 4 E4B (8M). The pattern holds across the family: Granite 4.1 30B uses 4.6M output tokens (vs 7M for Gemma 4 31B and 25M for Qwen3.5 27B), and Granite 4.1 3B uses 2.7M.
➤ Token efficiency comes at the cost of intelligence relative to peer non-reasoning models. Granite 4.1 30B (15) trails leading peers like Qwen3.5 27B (37) and Gemma 4 31B (32). Granite 4.1 8B (12) trails Ministral 3 8B (15) and Gemma 4 E4B (15). Granite 4.1 3B (9) trails Gemma 4 E2B (12).
➤ Granite 4.1 30B and 3B both gain on the Intelligence Index over their Granite 4.0 predecessors. Granite 4.1 30B (15) gains 4 points over Granite 4.0 H Small (32B / 9B active, 11), with the largest gains in tool use (τ²-Bench: 42% vs 17%) and agentic tasks (GDPval-AA: 493 vs 344 Elo). Granite 4.1 3B (9) gains 1 point over Granite 4.0 Micro (8).
Other information:
➤ License: Apache 2.0 (open weights, permissive commercial use)
➤ Context window: 128K tokens
➤ Availability: Granite 4.1 8B is available via @WandB ($0.05/$0.1 per 1M input/output tokens) and @replicate. Weights for all three models are available via @huggingface.
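Summary: IBM released three open Granite 4.1 models (30B, 8B, 3B) under the Apache 2.0 license. Their defining trait is very high token efficiency: the 8B model, for example, needs only 4M output tokens to run the Intelligence Index, far fewer than comparable models. All three score 61 on the Openness Index, ahead of most peers. The efficiency comes with a tradeoff in Intelligence Index scores, which are lower than competitors such as Qwen3.5 and Gemma 4, although the new models still improve on the previous Granite 4.0 family. The models have a 128K-token context window, target enterprise and edge deployment, and are available via WandB, Replicate, and Hugging Face.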
As part of the open model release, that lightning-fast elephant-alpha you loved on @OpenRouter is here to stay. Meet Ling-2.6-flash, powered by @novita_labs for robust and cost-effective performance. Plus, enjoy a 20% discount on us starting right now! 👇 https://openrouter.ai/inclusionai/ling-2.6-flash
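Summary: The fast model "elephant-alpha", previously popular on OpenRouter, is here to stay and is now officially open-sourced as Ling-2.6-flash. Powered by novita_labs, it aims for robust, cost-effective performance. Built for real-world agent workflows, it has 104B total parameters and 7.4B active parameters, and ships in multiple precision variants for different deployment needs. Its core strengths include generation speeds of up to 215 tokens per second, token efficiency that needs only 15M tokens to complete a full intelligence evaluation, and strong execution on coding, document processing, and lightweight agentic tasks. The model also offers a better experience with Chinese-language switching and compatibility with mainstream coding frameworks. To mark the release, a 20% discount is available now.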
DeepSeek's multimodal model has been fully rolled out. You can currently try it in the image-recognition mode of the web version; it appears to be a separate, standalone multimodal model.
We're open-sourcing Hy-MT1.5-1.8B-1.25bit, a 440MB translation model that runs fully offline on your phone, supports 33 languages, and outperforms Google Translate. At 1.8B parameters, it matches commercial translation APIs and 235B-scale models on standard benchmarks. By quantizing to 1.25-bit, memory drops from 3.3GB (FP16) to 440MB, which is 25% smaller and ~10% faster than prior 1.67-bit approaches, with no accuracy loss. Covers 33 languages, 5 dialects, and 1,056 translation directions including minority languages like Tibetan and Mongolian. Our translation model has won 30 first-place rankings in international MT competitions and is already deployed across multiple Tencent products.🏆 📲Demo APK (Android): https://huggingface.co/AngelSlim/Hy-MT1.5-1.8B-1.25bit-GGUF/resolve/main/Hy-MT-demo.apk 🤗Hugging Face: https://huggingface.co/AngelSlim/Hy-MT1.5-1.8B-1.25bit 🔗GitHub: https://github.com/tencent/AngelSlim 📄Paper: https://arxiv.org/abs/2601.07892
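Several of the figures in this announcement follow from simple arithmetic, sketched below. The helper name is hypothetical, and the calculation deliberately counts weights only. The naive 1.25-bit estimate comes out below the quoted 440MB; one plausible explanation, which is an assumption and not stated in the post, is that some tensors are kept at higher precision and the file carries extra metadata.

```python
PARAMS = 1.8e9  # 1.8B parameters, as stated in the post


def weight_bytes(num_params: float, bits_per_weight: float) -> float:
    """Storage for the weights alone, ignoring metadata and any mixed-precision tensors."""
    return num_params * bits_per_weight / 8


# FP16 footprint in GiB: close to the quoted 3.3GB.
fp16_gib = weight_bytes(PARAMS, 16) / 2**30

# Naive weights-only size at 1.25 bits per weight, in MB (decimal).
naive_mb = weight_bytes(PARAMS, 1.25) / 1e6

# Relative size reduction of 1.25-bit versus 1.67-bit weights.
size_reduction = 1 - 1.25 / 1.67

# Ordered translation pairs among 33 languages (source != target).
directions = 33 * 32

print(f"{fp16_gib:.2f} GiB, {naive_mb:.0f} MB, {size_reduction:.1%}, {directions}")
```

The last two values line up with the post's "25% smaller" and "1,056 translation directions" claims.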
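Summary: Tencent open-sourced the Hy-MT1.5-1.8B-1.25bit translation model. It has 1.8B parameters, is only 440MB after quantization, and runs fully offline on a phone. The model supports 33 languages, 5 dialects, and 1,056 translation directions, including minority languages such as Tibetan and Mongolian. On standard benchmarks it matches commercial translation APIs and 235B-parameter models. Quantizing to 1.25-bit cuts memory use sharply from 3.3GB in FP16, making it 25% smaller and about 10% faster than earlier 1.67-bit approaches with no accuracy loss. The model has won 30 first-place rankings in international machine translation competitions and is deployed in multiple Tencent products.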
SenseTime open-sourced SenseNova-U1, a multimodal image generation model built on NEO-Unify! This architecture drops the visual encoder and VAE entirely. It generates images natively as one system that can handle understanding, reasoning, and generation processes. @SenseTime_AI 🤖
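Summary: SenseTime open-sourced SenseNova-U1, a multimodal image generation model built on the NEO-Unify architecture. The architecture drops the traditional visual encoder and VAE entirely and natively unifies understanding, reasoning, and generation in one system. The series (8B and A3B) leads open-source models in efficiency, offering commercial-grade performance and strong cost-effectiveness at a compact size. Signature features include natively generating interleaved image-text content, useful for practical cases such as how-to guides, and high-density information rendering for richly structured layouts such as knowledge illustrations, posters, slides, and comics. The models are open-sourced on Hugging Face, GitHub, and other platforms.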
Thank you @liuziwei7 for co‑creating the future of #multimodal intelligence with us!
𝗬𝗲𝘀, 𝗦𝗲𝗻𝘀𝗲𝗡𝗼𝘃𝗮 𝗨1 𝗶𝘀 𝗻𝗼𝘄 𝗮𝘃𝗮𝗶𝗹𝗮𝗯𝗹𝗲 𝗼𝗻 𝗛𝘂𝗴𝗴𝗶𝗻𝗴 𝗙𝗮𝗰𝗲 𝗮𝗻𝗱 𝗚𝗶𝘁𝗛𝘂𝗯! Discover how it enables complex #infographic creation with semantic precision and pixel‑level fidelity. Hugging Face: https://huggingface.co/collections/sensenova/sensenova-u1 GitHub: https://github.com/OpenSenseNova/SenseNova-U1 Discord: https://discord.gg/cxkwXWjp
HappyHorse 1.0 is now live on @fal. Go build.
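[Quoting @fal]: Happy Horse 1.0 is live on fal, day 0 🐎 🎬 Best-in-class motion quality 🎧 Native 1080p with synced audio in one pass 🔗 Joint audio-video generation, not stitched together 🔓 Fewer restrictions, broader commercial use ⚡ Built for production scale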
Serving LLMs well is a challenging task, and it requires engineering acumen and good taste. Thanks to the @Modular team's high-caliber engineers for making the collaboration a reality. Ecosystem FTW! 🤠👏
Mistral Medium incoming. The only relevant european AI company is going to release another model.
NVIDIA released Nemotron 3 Nano Omni, a long-context multimodal model that can process text, images, audio, and video. It performs well in real-world applications such as document analysis, automatic speech recognition, audio-video understanding, and agentic computer use, and shows leading accuracy and efficiency across multiple benchmarks.