AIHOT


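Berryxia.AI @berryxia · June 3 · 64

Microsoft's new model MAI-Image-2.5 has taken second place in image editing. That shows GPT-Image-2 is still the strongest, in first place! Google's Nano Banana model has now been overtaken by Microsoft's MAI… Can big brother Google ship something new? My Pro membership is about to expire…

Summary: Microsoft released MAI-Image-2.5, which placed second in the Image Edit Arena (single-image edit) with a score of 1401. According to the arena data, it scores 10 points above Nano Banana 2, Grok Imagine Image Quality, and ChatGPT-Image-Latest-High Fidelity. Despite the gain, GPT-Image-2 still holds first place. Source: X user @berryxia.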

meng shao @shao__meng · June 3 · 72

Microsoft Build released 7 models in one go! Microsoft, I'll trust you one last time (1)(1)(1)(1)(1)(1)(1) 😄

MiniMax (official) @MiniMax_AI · June 3 · 74

We wrapped a live session on M3 yesterday with the @togethercompute team & our researchers @zpysky1125 and @HaohaiSun. A few highlights 🧵

1. MSA (MiniMax Sparse Attention) is the star ⭐️. Unlike CSA/HCA, which compress the KV cache, MSA keeps the real, uncompressed KV and does block-level selection with a small top-K. That's how the 1M context window stays tractable.
2. The efficiency win is huge. In our previous generation, ~30% of per-decode wall-clock time went to the attention kernel. With MSA that now drops to ~5%. Big gains for long-context generation.
3. M3 isn't just a coding model. Natively multimodal (image + video in), ability to handle long-horizon agentic tasks, and even operate a desktop computer. People are already throwing game-dev + Minecraft-style builds at it (Unity included) and it's holding its own.
4. M3 can self-evaluate on vision-coding tasks: it builds a website or SVG, browses and inspects its own rendered output, judges it, and iterates - grading work visually.
5. We're also seeing junior-analyst-level performance on finance tasks; something we haven't even showcased publicly yet.
6. What's next: harder long-horizon / multi-file tasks in future releases, scaling data + post-training (RL) compute toward pre-training scale, and going deeper into finance, legal & bio.

Thanks to everyone who joined 🙏 Try M3 link in the comments👇

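Summary: MiniMax shared key details about M3 in a live session. Its MSA (MiniMax Sparse Attention) performs block-level top-K selection while keeping the real, uncompressed KV cache, which keeps the 1M-token context window tractable. The attention kernel's share of per-decode wall-clock time drops from about 30% in the previous generation to about 5%, a large gain for long-context generation. M3 is natively multimodal (image and video input), handles long-horizon agentic tasks and desktop operation, and can visually self-evaluate and iterate on its own output. It shows junior-analyst-level performance on finance tasks. Future releases will target harder long-horizon tasks and go deeper into finance, legal, and bio. Together AI powers its inference.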
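To make the block-level selection idea concrete, here is a minimal pure-Python sketch for a single query vector. It is an illustration under stated assumptions, not MiniMax's implementation: the function name `block_topk_attention`, the block size, and the use of each block's mean key as the selection score are hypothetical choices, since the post only says that MSA keeps the real, uncompressed KV and selects blocks with a small top-K.

```python
import math


def block_topk_attention(q, keys, values, block_size=4, top_k=2):
    """Attend over only the top_k highest-scoring KV blocks (illustrative only)."""
    n = len(keys)
    # Split cache positions into contiguous blocks.
    blocks = [range(start, min(start + block_size, n))
              for start in range(0, n, block_size)]

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def block_score(block):
        # ASSUMPTION: score a block by q . mean(key); the real scoring rule is not public here.
        mean_key = [sum(keys[i][d] for i in block) / len(block)
                    for d in range(len(q))]
        return dot(q, mean_key)

    ranked = sorted(range(len(blocks)),
                    key=lambda b: block_score(blocks[b]), reverse=True)
    selected = sorted(ranked[:top_k])
    positions = [i for b in selected for i in blocks[b]]

    # Ordinary softmax attention, but only over the original (uncompressed)
    # keys and values that live in the selected blocks.
    scale = 1.0 / math.sqrt(len(q))
    logits = [dot(q, keys[i]) * scale for i in positions]
    peak = max(logits)
    weights = [math.exp(l - peak) for l in logits]
    total = sum(weights)
    out = [sum(w * values[i][d] for w, i in zip(weights, positions)) / total
           for d in range(len(values[0]))]
    return out, selected
```

With `top_k` at or above the number of blocks, the function reduces to ordinary full attention; smaller values shrink the number of keys each decode step touches, which is the intuition behind the efficiency figures quoted in the post.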


MiniMax (official) @MiniMax_AI · June 3 · 80

MiniMax-M3 #6 overall on @ValsAI, the new open-weight SOTA 🚀

Rohan Paul @rohanpaul_ai · June 3 · 81

Microsoft unveiled MAI-Thinking-1. So Microsoft now has a full in-house pipeline for building stronger reasoning models again and again. Microsoft calls this system a “hill-climbing machine,” meaning it keeps improving the data, training setup, rewards, safety tests, and evaluations as one connected process. Strong for its size, including 97.0% on AIME 2025, 87.7% on LiveCodeBench v6, and 52.8% on SWE-Bench Pro. MAI-Thinking-1 is the first model from that process, using 35B active parameters inside a 1T total parameter mixture-of-experts model, where only part of the model runs for each token. The base model was trained from scratch on 30T mostly human-generated tokens, with Microsoft saying it avoided third-party model distillation during pre-training. After that, the team used reinforcement learning, which means the model practiced tasks and improved from feedback, to teach math reasoning, coding, tool use, helpfulness, and safety.

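Summary: Microsoft released MAI-Thinking-1, a mixture-of-experts (MoE) model with 35B active parameters out of 1T total. It was pre-trained from scratch on 30T tokens without third-party model distillation. Microsoft calls its iterative improvement pipeline a "hill-climbing machine." On benchmarks it scores 97.0% on AIME 2025, 87.7% on LiveCodeBench v6, and 52.8% on SWE-Bench Pro.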
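The phrase "only part of the model runs for each token" can be illustrated with a toy top-k router. This is a generic sketch of mixture-of-experts routing, not Microsoft's design: the function names, the softmax-over-selected-experts weighting, and `top_k=2` are assumptions made for illustration. Only the 35B and 1T figures come from the post.

```python
import math


def route_token(gate_logits, top_k=2):
    """Pick the top_k experts for one token and normalize their gate weights."""
    ranked = sorted(range(len(gate_logits)),
                    key=lambda e: gate_logits[e], reverse=True)
    chosen = ranked[:top_k]
    peak = max(gate_logits[e] for e in chosen)
    exps = [math.exp(gate_logits[e] - peak) for e in chosen]
    total = sum(exps)
    return [(e, w / total) for e, w in zip(chosen, exps)]


def moe_forward(x, experts, gate_logits, top_k=2):
    """Run only the selected experts and mix their outputs by gate weight."""
    return sum(weight * experts[e](x)
               for e, weight in route_token(gate_logits, top_k))


# Reported sizes: 35B active parameters out of roughly 1T total.
ACTIVE_PARAMS = 35e9
TOTAL_PARAMS = 1e12
ACTIVE_FRACTION = ACTIVE_PARAMS / TOTAL_PARAMS  # about 3.5% of weights per token
```

The unselected experts are never called, which is why per-token compute tracks the active parameter count rather than the total.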


Chubby♨️ @kimmonismus · June 3 · 63

Mai-1 thinking: mid-size model, 45b active parameters, MoE, side by side with Sonnet 4.6, 0 distillation. "Microsoft's first reasoning model"

Artificial Analysis @ArtificialAnlys · June 3 · 64

Microsoft has released MAI-Transcribe-1.5: an exceptionally fast speech transcription model at a speed factor of ~276x, while still achieving 2.4% on AA-WER (#3), leading the accuracy-speed Pareto frontier. MAI-Transcribe-1.5 is Microsoft AI (MAI)'s latest speech transcription model, coming in at 3rd overall on the Artificial Analysis Word Error Rate (AA-WER) leaderboard, behind Alibaba's Fun-Realtime-ASR-preview (1.7% WER) and ElevenLabs Scribe v2 (2.2% WER). The model stands out as the fastest STT model in the top 10 for accuracy, processing audio at ~276x real-time - this is more than double the speed of the second fastest model in the top 10 for accuracy. The new model supports keyword biasing (improved recognition of rarer vocabulary such as names and medical terminology), in addition to support for 43 languages including English, French, Arabic, Japanese, and Chinese. See more details below ⬇️

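Summary: Microsoft AI released the MAI-Transcribe-1.5 speech transcription model. It ranks third on the AA-WER leaderboard with a 2.4% word error rate (WER), behind Alibaba's Fun-Realtime-ASR-preview (1.7%) and ElevenLabs Scribe v2 (2.2%). Its standout trait is speed: it processes audio at about 276x real time, more than double the second-fastest model in the accuracy top 10, which puts it at the front of the accuracy-speed Pareto frontier. It also supports keyword biasing (better recognition of rarer vocabulary such as names and medical terms) and 43 languages, including English, French, Arabic, Japanese, and Chinese.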
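For readers unfamiliar with the two metrics, the sketch below shows the textbook word-level definition of WER (edit distance divided by reference length) and what a speed factor means in wall-clock terms. How AA-WER normalizes text or aggregates across datasets is not described in the post, so the helper names and the simple whitespace tokenization here are assumptions.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edits needed to turn the first i-1 reference words
    # into the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cur.append(min(prev[j] + 1,                       # deletion
                           cur[j - 1] + 1,                    # insertion
                           prev[j - 1] + (ref_word != hyp_word)))  # substitution
        prev = cur
    return prev[-1] / len(ref)


def transcription_seconds(audio_seconds, speed_factor):
    """Processing time implied by a real-time speed factor."""
    return audio_seconds / speed_factor
```

At the quoted ~276x, `transcription_seconds(3600, 276)` works out to roughly 13 seconds for an hour of audio.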


🚨 AI News | TestingCatalog @testingcatalog · June 3 · 70

MICROSOFT 🔥: New MAI Code 1 Flash and MAI Thinking 1 models have been revealed on the official MAI website! Also, MAI Image 2.5, MAI Voice 2, and MAI Transcribe 1.5 are there too. > MAI-Code-1-Flash plans and reasons through complex coding tasks from start to finish, so you spend less time debugging and more time building. > MAI-Thinking-1 (35B active, ~1T total parameters, MoE) has a smaller inference footprint than much larger models, yet is competitive with Claude Opus 4.6 on SWE-Bench Pro. h/t @MeetPatelTech

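Summary: Microsoft updated the MAI model lineup on its official site, headlined by MAI Code 1 Flash and MAI Thinking 1. MAI Thinking 1 is an MoE model with 35B active and about 1T total parameters; it has a smaller inference footprint than much larger models yet is competitive with Claude Opus 4.6 on SWE-Bench Pro. MAI Code 1 Flash focuses on planning and reasoning through complex coding tasks end to end. MAI Image 2.5, MAI Voice 2, and MAI Transcribe 1.5 are listed as well.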

Artificial Analysis @ArtificialAnlys · June 3 · 62

Krea 2 Medium debuts at #6 on the Artificial Analysis Text to Image Leaderboard, trailing only models from OpenAI, Google, and NVIDIA! Krea 2 is @krea_ai's first image model family trained entirely from scratch (Krea 1 was developed in collaboration with Black Forest Labs). Krea 2 is available in two variants: Krea 2 Medium, and Krea 2 Large, which is more comparable to FLUX.2 [pro] in our arena. Notably, Krea 2 Medium outranks the larger, more expensive Krea 2 Large in our arena. Krea describes Medium as smaller and faster, with extensive post-training that makes its outputs especially stable and consistent across generations. While Large is positioned as the more capable model, our leaderboard results align with Krea's view that Medium "handles the broadest range of use cases reliably." Both models generate at 1K resolution and share a distinct set of generation controls via the API: ➤ Style transfer: Krea can extract the style of up to 10 reference images, with each image being able to be weighted in terms of importance ➤ Creativity Setting: A configurable API parameter (raw, low, medium, high) that sets how closely the model follows the prompt versus reinterpreting it ➤ Moodboards: A collection of images that can be collected in the application to apply a style transfer onto the image (separate from individual style reference images) At $30 per 1k images via Krea's API, Krea 2 Medium is priced below comparable models such as Nano Banana Pro at $134/1k images or grok-imagine-image-quality at $50/1k images. Krea 2 Large is priced at $60 per 1k images, and both models' prices increase with the use of the Style Transfer and Moodboard features. Both models are available in the Krea app, via Krea's API, and on official third-party launch partners. Congratulations to @krea_ai on the launch! See below for comparisons between Krea 2 and other leading models in our Artificial Analysis Image Arena 🧵

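Summary: Krea 2 Medium, a text-to-image model trained from scratch by Krea AI, debuts at #6 on the Artificial Analysis leaderboard, trailing only models from OpenAI, Google, and NVIDIA. Notably, the smaller, faster Medium variant outranks the Large variant, which is positioned as more capable. Both models generate 1K-resolution images and expose style transfer and creativity controls through the API. Krea 2 Medium costs $30 per 1k images and Krea 2 Large $60 per 1k images.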

StepFun @StepFun_ai · June 2 · 73

Open weights are moving from model cards into real coding workflows. Step 3.7 Flash is designed for fast agentic coding, reliable tool calling, and multimodal understanding. Big thanks for the blog from the @kilocode team: https://blog.kilo.ai/p/new-models-from-stepfun-and-minimax

Summary: StepFun released Step 3.7 Flash, an open-weight model designed for fast agentic coding with reliable tool calling and multimodal understanding. In the same week MiniMax released M3, also as an open-weight model, and both are available in Kilo. The launches underline how open-weight models are moving from model cards into real coding workflows.

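MiniMax (official) @MiniMax_AI · June 2 · 72

Watch M3 reach the frontier 🚀

Summary: MiniMax released M3, billed as the first open-weights model to combine three frontier capabilities: coding and agentic performance, a 1M context length, and native multimodality. Reported coding and agentic scores: SWE-Bench Pro 59.0%, Terminal Bench 2.1 66.0%, SWE-fficiency 34.8%, KernelBench Hard 28.8%, and MCP Atlas 74.2%. The 1M context is supported by MiniMax Sparse Attention. API access and a new MiniMax Code service are available; the weights and technical report are expected in about 10 days.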

StepFun @StepFun_ai · June 2 · 74

We probably don't talk enough about "usable."

Summary: When Flash models bring speed, cost, and intelligence into the "usable" range at once, the way intelligence is supplied changes structurally.

SenseTime @SenseTime_AI · June 2 · 73

Thanks for using our model to create these complex charts and diagrams. It's great to see challenging information transformed into clear, accurate, and readable visuals. That's what we aim for. 😄

SenseTime @SenseTime_AI · June 2 · 71

Turning complex information into accurate charts and diagrams. That's SenseNova-U1-8B-MoT-Infographic. Learn more: https://x.com/SenseTime_AI/status/2061465029959209106?s=20

StepFun @StepFun_ai · June 2 · 69

This is exactly the philosophy: don't bolt on efficiency, design for it from day one. MFA + AFD aren't tricks. They're what lets Step 3.7 Flash serve at a fraction of the KV-cache cost. Huge thanks to @FireworksAI_HQ for making Step 3.7 Flash one-click to run. Go build something agentic with it.

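Summary: StepFun released Step 3.7 Flash, an inference-optimized 196B MoE model designed for inference efficiency from day one. Its multi-matrix factorization attention (MFA) brings KV-cache cost down to about 22% of a DeepSeek model's, and attention-FFN disaggregation (AFD) enables hardware-efficient serving. The model is available through Fireworks AI under the Apache 2.0 license and can be used to build agentic applications.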

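MiniMax (official) @MiniMax_AI · June 2 · 78

Watch open source reach the frontier. 🚀

Summary: MiniMax announced M3, the first open-weights model to combine three frontier capabilities: coding and agentic performance, with reported scores on SWE-Bench Pro and other benchmarks; a context window extended to 1M tokens through MiniMax Sparse Attention; and native multimodality from step zero. The weights and technical report will be released in about 10 days.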

Alibaba Cloud @alibaba_cloud · June 2 · 82

👏👏 Introducing Qwen3.7-Plus — a multimodal agent model that unifies vision and language into one versatile agent foundation. ✅ Multimodal interactive hybrid agent: unified GUI & CLI operation across visual and text tasks ✅ Versatile coding agent & productivity assistant with full-modality input ✅ Visual Agent: perception, reasoning, grounding, and search-augmented QA ✅ Cross-harness generalization across diverse agent frameworks One model. Sees, thinks, codes, acts.🙌🙌 Now available via API on Alibaba Cloud Model Studio. Try it — let us know what you build.😎 🔗🔗⬇️⬇️ Blog：https://qwen.ai/blog?id=qwen3.7-plus Qwen Studio：https://int.alibabacloud.com/m/1000413837/ API：https://int.alibabacloud.com/m/1000413829/

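Summary: Alibaba Cloud introduced Qwen3.7-Plus, a multimodal agent model that unifies vision and language. It is positioned as a versatile coding agent and productivity assistant with full-modality input that operates across GUI and CLI. Its visual agent capabilities cover perception, reasoning, grounding, and search-augmented QA, and it generalizes across diverse agent frameworks. It is available now via API on Alibaba Cloud Model Studio.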

MiniMax (official) @MiniMax_AI · June 2 · 74

🚀 M3 is live on Vercel's AI Gateway! Our first long-context model with 1M tokens, multimodal input. AND 50% off for the week 🎉 Love to see what everyone builds with M3 and @vercel_dev ✨

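ginobefun @hongming731 · June 2 · 71

#BestBlogs Morning Brief 06-02: MiniMax released M3, China's first open-source model to combine frontier coding, a 1M ultra-long context, and native multimodality. It autonomously completed 145 CUDA kernel iterations in 24 hours, turning abstract benchmarks into verifiable engineering strength. Meanwhile, a former xAI lead offers a counterintuitive take: the ceiling for video models tracks LLMs, and the next Sora will be a video agent rather than a better video model. Today's BestBlogs brief also features 10 picks, including the AI coding conventions for Chromium's 35-million-line codebase, production engineering practices for voice agents, and "RAG is not machine learning." Happy reading.

Summary: MiniMax open-sourced M3, China's first model to integrate frontier coding, a 1M ultra-long context, and native multimodality. The model can autonomously complete 145 CUDA kernel iterations within 24 hours. Separately, a former xAI lead argues that the ceiling for video models will be set by LLMs, and that the next Sora-like product should be a video agent rather than a pure video generation model.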

Alibaba Cloud @alibaba_cloud · June 2 · 83

👏👏 Introducing Qwen3.7-Plus — a multimodal agent model that unifies vision and language into one versatile agent foundation. ✅ Multimodal interactive hybrid agent: unified GUI & CLI operation across visual and text tasks ✅ Versatile coding agent & productivity assistant with full-modality input ✅ Visual Agent: perception, reasoning, grounding, and search-augmented QA ✅ Cross-harness generalization across diverse agent frameworks One model. Sees, thinks, codes, acts.🙌🙌 Now available via API on Alibaba Cloud Model Studio. Try it — let us know what you build.😎 🔗🔗⬇️⬇️ Blog：https://qwen.ai/blog?id=qwen3.7-plus Qwen Studio：https://chat.qwen.ai/?models=qwen3.7-plus API：https://modelstudio.console.alibabacloud.com/ap-southeast-1?tab=doc#/doc/?type=model&url=2840914_2&modelId=qwen3.7-plus&serviceSite=international

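Summary: Alibaba Cloud released Qwen3.7-Plus, a multimodal agent model that unifies vision and language. It is meant to serve as a general-purpose agent foundation: it operates through both GUI and CLI, handles visual and text tasks, and acts as a coding agent and productivity assistant. Its capabilities span visual perception, reasoning, grounding, and search-augmented QA, and it generalizes across diverse agent frameworks. It is available via API on Alibaba Cloud Model Studio.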

MiniMax (official) @MiniMax_AI · June 2 · 81

M3 on Cloudflare AI Gateway, day one ⚡ Frontier coding, 1M context, and native multimodal, and now just one fetch away. It is time to build something. 🦞

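Chubby♨️ @kimmonismus · June 2 · 79

Qwen3.7 plus released. Looks good, but why do they compare their models to GPT-5.4 and Opus 4.6? Anyways, multimodal as well

Summary: Alibaba's Qwen3.7-Plus is officially out. It is a multimodal agent foundation model that unifies vision and language. Core features: an interactive hybrid agent that operates across GUI and CLI, a versatile coding agent and productivity assistant, a visual agent with perception, reasoning, grounding, and search-augmented QA, and generalization across agent frameworks. It is available via API on Alibaba Cloud Model Studio. The comparison with GPT-5.4 and Opus 4.6 in the launch post prompted discussion among users about which products it is benchmarked against.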

MiniMax (official) @MiniMax_AI · June 2 · 55

napkin sketch → playable game for $0.028 😳 this is the kind of thing M3 was built for @atomic_chat_hq

xAI @xai · June 2 · 67

Composer 2.5 is now available inside Grok Build. Composer 2.5 is a fast, highly intelligent model that excels on long-running tasks and following complex instructions.

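MiniMax (official) @MiniMax_AI · June 2 · 69

messy, multimodal, too large for a normal chat? M3 handles it 🫡 @happycapyai

Summary: MiniMax M3 is now live on Happycapy. The main upgrade is in handling messy, multimodal, large-scale tasks. It takes native multimodal input, including PDFs, video, images, screenshots, and long documents, and is strong on coding and agentic tasks such as repo-level debugging and issue tracing. M3 is open-weight and priced at roughly one third of Sonnet.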

Qwen @Alibaba_Qwen · June 2 · 83

👏👏 Introducing Qwen3.7-Plus — a multimodal agent model that unifies vision and language into one versatile agent foundation. ✅ Multimodal interactive hybrid agent: unified GUI & CLI operation across visual and text tasks ✅ Versatile coding agent & productivity assistant with full-modality input ✅ Visual Agent: perception, reasoning, grounding, and search-augmented QA ✅ Cross-harness generalization across diverse agent frameworks One model. Sees, thinks, codes, acts.🙌🙌 Now available via API on Alibaba Cloud Model Studio. Try it — let us know what you build.😎 🔗🔗⬇️⬇️ Blog：https://qwen.ai/blog?id=qwen3.7-plus Qwen Studio：https://chat.qwen.ai/?models=qwen3.7-plus API：https://modelstudio.console.alibabacloud.com/ap-southeast-1?tab=doc#/doc/?type=model&url=2840914_2&modelId=qwen3.7-plus&serviceSite=international

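Summary: Qwen introduced Qwen3.7-Plus, a multimodal agent model that unifies vision and language. It supports hybrid GUI and CLI operation, serves as a versatile coding agent and productivity assistant, and offers visual perception, reasoning, grounding, and search-augmented QA. It is designed to generalize across diverse agent frameworks and is available via API on Alibaba Cloud Model Studio.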

MiniMax (official) @MiniMax_AI · June 2 · 54

26% improvement on BU Bench 👀 more to come

MiniMax (official) @MiniMax_AI · June 2 · 78

this is what model-and-agent alignment looks like 🤝 @SimularAI

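MiniMax (official) @MiniMax_AI · June 2 · 76

day 0 launch partner energy 🔥 @Qubrid_AI is offering 50% off for early adopters. go run it!

Summary: MiniMax M3 is now live on Qubrid AI. It offers a 1M-token context, native multimodality, frontier coding performance, and support for long-horizon agentic workflows, and is described as one of the most technically interesting open-weight models of the year. As a launch partner, Qubrid AI is offering early adopters 50% off.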

Artificial Analysis @ArtificialAnlys · June 2 · 77

NVIDIA's Cosmos 3 lands at #1 among open weights models in both Text to Image and Image to Video on the Artificial Analysis Leaderboards! Cosmos 3 is a family of omnimodal world models for Physical AI from @nvidia, unifying language, image, video, audio and action in a single Mixture-of-Transformers architecture that pairs an autoregressive reasoner with a diffusion generator. The family comes in four variants: base Nano (16B: 8B reasoner tower + 8B generator tower) and Super (64B: 32B reasoner tower + 32B generator tower) models, with the Super model also having Text2Image and Image2Video fine-tuned variants, which are the versions listed in the Artificial Analysis Arena Leaderboards. Cosmos3-Super-Text2Image (agentic) runs through an agentic prompt-upsampling harness, and takes the #1 open weights spot in Text to Image, surpassing HiDream-O1-Image-Dev-2604, Alibaba's Qwen Image Max 2512 and Black Forest Labs' FLUX.2 [dev]. Cosmos3-Super-Image2Video takes #1 open weights in Image to Video (No Audio), ahead of Lightricks' LTX-2, and Alibaba's Wan 2.2 A14B. Cosmos 3 generators take structured JSON prompts rather than plain text, so prompt upsampling is needed to reproduce these results. This upsampling can be handled by an external harness or by the model's own reasoner branch, so it can also run self-contained. Cosmos 3 is fully open under the OpenMDW 1.1 license, shipping with weights, code, curated datasets and fine-tuning recipes available on @huggingface. First-party and third-party APIs are expected over the next few weeks, with pricing to follow. See the thread below for example generations and a link to try Cosmos 3 in our arena 🧵

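Summary: NVIDIA's Cosmos 3 omnimodal world model takes #1 among open-weight models in both Text to Image and Image to Video on the Artificial Analysis leaderboards. It uses a Mixture-of-Transformers architecture that pairs an autoregressive reasoner with a diffusion generator and comes in variants including the 16B Nano and the 64B Super. Cosmos3-Super-Text2Image surpasses HiDream-O1-Image-Dev-2604, Alibaba's Qwen Image Max 2512, and FLUX.2 [dev]; Cosmos3-Super-Image2Video leads LTX-2 and Alibaba's Wan 2.2 A14B. The generators take structured JSON prompts, with prompt upsampling handled either by an external harness or by the model's own reasoner branch. Cosmos 3 is fully open under the OpenMDW 1.1 license, with weights, code, curated datasets, and fine-tuning recipes.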

Chubby♨️ @kimmonismus · June 2 · 82

MiniMax just dropped M3! It hits 59% on SWE-Bench Pro, edging out GPT-5.5 (58.6%) and beating Gemini 3.1 Pro (54.2%). Trails Opus 4.7 on coding, but leads it on autonomous browsing at 83.5% on BrowseComp. First open model to pack frontier coding, a 1M-token context, and native multimodality into one system. I mean, let that sink in: Roughly 12x cheaper per token than GPT-5.5, with weights and a full tech report promised in about 10 days.

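Summary: MiniMax released M3, the first open model to pack frontier coding, a 1M-token context window, and native multimodality into one system. It scores 59.0% on SWE-Bench Pro, edging out GPT-5.5 (58.6%) and beating Gemini 3.1 Pro (54.2%), and leads Opus 4.7 on autonomous browsing with 83.5% on BrowseComp. It also posts 66.0% on Terminal Bench 2.1 and 74.2% on MCP Atlas. Its per-token cost is roughly one twelfth of GPT-5.5's, and the weights and technical report are expected in about 10 days.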

Rohan Paul @rohanpaul_ai · June 2 · 74

Nemotron 3 Ultra will be available from Nvidia in a few days. Hybrid SSM (state-space models) + mixture-of-experts architecture. The SSM part is built for long sequences, so the model can keep reasoning or using tools for longer without getting crushed by the usual attention cost. Jensen Huang at NVIDIA GTC Taipei 2026 ---- From 'NVIDIA' YT channel (link in comment)

🚨 AI News | TestingCatalog @testingcatalog · June 1 · 58

MiniMax M3 is now live inside Atomic Chat 👀 Atomic tested M3 on a task to read a hand-drawn napkin sketch, write the game logic, build the UI, and ship a playable HTML platformer in one pass. All this for $0.028 🤖

Summary: MiniMax M3 is now integrated into Atomic Chat. In one test, Atomic Chat had M3 read a hand-drawn, doodle-style sketch of a platform-jumping game, then write the game logic, build the UI, and ship a runnable standalone HTML game in a single pass. The run consumed 6,920 input tokens and produced 9,933 output tokens, for a total cost of just $0.028. MiniMax also plans to release M3 on Hugging Face next week.

SenseTime @SenseTime_AI · June 1 · 67

Getting charts and diagrams right with #AI 📊 Most AI models still struggle with these data visuals: negatives shown as positives, bar positions off, element relationships scrambled. SenseNova-U1-8B-MoT-Infographic breaks through that barrier. Generate accurate visuals, then tweak the design and layout on the fly. See the difference and try it yourself: 🤗 https://huggingface.co/sensenova/SenseNova-U1-8B-MoT-Infographic 🖼️ Showcases: https://github.com/OpenSenseNova/SenseNova-U1/blob/main/docs/u1_infographic_showcases.md 👾 Discord: https://discord.gg/BuTXPHmQub @huggingface @github

Summary: Most AI models make errors when generating charts, such as negative values shown as positive, misplaced bars, and scrambled relationships between elements. SenseNova-U1-8B-MoT-Infographic is built to address these problems: it generates accurate charts and supports adjusting design and layout on the fly. The model is available on Hugging Face, with showcases on GitHub.

Chubby♨️ @kimmonismus · June 1 · 83

1/ NVIDIA just open-sourced Cosmos 3 at GTC Taipei! It's the first fully open "omnimodel" for physical AI - one model that understands the real world, predicts what happens next, and generates the actions a robot should take. Weights, code, datasets. All open. And this is really big. Let's dig into everything: 🧵

Summary: NVIDIA announced at GTC Taipei that Cosmos 3 is fully open-sourced. It is described as the first "omnimodel" for physical AI, with native visual reasoning: it understands the real world, predicts what happens next, and generates the actions a robot should take. The release includes two variants, Super (32B) and Nano (8B). Weights, code, and datasets are all open.

SiliconFlow @SiliconFlowAI · June 1 · 79

Coding like Opus4.7 / 1M context window / Native multimodal @MiniMax_AI M3 is now on SiliconFlow with day-0 support 🔥 🎉 Limited-time 50% off for 7 days Cache / Input / Output: $0.06 / $0.30 / $1.20 per 1M tokens (Regular: $0.12 / $0.60 / $2.40) M3 is the first open-source model combining all three frontier capabilities: → Coding & Agentic: beats GPT-5.5 and Gemini 3.1 Pro on SWE-Bench Pro → 1M context via MiniMax Sparse Attention → Native multimodal from step zero — image, video & computer use Try it on SiliconFlow ⬇️

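Summary: MiniMax M3 is now on SiliconFlow with a limited-time 50% discount for 7 days. Promotional pricing per 1M tokens: cache $0.06, input $0.30, output $1.20 (regular: $0.12, $0.60, $2.40). M3 is described as the first open-source model to combine three frontier capabilities: coding and agentic performance, beating GPT-5.5 and Gemini 3.1 Pro on SWE-Bench Pro; a 1M-token context window via MiniMax Sparse Attention; and native multimodality covering image, video, and computer use.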
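The per-million-token prices translate into a per-request cost with simple arithmetic. The sketch below uses the rates quoted in the post; the function name, the tier labels, and the assumption that cached tokens are billed separately at the cache rate are illustrative choices, not SiliconFlow's documented billing rules.

```python
# USD per 1M tokens, as quoted in the post.
PRICES_PER_MILLION = {
    "regular": {"cache": 0.12, "input": 0.60, "output": 2.40},
    "promo": {"cache": 0.06, "input": 0.30, "output": 1.20},
}


def request_cost(input_tokens, output_tokens, tier="regular", cached_tokens=0):
    """Estimate the USD cost of one request from token counts."""
    rates = PRICES_PER_MILLION[tier]
    # ASSUMPTION: cached tokens are billed at the cache rate, in addition to
    # (not instead of) the uncached input tokens passed in input_tokens.
    micro_dollars = (input_tokens * rates["input"]
                     + output_tokens * rates["output"]
                     + cached_tokens * rates["cache"])
    return micro_dollars / 1_000_000
```

For example, `request_cost(6920, 9933)` comes to about $0.028 at the regular rates, which lines up with the figure quoted for the Atomic Chat napkin-sketch test elsewhere in this feed, although the feed does not say whether that test was billed at these rates.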


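MiniMax (official) @MiniMax_AI · June 1 · 73

1. Video control + gaming + M3 2. Open weights + massive context ++ strong coding 3. Canceling my weekend plans now

[Quoting @MinLiBuilds]: Saying bye-bye to the old 20K context. MiniMax M3 is out, with three highlights: 1M context, native multimodality, and agentic capability. I ran a full evaluation this time using a CC workflow, @ZenMuxAI, and MiniMax M3: given one screenshot, build a "mortal cultivation sword-formation duel" gesture game. Requirements: support two-player duels, use the workflow to break down the task, and add a rock-paper-scissors mechanic. Two hours later the game was actually running. I now know the winning build for this generation of LLMs: 1M context + multimodality + agent mode. The 1M context is the foundation for reasoning depth, and multiple agents handle task decomposition and execution.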

🚨 AI News | TestingCatalog @testingcatalog · June 1 · 55

NVIDIA announced an upcoming release of Nemotron 3 Ultra later this week, a 550B-parameter open-weight model. According to Artificial Analysis, it is positioned as the most intelligent open-weight model from a US lab. Soon 👀

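karminski-牙医 @karminski3 · June 1 · 79

Please, I'm begging you all, take a break. I really can't keep up with the testing 🥲

Summary: MiniMax released MiniMax M3, billed as the first open-weights model to combine three frontier capabilities: frontier coding and agentic performance, with reported scores on SWE-Bench Pro and other benchmarks; MiniMax Sparse Attention extending the context length to 1M; and native multimodality. The weights and technical report are expected in about 10 days.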

MiniMax (official) @MiniMax_AI · June 1 · 64

It truly is 😎 #M3

June 3

09:48

Berryxia.AI@berryxia

64

Microsoft's MAI-Image-2.5 places second in image editing arena

Microsoft released MAI-Image-2.5, which placed second in the Image Edit Arena (single-image edit) with a score of 1401. According to the arena data, it scores 10 points above Nano Banana 2, Grok Imagine Image Quality, and ChatGPT-Image-Latest-High Fidelity. Despite the gain, GPT-Image-2 still holds first place. Source: X user @berryxia.

Arena.ai: MAI-Image-2.5 has officially released from @MicrosoftAI landing at #2 in the Image Edit Arena (Single-Image-Edit) with a...

Microsoft · Image generation · Model release

09:13

meng shao@shao__meng

72

Microsoft Build released 7 models in one go! Microsoft, I'll trust you one last time (1)(1)(1)(1)(1)(1)(1) 😄

Satya Nadella: 5/With our 7 new MAI models + Frontier Tuning, we are helping every company move from just consuming frontier models to ...

Microsoft · Model release

06:55

MiniMax (official)@MiniMax_AI

74

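MiniMax shares M3 launch details

MiniMax shared key details about M3 in a live session. Its MSA (MiniMax Sparse Attention) performs block-level top-K selection while keeping the real, uncompressed KV cache, which keeps the 1M-token context window tractable. The attention kernel's share of per-decode wall-clock time drops from about 30% in the previous generation to about 5%, a large gain for long-context generation. M3 is natively multimodal (image and video input), handles long-horizon agentic tasks and desktop operation, and can visually self-evaluate and iterate on its own output. It shows junior-analyst-level performance on finance tasks. Future releases will target harder long-horizon tasks and go deeper into finance, legal, and bio. Together AI powers its inference.

Together AI: MiniMax M3 is live and Together AI is powering its inference 🚀 Tomorrow at 6pm PT we're going live on X Spaces with the...

Multimodal · Reasoning · Model release · Coding

Related discussions: 8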

06:25

MiniMax (official)@MiniMax_AI

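Featured · 80

MiniMax-M3 #6 overall on @ValsAI, the new open-weight SOTA 🚀

Vals AI: MiniMax just released MiniMax-M3, their first multimodal model. It is the new open-weight SOTA on the Vals Index and the...

Multimodal · Open-source ecosystem · Model release

Related discussions: 8

Why it's recommended: MiniMax has quietly pulled off something big: its first multimodal model takes open-weight SOTA and #6 overall. If you build multimodal apps, keep an eye out for the weights.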

05:16

Rohan Paul@rohanpaul_ai

81

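Microsoft releases the MAI-Thinking-1 model

Microsoft released MAI-Thinking-1, a mixture-of-experts (MoE) model with 35B active parameters out of 1T total. It was pre-trained from scratch on 30T tokens without third-party model distillation. Microsoft calls its iterative improvement pipeline a "hill-climbing machine." On benchmarks it scores 97.0% on AIME 2025, 87.7% on LiveCodeBench v6, and 52.8% on SWE-Bench Pro.

Microsoft · Reasoning · Model release

Related discussions: 4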

02:47

Chubby♨️@kimmonismus

63

Mai-1 thinking: mid-size model, 45b active parameters, MoE, side by side with Sonnet 4.6, 0 distillation. "Microsoft's first reasoning model"

Chubby♨️: Mustafa Suleyman, Microsoft AI: 7 new Microsoft Models, no end in sight when it comes to development, orders of magnitud...

Microsoft · Reasoning · Model release

02:47

Artificial Analysis@ArtificialAnlys

64

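Microsoft releases the MAI-Transcribe-1.5 speech transcription model

Microsoft AI released the MAI-Transcribe-1.5 speech transcription model. It ranks third on the AA-WER leaderboard with a 2.4% word error rate (WER), behind Alibaba's Fun-Realtime-ASR-preview (1.7%) and ElevenLabs Scribe v2 (2.2%). Its standout trait is speed: it processes audio at about 276x real time, more than double the second-fastest model in the accuracy top 10, which puts it at the front of the accuracy-speed Pareto frontier. It also supports keyword biasing (better recognition of rarer vocabulary such as names and medical terms) and 43 languages, including English, French, Arabic, Japanese, and Chinese.

Microsoft · Model release · Speech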

02:23

🚨 AI News | TestingCatalog@testingcatalog

70

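Microsoft releases MAI Code 1 Flash, MAI Thinking 1, and other new models

Microsoft updated the MAI model lineup on its official site, headlined by MAI Code 1 Flash and MAI Thinking 1. MAI Thinking 1 is an MoE model with 35B active and about 1T total parameters; it has a smaller inference footprint than much larger models yet is competitive with Claude Opus 4.6 on SWE-Bench Pro. MAI Code 1 Flash focuses on planning and reasoning through complex coding tasks end to end. MAI Image 2.5, MAI Voice 2, and MAI Transcribe 1.5 are listed as well.

Microsoft · Multimodal · Reasoning · Model release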

01:17

Artificial Analysis@ArtificialAnlys

62

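Krea 2 Medium ranks #6 on the text-to-image leaderboard, drawing attention for performance and pricing

Krea 2 Medium, a text-to-image model trained from scratch by Krea AI, debuts at #6 on the Artificial Analysis leaderboard, trailing only models from OpenAI, Google, and NVIDIA. Notably, the smaller, faster Medium variant outranks the Large variant, which is positioned as more capable. Both models generate 1K-resolution images and expose style transfer and creativity controls through the API. Krea 2 Medium costs $30 per 1k images and Krea 2 Large $60 per 1k images.

Image generation · Model release · Evals/Benchmarks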

June 2

21:06

StepFun@StepFun_ai

73

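StepFun released Step 3.7 Flash, an open-weight model designed for fast agentic coding with reliable tool calling and multimodal understanding. In the same week MiniMax released M3, also as an open-weight model, and both are available in Kilo. The launches underline how open-weight models are moving from model cards into real coding workflows.

Kilo: The open-weight labs did not come to play this week. StepFun dropped Step 3.7 Flash. MiniMax dropped M3. Both with open ...

MCP/Tools · Open source/Repos · Model release · Coding

Related discussions: 3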

16:53

MiniMax (official)@MiniMax_AI

72

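MiniMax released M3, billed as the first open-weights model to combine three frontier capabilities: coding and agentic performance, a 1M context length, and native multimodality. Reported coding and agentic scores: SWE-Bench Pro 59.0%, Terminal Bench 2.1 66.0%, SWE-fficiency 34.8%, KernelBench Hard 28.8%, and MCP Atlas 74.2%. The 1M context is supported by MiniMax Sparse Attention. API access and a new MiniMax Code service are available; the weights and technical report are expected in about 10 days.

MiniMax (official): Introducing MiniMax M3: The First Open-Weights Model to Combine Three Frontier Capabilities - Coding & Agentic Frontier:...

Multimodal · Reasoning · Model release · Coding

Related discussions: 8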

13:36

StepFun@StepFun_ai

74

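We probably don't talk enough about "usable." When Flash models bring speed, cost, and intelligence into the "usable" range at once, the way intelligence is supplied changes structurally.

E01: A Lab note for Step 3.7 Flash launch. -- When Flash models bring speed, cost, and intelligence into the "usable" range a...

Reasoning · Model release

Related discussions: 3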

12:35

SenseTime@SenseTime_AI

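Featured · 73

Thanks for using our model to create these complex charts and diagrams. It's great to see challenging information transformed into clear, accurate, and readable visuals. That's what we aim for. 😄

The AI Colony: SenseNova U1 just released an infographic-specialized version and +18.2 on IGenBench Q-ACC isn't a rounding error. It me...

Hugging Face · Image generation · Open-source ecosystem · Model release

Related discussions: 1

Why it's recommended: SenseNova U1's infographic specialization is not benchmark gaming; the +18.2 Q-ACC suggests the model really understands layout. If you make reports or charts, you can pull it straight from Hugging Face.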

12:35

SenseTime@SenseTime_AI

71

Turning complex information into accurate charts and diagrams. That's SenseNova-U1-8B-MoT-Infographic. Learn more: https://x.com/SenseTime_AI/status/2061465029959209106?s=20

Future Stacked: AI-generated infographics with garbled text have been a running joke. SenseNova U1's new infographic-enhanced model fina...

Hugging Face · Image generation · Multimodal · Model release

Related discussions: 1

12:06

StepFun@StepFun_ai

69

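StepFun released Step 3.7 Flash, an inference-optimized 196B MoE model designed for inference efficiency from day one. Its multi-matrix factorization attention (MFA) brings KV-cache cost down to about 22% of a DeepSeek model's, and attention-FFN disaggregation (AFD) enables hardware-efficient serving. The model is available through Fireworks AI under the Apache 2.0 license and can be used to build agentic applications.

Fireworks AI: Many research labs only consider inference efficiency after the fact. Step 3.7 Flash is a 196B MoE model, and built for ...

Agents · Open source/Repos · Reasoning · Model release

Related discussions: 3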

11:53

MiniMax (official)@MiniMax_AI

78

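MiniMax announced M3, the first open-weights model to combine three frontier capabilities: coding and agentic performance, with reported scores on SWE-Bench Pro and other benchmarks; a context window extended to 1M tokens through MiniMax Sparse Attention; and native multimodality from step zero. The weights and technical report will be released in about 10 days.

MiniMax (official): Introducing MiniMax M3: The First Open-Weights Model to Combine Three Frontier Capabilities - Coding & Agentic Frontier:...

Open-source ecosystem · Model release · Coding

Related discussions: 8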

10:36

Alibaba Cloud@alibaba_cloud

82

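Alibaba Cloud releases the Qwen3.7-Plus multimodal agent model

Alibaba Cloud introduced Qwen3.7-Plus, a multimodal agent model that unifies vision and language. It is positioned as a versatile coding agent and productivity assistant with full-modality input that operates across GUI and CLI. Its visual agent capabilities cover perception, reasoning, grounding, and search-augmented QA, and it generalizes across diverse agent frameworks. It is available now via API on Alibaba Cloud Model Studio.

Agents · Multimodal · Model release

Related discussions: 9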

08:19

MiniMax (official)@MiniMax_AI

74

🚀 M3 is live on Vercel's AI Gateway! Our first long-context model with 1M tokens, multimodal input. AND 50% off for the week 🎉 Love to see what everyone builds with M3 and @vercel_dev ✨

Vercel Developers: MiniMax M3 is available on AI Gateway. MiniMax's first long-context model, with support for multimodal inputs. 50% off f...

Multimodal · Model release

Related discussions: 8

07:54

ginobefun@hongming731

71

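MiniMax releases open-source model M3: coding, long context, and multimodality in one

MiniMax open-sourced M3, China's first model to integrate frontier coding, a 1M ultra-long context, and native multimodality. The model can autonomously complete 145 CUDA kernel iterations within 24 hours. Separately, a former xAI lead argues that the ceiling for video models will be set by LLMs, and that the next Sora-like product should be a video agent rather than a pure video generation model.

Multimodal · Open source/Repos · Model release · Coding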

07:35

Alibaba Cloud@alibaba_cloud

83

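Qwen3.7-Plus multimodal agent model released

Alibaba Cloud released Qwen3.7-Plus, a multimodal agent model that unifies vision and language. It is meant to serve as a general-purpose agent foundation: it operates through both GUI and CLI, handles visual and text tasks, and acts as a coding agent and productivity assistant. Its capabilities span visual perception, reasoning, grounding, and search-augmented QA, and it generalizes across diverse agent frameworks. It is available via API on Alibaba Cloud Model Studio.

Agents · Multimodal · Model release · Coding

Related discussions: 9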

07:19

MiniMax (official)@MiniMax_AI

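Featured · 81

M3 on Cloudflare AI Gateway, day one ⚡ Frontier coding, 1M context, and native multimodal, and now just one fetch away. It is time to build something. 🦞

Cloudflare Developers: M3 from @MiniMax_AI is now available on Cloudflare AI Gateway: - First open model to push SOTA coding frontier - 1M cont...

Multimodal · Open source/Repos · Model release · Coding

Related discussions: 8

Why it's recommended: MiniMax's M3 raises the bar for open coding models; the 1M context plus native multimodality is a pleasant surprise, and it's 50% off in launch week. Worth a run to see whether it can really beat closed models.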

03:11

Chubby♨️@kimmonismus

79

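Alibaba's Qwen3.7-Plus is officially out. It is a multimodal agent foundation model that unifies vision and language. Core features: an interactive hybrid agent that operates across GUI and CLI, a versatile coding agent and productivity assistant, a visual agent with perception, reasoning, grounding, and search-augmented QA, and generalization across agent frameworks. It is available via API on Alibaba Cloud Model Studio. The comparison with GPT-5.4 and Opus 4.6 in the launch post prompted discussion among users about which products it is benchmarked against.

Qwen: 👏👏 Introducing Qwen3.7-Plus - a multimodal agent model that unifies vision and language into one versatile agent found...

Agents · Multimodal · Model release

Related discussions: 9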

02:48

MiniMax (official)@MiniMax_AI

55

napkin sketch → playable game for $0.028 😳 this is the kind of thing M3 was built for @atomic_chat_hq

atomic.chat: MiniMax M3 turned a napkin sketch into a playable game We handed MiniMax M3 a hand-drawn draft of a Doodle Jump style pl...

Multimodal · Model release

02:30

xAI@xai

67

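Composer 2.5 is now available inside Grok Build. Composer 2.5 is a fast, highly intelligent model that excels on long-running tasks and following complex instructions.

xAI · Reasoning · Model release

Related discussions: 1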

02:18

MiniMax (official)@MiniMax_AI

69

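MiniMax M3 is now live on Happycapy. The main upgrade is in handling messy, multimodal, large-scale tasks. It takes native multimodal input, including PDFs, video, images, screenshots, and long documents, and is strong on coding and agentic tasks such as repo-level debugging and issue tracing. M3 is open-weight and priced at roughly one third of Sonnet.

Happycapy: MiniMax M3 @MiniMax_AI is now live on Happycapy 🎉 A major upgrade for agent workflows, especially when the task is mess...

Multimodal · Open source/Repos · Model release · Coding

Related discussions: 8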

02:09

Qwen@Alibaba_Qwen

83

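Qwen releases the Qwen3.7-Plus multimodal agent model

Qwen introduced Qwen3.7-Plus, a multimodal agent model that unifies vision and language. It supports hybrid GUI and CLI operation, serves as a versatile coding agent and productivity assistant, and offers visual perception, reasoning, grounding, and search-augmented QA. It is designed to generalize across diverse agent frameworks and is available via API on Alibaba Cloud Model Studio.

Agents · Multimodal · Reasoning · Model release

Related discussions: 9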

01:18

MiniMax (official)@MiniMax_AI

54

26% improvement on BU Bench 👀 more to come

Alexander Yue: MiniMax m3 is a huge 26% improvement on BU Bench with browsercode, and shows promise for some potential future improveme...

Model release · Evals/Benchmarks

01:18

MiniMax (official)@MiniMax_AI

78

这就是模型与智能体对齐的样子 🤝 @SimularAI

Simular: Today @MiniMax_AI ships M3 - the first frontier model purpose-built for computer-use agents. Natively multimodal. One mo...

Agents · MCP/Tools · Multimodal · Model release

8 related discussions

01:18

MiniMax (official)@MiniMax_AI

76

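MiniMax's M3 model is now live on the Qubrid AI platform. It offers a 1M-token context window, native multimodality, frontier coding performance, and support for long-horizon agent workflows, and has been rated one of the most technically interesting open-weight models of the year. As a launch partner, Qubrid AI is offering early users a 50% discount.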

Qubrid AI: @MiniMax_AI M3 is now live on Qubrid AI. https://platform.qubrid.com/model/minimax-m3 - 1M-token context. - Native multi...

Agents · Multimodal · Open source/Repos · Model release

8 related discussions

01:11

Artificial Analysis@ArtificialAnlys

77

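NVIDIA Cosmos 3 tops both the image and video generation leaderboards for open-weight models

NVIDIA's Cosmos 3 omnimodal world model has taken first place in both text-to-image and image-to-video in the open-weights category of the Artificial Analysis leaderboards. The model is built on a Mixture-of-Transformers architecture that combines an autoregressive reasoner with a diffusion generator, and comes in variants including a 16B-parameter Nano and a 64B-parameter Super. Cosmos3-Super-Text2Image surpasses models including HiDream-O1-Image-Dev-2604, Qwen Image Max 2512, and FLUX.2 [dev], while Cosmos3-Super-Image2Video surpasses models including LTX-2 and Wan 2.2 A14B. The Cosmos 3 generator accepts structured JSON prompts, and prompts can be upsampled either by external tools or by the model's own reasoner branch. The model is fully open source under the OpenMDW 1.1 license, with weights, code, curated datasets, and fine-tuning recipes provided.

Hugging Face · Multimodal · Open-source ecosystem · Model release

4 related discussions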

00:10

Chubby♨️@kimmonismus

82

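MiniMax has released M3, the first open-source model to combine frontier coding capability, a 1M-token context window, and native multimodality in a single system. M3 scores 59.0% on SWE-Bench Pro, ahead of GPT-5.5 (58.6%) and Gemini 3.1 Pro (54.2%), and leads Opus 4.7 on the BrowseComp autonomous browsing task with 83.5%. It also performs well on benchmarks such as Terminal Bench 2.1 (66.0%) and MCP Atlas (74.2%). Its per-token cost is about one twelfth of GPT-5.5's, and the model weights and technical report are expected to be released in 10 days.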

MiniMax (official): Introducing MiniMax M3: The First Open-Weights Model to Combine Three Frontier Capabilities - Coding & Agentic Frontier:...

Agents · Multimodal · Open-source ecosystem · Model release

8 related discussions

00:09

Rohan Paul@rohanpaul_ai

74

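Nvidia will release Nemotron 3 Ultra within a few days. It uses a hybrid SSM (state space model) + mixture-of-experts architecture. The SSM part is designed for long sequences, so the model can keep reasoning or using tools for longer without being overwhelmed by the usual attention cost. Jensen Huang said this at NVIDIA GTC Taipei 2026. ---- From the 'NVIDIA' YouTube channel (link in the comments)

Reasoning · Model release

June 1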

23:43

🚨 AI News | TestingCatalog@testingcatalog

58

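MiniMax M3 is now integrated into Atomic Chat. In one test, Atomic Chat used M3 to read a hand-drawn sketch of a Doodle Jump-style platformer and, in a single pass, wrote the game logic, drew the interface, and delivered a runnable standalone HTML game. According to the test data, the task consumed 6,920 input tokens and generated 9,933 output tokens, for a total cost of just $0.028. MiniMax also plans to release the M3 model on Hugging Face next week.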

atomic.chat: MiniMax M3 turned a napkin sketch into a playable game We handed MiniMax M3 a hand-drawn draft of a Doodle Jump style pl...

Hugging Face · Multimodal · Model release · Coding

23:34

SenseTime@SenseTime_AI

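Featured · 67

New SenseNova model tackles AI chart generation

Most AI models make mistakes when generating charts, such as wrong values (for example, negative values shown as positive), misplaced bars, and confused relationships between elements. SenseNova-U1-8B-MoT-Infographic (SenseNova-U1) is designed specifically for this problem: it generates accurate charts and supports real-time adjustment of design and layout. The project provides the model on Hugging Face and showcases example results on GitHub.

GitHub · Hugging Face · Image generation · Model release

1 related discussion

Why it's recommended: Most AI-generated charts have labeling errors or distorted proportions. SenseTime's model focuses on infographic accuracy, which makes it worth a try for product people and analysts who build charts often.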

21:09

Chubby♨️@kimmonismus

83

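NVIDIA announced at GTC Taipei that it is fully open-sourcing Cosmos 3. It is described as the first "omnimodel" for physical AI, with native visual reasoning that lets it understand the real world, predict the future, and generate the actions a robot should take. The release includes two variants: Super (32B) and Nano (8B). The model weights, code, and datasets are all fully open.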

NVIDIA AI: Introducing Cosmos 3: Our latest frontier model for Physical AI Cosmos 3 is the world's first fully open omnimodel with ...

Embodied AI · Open source/Repos · Model release

4 related discussions

21:02

SiliconFlow@SiliconFlowAI

79

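MiniMax M3 is now live on SiliconFlow

MiniMax M3 is now live on the SiliconFlow platform, with a 50% discount for a limited 7 days. Pricing per million tokens: cache $0.06, input $0.30, output $1.20. M3 is the first open-source model to combine three frontier capabilities: first, coding and agent capability, beating GPT-5.5 and Gemini 3.1 Pro on SWE-Bench Pro; second, a 1M-token context window (enabled by MiniMax Sparse Attention); and third, native multimodality, with support for images, video, and computer use.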
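As a rough sketch of how the per-million-token prices quoted in this post translate into a per-request cost, the snippet below multiplies token counts by the listed rates. The function name `estimate_cost_usd`, the reading of "cache" as the rate for cached input tokens, and the token counts in the usage line are illustrative assumptions, not SiliconFlow's billing logic. The post also does not say whether the listed prices already include the 50% discount, so none is applied here.

```python
# Prices in USD per one million tokens, as quoted in the post.
# Assumption: "cache" is the rate for input tokens served from a cache.
PRICES_PER_MILLION = {"cached_input": 0.06, "input": 0.30, "output": 1.20}


def estimate_cost_usd(input_tokens, output_tokens, cached_tokens=0,
                      prices=PRICES_PER_MILLION):
    """Estimate the cost of one request from its token counts.

    cached_tokens is the part of input_tokens billed at the cache rate;
    the rest of the input is billed at the regular input rate.
    """
    if min(input_tokens, output_tokens, cached_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")

    fresh_tokens = input_tokens - cached_tokens
    total = (
        cached_tokens * prices["cached_input"]
        + fresh_tokens * prices["input"]
        + output_tokens * prices["output"]
    )
    return total / 1_000_000


# Hypothetical request: 10,000 input tokens and 5,000 output tokens.
print(f"${estimate_cost_usd(10_000, 5_000):.4f}")
```

Because output tokens are priced at four times the input rate, generation-heavy tasks are dominated by the output term; actual bills depend on the provider's own rounding and discount rules.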

Multimodal · Open source/Repos · Model release · Coding

8 related discussions

20:47

MiniMax (official)@MiniMax_AI

73

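1. Video control + games + M3 2. Open weights + massive context + strong coding 3. Cancelling my weekend plans right now [Quoting @MinLiBuilds]: Time to say bye bye to the hand-me-down 20K context. MiniMax M3 is out, with three highlights: 1M context, native multimodality, and agentic capability. This time I ran a full evaluation using a CC workflow, @ZenMuxAI, and MiniMax M3: given one screenshot, build a "mortal cultivation sword-formation duel" gesture game. The requirements: support two-player duels, use the workflow to break down the task, and add a rock-paper-scissors mechanic. Two hours later, the game actually ran. I now know the winning recipe for this generation of LLMs: 1M context + multimodality + agent mode. The 1M context is the foundation for reasoning depth, while multiple agents handle task decomposition and execution.

实践哥MinLi: Time to say bye bye to the hand-me-down 20K context. MiniMax M3 is out, with three highlights: 1M context, native multimodality, and agentic capability. This time I ran a full evaluation using a CC workflow, @ZenMuxAI, and MiniM...

Agents · Multimodal · Open source/Repos · Model release

8 related discussions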

20:43

🚨 AI News | TestingCatalog@testingcatalog

55

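NVIDIA has announced that it will release Nemotron 3 Ultra later this week, a 550B-parameter open-weight model. According to Artificial Analysis, it is positioned as the most intelligent open-weight model from a US lab. Soon 👀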

NVIDIA AI: Nemotron 3 Ultra is coming this week. ⌛️

Open source/Repos · Reasoning · Model release

20:39

karminski-牙医@karminski3

79

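MiniMax has released a new model, MiniMax M3, which it claims is the first open-weights model to combine three frontier capabilities at once: frontier coding and agent capability, with specific scores reported on benchmarks such as SWE-Bench Pro; MiniMax Sparse Attention, which extends the context length to 1M; and native multimodality. The model weights and technical report are expected in about 10 days.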

MiniMax (official): Introducing MiniMax M3: The First Open-Weights Model to Combine Three Frontier Capabilities - Coding & Agentic Frontier:...

Agents · Model release · Coding

8 related discussions

18:47

MiniMax (official)@MiniMax_AI

64

Indeed 😎 #M3

Arif: MiniMax M3 scores 90.3% GPT 5.5 Scores 92.4% Just a 2.1% gap now at @convex. Incredible to see the open-source models cl...

Open-source ecosystem · Reasoning · Model release

8 related discussions