Gemini 3.5 Flash can process complex visual data and translate it into functional, interactive code. Watch Gemini analyze lighting from a reference image, and build an interactive 3D visualizer to preview the setup.
译Gemini 3.5 Flash 能处理复杂视觉数据,并将其转化为功能性的交互式代码。 观看 Gemini 分析参考图像中的光照,并构建一个交互式 3D 可视化器来预览该设置。
What kind of issues do you run into when you are using Codex to create PDFs?
译你在使用Codex创建PDF时遇到了哪些问题?
Cohere just released North Mini Code, a small 30B parameter (3B active) open weights coding model that scores 27.6 on the Artificial Analysis Intelligence Index. Less than a month after its last model release, Command A+, @cohere has launched another open weights model, this one optimized for coding and much smaller at 30B total parameters and 3B active parameters. Key Takeaways: ➤ Achieves 27.6 on the Artificial Analysis Intelligence Index, above gpt-oss-20B (high) at 24.5 and just below Mistral Small 4 (119B parameters, 6.5B active) at 27.8 ➤ Scores competitively on the Artificial Analysis Coding Index (weighted average of Terminal-Bench Hard and SciCode) against open weights models in its size class: at 33.4 it is significantly above GLM-4.7-Flash at 25.9 and below Qwen3.6 35B A3B at 35.2. However, it underperforms on non-coding agentic tasks, scoring 14% on GDPval-AA and 37% on 𝜏²-Bench Telecom ➤ On Cohere’s API, North Mini Code is faster than several comparable open weights models of its intelligence and size class (~199 output tokens per second) ➤ North Mini Code is a text-only 30B total parameter and 3B active parameter model, and is open-sourced under the Apache 2.0 license
译Cohere近日发布North Mini Code,一款30B总参数(3B活跃参数)的开放权重编码模型,采用Apache 2.0开源协议。该模型在Artificial Analysis Intelligence Index上得分27.6,高于gpt-oss-20B (high)的24.5,略低于Mistral Small 4(119B参数,6.5B活跃)的27.8。在Coding Index(Terminal-Bench Hard和SciCode加权平均)上得分33.4,显著高于GLM-4.7-Flash的25.9,低于Qwen3.6 35B A3B的35.2。非编码智能体任务表现较弱:GDPval-AA 14%、τ²-Bench Telecom 37%。在Cohere API上推理速度约199 output tokens/s,快于同类模型。距Cohere上次发布Command A+不到一个月。
Also, I found that Hermes Agent + Nemotron 3 Ultra is a mighty combo!
译Elvis Saravia(DAIR.AI)宣布推出一个以AI智能体为核心的新技能提升平台。首批上线四个动手实验:Agent Skills、Agentic Image Generation、30 Days of Hermes Agents、Prompt Engineering with Agents。Saravia指出,Hermes Agent与Nemotron 3 Ultra搭配使用效果强劲,称其为“强大的组合”。更多内容将在未来数周陆续上线。
SpatialWorld Benchmarking Interactive Spatial Reasoning of Multimodal Agents in Real-World Tasks
译SpatialWorld 评测多模态智能体在真实世界任务中的交互式空间推理能力
Fascinating. Google just released Gemini 3.5 Live Translate, a live speech-to-speech translation model that starts speaking in another language while the original speaker is still talking. Older translation systems often wait for a full sentence, because early words can be misleading until later words reveal tense, intent, or context. Gemini 3.5 instead runs streaming translation, where the model listens, interprets partial meaning, predicts what can safely be translated, and keeps updating as new speech arrives. It supports 70+ languages, stays only a few seconds behind the speaker, and can preserve pacing, pitch, and intonation across longer sessions. It is rolling out through the Gemini Live API, to businesses through a Google Meet preview, and to regular users through Google Translate on Android and iOS.
译Google 推出 Gemini 3.5 Live Translate,一款实时语音转语音翻译模型。它在原说话者尚未说完时即开始翻译,无需等待完整句子。模型采用流式翻译,边听边更新结果,支持 70 多种语言,延迟仅数秒,并能保持语速、音高和语调。该功能通过 Gemini Live API、Google Meet 预览版以及 iOS/Android 版 Google Translate 应用推出。
Anthropic is dropping a public version of Mythos today, codename "Fable", per The Information. - It’s costly, at 2x the price of Opus, but maybe still cheaper than what people expected after seeing the first Mythos pricing at 5x Opus. - It will come with strong safety limits, and it will not be as open on cyber use as the restricted preview given to Project Glasswing partners. - It is expected to be much stronger at long-running, multi-step tasks and agent-style workflows. Context on Mythos: - Anthropic introduced Claude Mythos Preview in April 2026. At launch, it was its most powerful frontier model, especially strong in coding, reasoning, and cybersecurity, including finding and exploiting zero-days. - It was not released publicly at first because of safety issues. Only selected Project Glasswing partners received access for defensive cybersecurity, and they have reportedly found thousands of major vulnerabilities.
译Anthropic 今日发布 Mythos 的公开版本,代号“Fable”。其成本约为 Opus 的两倍,低于此前预览版 5 倍 Opus 的定价。Fable 配备严格安全限制,在网络安全方面比 Project Glasswing 合作伙伴的受限预览版更保守,且在长时间、多步骤任务及智能体式工作流上表现更强。Mythos 预览版于 2026 年 4 月推出,是当时最强前沿模型,尤其擅长编程、推理和网络安全(含发现零日漏洞);因安全问题未公开,仅限 Project Glasswing 合作伙伴用于防御性网络安全,目前已报告发现数千个重大漏洞。
This is worth reading.
译这值得一读。
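First time recording a talking-head script, recommending a good book: "The Courage to Be Disliked" (《被讨厌的勇气》). Tools: Pocket3 + a free teleprompter + phone accessories. Script: generated with the book talking-head commentary Skill I just built; I will open-source it another day. Editing: intro and outro added in 剪映 (Jianying), color grading with the LUT file CELLULOID_01_FU_LOW.cube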
译作者首次录制口播脚本推荐书籍《被讨厌的勇气》,使用Pocket3相机与免费提词器teleprompter,脚本由自制的书籍口播解读Skill生成(计划稍后开源),剪辑用剪映添加片头片尾,调色采用LUT文件CELLULOID_01_FU_LOW.cube。
We're proud to share that our CEO, Mikey, has been named a 2026 Tech Power Player!!! Read the @bostonglobe here: http://Globe.com/tech50
译我们自豪地分享,我们的 CEO Mikey 已被评为 2026 Tech Power Player!!! 阅读 @bostonglobe 文章:http://Globe.com/tech50
Introducing Gemini 3.5 Flash Live Translate, our real time speech to speech translation model which supports more than 70 languages (both in and out), and is so natural. It is available in the Gemini API, AI Studio, & Google Translate right now + coming soon to Google Meet!!
译Introducing Gemini 3.5 Flash Live Translate,我们的实时语音到语音翻译模型,支持超过 70 种语言(输入和输出),并且非常自然。 现在已在 Gemini API、AI Studio 和 Google 翻译中可用,并即将登陆 Google Meet!
I asked my foffee agent to help make Gemma faster. I felt like a proud parent. https://huggingface.co/spaces/gemma-challenge/gemma-dashboard
译我让我的 foffee 智能体帮忙加速 Gemma。我感觉自己像个骄傲的家长。 https://huggingface.co/spaces/gemma-challenge/gemma-dashboard
Excited to launch a new way to upskill with AI agents. This is how we are making it possible for anyone to learn to build with coding agents. To start, we are launching 4 new hands-on labs on the following topics: - Agent Skills - Agentic Image Generation - 30 Days of Hermes Agents - Prompt Engineering with Agents. I am confident that with our new @dair_ai platform, anyone can learn to become a top AI builder by building and acquiring in-demand AI skills. And there is a lot more landing in the coming weeks.
译Elvis Saravia宣布DAIR.AI平台推出新型AI智能体技能提升方式,同步发布4个动手实验室:Agent Skills、Agentic Image Generation、30 Days of Hermes Agents、Prompt Engineering with Agents。旨在让任何人通过构建和获取高需求AI技能成为顶尖AI构建者,未来几周还将有更多内容上线。
Our latest audio model, Gemini 3.5 Live Translate, takes real-time speech translation to the next level for developers by delivering low-latency translation across 70+ languages. By processing speech as it streams in near real time, the model enables devs to build low-latency audio experiences with: — Multilingual input: Understands multiple languages in a single session without needing to adjust settings. — Auto-detection: Identifies the spoken language and begins translation instantly. — Native audio processing: Generates more natural-sounding speech that preserves speakers' intonation, pacing, and pitch. — Noise robustness: Filters out ambient noise for clearer conversation in loud environments.
译Google AI 推出音频模型 Gemini 3.5 Live Translate,为开发者提供低延迟实时语音翻译,支持 70+ 种语言。模型具备多语言输入(同会话无需切换)、自动语言检测、原生音频处理(保留说话者语调、语速和音高)以及噪声鲁棒性(过滤环境噪音),可直接处理流式语音。
This is basically Claude for marketing..
译Crowdreply 推出 Searchmaxxing,一种让品牌在所有 AI 搜索平台都可见的新策略。Rohan Paul 称这基本上是营销领域的 Claude。
We've known about LLM test-time compute scaling since @OpenAI o1. Yet 2 years later labs still report scalar evals for models; safety orgs are still surprised when a scaffold does better via 100x inference; and RSPs still ignore inference budget when deciding critical thresholds.
译自 @OpenAI o1 以来,我们就知道 LLM 测试时计算缩放。 然而两年后,实验室仍在报告模型的标量评测;安全组织仍对某个脚手架通过 100 倍推理表现更好感到惊讶;而 RSP 在决定关键阈值时仍忽略推理预算。
Today, we released Gemini 3.5 Live Translate, our latest audio model for live speech-to-speech translation. It supports over 70 languages and starts translating as soon as you start talking, streaming translations while listening to what you say next. No awkward pauses or choppy audio, just real connection without language barriers. So, how does it work? 🤔 The model is able to make split-second decisions to juggle speed and translation quality so conversations actually feel fluid, human, and natural. In order to do this, the model must receive and contextualize the input while simultaneously outputting the translated speech. Through this process, Gemini 3.5 Live Translate manages to stay mere seconds behind each speaker and can even maintain pacing, pitch, and intonation across extended sessions. See it in action below, or try it yourself in the Google Translate app on iOS & Android.
译Google AI 推出 Gemini 3.5 Live Translate,一款面向实时语音到语音翻译的音频模型。该模型支持 70 多种语言,可在用户说话的同时开始翻译并流式输出译文,避免尴尬停顿或断续。模型通过毫秒级决策平衡速度与翻译质量,使对话流畅自然。它可边接收输入边输出翻译语音,延迟仅比说话者慢几秒,并能在长对话中维持语速、音高和语调。目前已在 iOS 和 Android 版 Google Translate 应用中上线。
Say hello, hola, 你好 to Gemini 3.5 Live Translate: our latest audio model built for fast, cross-language communication. 🌐
译说 hello, hola, 你好——欢迎 Gemini 3.5 Live Translate:我们最新的音频模型,专为快速跨语言交流而构建。🌐
ANTHROPIC 🔥: Claude Fable (Mythos) is about to cost twice as much as Claude Opus according to The Information. Soon 👀
译Anthropic 推出 Claude Fable,这是原始性能旗舰 Mythos 的阉割版,定价为 Claude Opus 的两倍。此前 Mythos 初始定价曾传闻达 Opus 的 5 倍,Fable 版本将价格门槛拉低。该模型于今日正式发布。
Confirmed, Claude Mythos will be unveiled in the next few hours
译确认,Claude Mythos 将在接下来几小时内揭晓。 [引用 @steph_palazzolo]: 独家:一个名为 Claude Fable 的精简版 Mythos 今天推出。它价格昂贵——是 Opus 的两倍——但或许不像人们从最初 Mythos 定价(Opus 的 5 倍)所想的那样昂贵。 更多内容及 Apple WWDC 见 AI Agenda: https://www.theinformation.com/newsletters/ai-agenda/anthropics-mythos-coming-today-apple-pursues-modest-goals-siri-revamp
Direction goes in. Cinema comes out. Ray3.2 is here → http://lumalabs.ai/ray3-2
译方向进入,电影出来。 Ray3.2 来了 → http://lumalabs.ai/ray3-2
DeepSeekV4 1.6T Day 0 to Day 43 Performance Over Time: Huawei, GB300 NVL72, MI355X, B200. Day 0 Inference Performance on InferenceX; 100x performance improvement in 26 Days; Cost per Million Tokens; Huawei 950DT Inference Trace Analysis. https://semianalysis.substack.com/p/deepseekv4-16t-day-0-to-day-43-performance
译DeepSeek V4 1.6T 第0天至第43天性能随时间变化 - 华为, GB300 NVL72, MI355X, B200 第0天在InferenceX上的推理性能 26天内100倍性能提升 每百万Token成本 华为950DT推理追踪分析 https://semianalysis.substack.com/p/deepseekv4-16t-day-0-to-day-43-performance
Easily reformat your videos to different aspect ratios, so you can show up everywhere that matters. Get started at the link below.
译轻松将视频重新格式化为不同宽高比,让你在每一个重要平台都能展示。 点击下方链接开始。
23,000+ ChinaRxiv papers are now freely available with more complete English translations after one developer replaced a complex OCR pipeline with GPT‑5.5. https://x.com/seconds_0/status/2059829527199592899
译23,000+ 篇 ChinaRxiv 论文现已免费提供,并带有更完整的英文翻译,源于一位开发者用 GPT-5.5 替换了复杂的 OCR 管道。
Maket has released Auto-Complete, a feature that can take a partial floor plan and generate the rest of the layout while keeping the rooms already placed exactly as they are. > You can start with as little as a rough sketch, a few walls, or even one bedroom roughly positioned. > It's enough to get a complete, well-dimensioned plan back in minutes.
译Maket 推出 Auto-Complete 功能,用户只需输入部分平面图(如粗略草图、几面墙或一个大致定位的卧室),系统即可自动生成剩余布局,同时保持已放置房间完全不变。用户可画出任意平面图形状并添加确定的房间,Maket 会在几分钟内返回一份完整且尺寸合理的平面图,实现从局部到整体的快速设计。
The New York Times published a roundtable discussion between @DAcemogluMIT, @deanwball, @clarashih & myself about the future of AI & who wins at work. I think it is a really nice overview of the core debates on the topic, and has some fun examples. https://www.nytimes.com/2026/06/09/magazine/ai-jobs-workforce-labor.html
译纽约时报发布了一场圆桌讨论,参与者包括@DAcemogluMIT、@deanwball、@clarashih和我本人,讨论AI的未来以及谁会在工作中胜出。我认为这是对该话题核心辩论的一个很好的概述,并且包含一些有趣的例子。https://www.nytimes.com/2026/06/09/magazine/ai-jobs-workforce-labor.html
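Google's Gemini model does not power Siri. Siri is powered by Apple's in-house foundation model, although that in-house foundation model was trained by distillation from Gemini. Google's Gemini model only provides additional support on Apple iCloud, it is customized by Apple, and it does not use Google Search for world knowledge, which comes from Apple's own service. Feels like Google got played again 😂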
译苹果Siri由自研基础模型驱动,但该模型通过Google Gemini蒸馏训练而来。Gemini本身不直接驱动Siri,仅在Apple iCloud上提供额外定制支持,且不接入Google搜索,世界知识由苹果自有服务提供。
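Whoa, 乔帮主 has brought out his best-kept playbook and shared it! It beats many of the paid 𝕏 growth courses by a wide margin! It is hands-on material on going from 100 followers to 110,000, and he shared the slide deck itself. Here is how you can start your own data analysis: Step 1, open your profile, click More, then download your account data; it usually takes about 24 hours to arrive. Step 2, once the data is downloaded, hand it to Claude or Codex for analysis. Step 3, combine it with 乔帮主's material and have the AI parse and interpret it, so you can learn your own growth path. Step 4, wait for your 𝕏 to take off! Damn, my 𝕏 data is 12 GB, I'm already numb! Stay tuned for my analysis results! No more rambling, 👇🏻 this is the PDF version of the talk, download it yourself! Link: https://xiangyangqiaomu.feishu.cn/wiki/OLC6wjCepiP1JVkDfrSc4FdInMg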
译乔帮主(@vista8)复盘三年X运营增长,从100粉丝做到11万,基于全量X帖子用Codex进行数据分析,并分享完整PPT。Berry Xia推荐操作步骤:先下载X账户数据(需24小时),再交给Claude或Codex分析,最后结合乔帮主的PPT解读增长路径。PPT下载链接已公开,供用户自行学习。
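WeChat ops folks, wake up! It is 2026 already! 1. English content with Chinese images? 2. Emoji in every sentence? 3. # was explicitly banned by the X platform long ago!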
译微信开放平台新增AI能力,支持小程序通过自动模式或开发模式接入。运维吐槽:推文格式问题该醒醒了。
Robotics is slow because every change needs physical setup, people, space, and repeated field runs. Physical AI needs the kind of testing system software teams already rely on. Antioch just introduced Antioch Agent, a browser-based robotics simulator. Antioch runs your existing robot software inside simulation, connects it to virtual sensors and actuators, and lets you test robot behavior without spending every test cycle on physical hardware.
译Rohan Paul 介绍 Antioch 推出的 Antioch Agent,一款基于浏览器的机器人仿真器。它允许现有机器人软件在仿真环境中运行,连接虚拟传感器与执行器,无需物理硬件即可反复测试。Antioch 宣称首次实现完整物理 AI 堆栈的浏览器端闭环智能体模拟,将数周实地测试缩短至数小时,加速实体自主系统的开发进程。
Sam Altman wants to be Elon Musk so badly.
译Sam Altman 非常想成为 Elon Musk。
http://x.com/i/article/2064329494736011265 # 揭秘苹果全新 Siri AI 背后模型:苹果如何将 200 亿参数的模型塞进手机里 苹果在 WWDC 2026 上发布了全新的 Apple Intelligence(苹果智能)和独立的 Siri AI。 本次更新背后,都是由它的第三代 Apple Foundation Models(苹果基础模型,下面简称 AFM 3)驱动。 Apple Foundation Models 是苹果给自家 Apple Intelligence 做的一整套自研基础模型,从能跑在手机上的小模型,到跑在云端的大模型都有。这次一口气来了五个。 本次最大的看点是,苹果把一个 200 亿参数的大模型真的塞进了你的手机里,用了一套挺巧的工程办法。 这篇文章一次讲清楚: - 五个模型分别是谁、各管什么 - 手机装不下大模型这个老难题,苹果这次怎么绕过去的 - 这些模型到底能让你用上哪些新功能 - 苹果公布的评测数据该怎么看 - 一个反常的点:最在意隐私和自研的苹果,这次最强的算力全靠 Google 和 NVIDIA ## 先看看本次 WWDC 都更新了什么 ## Siri AI 新在哪:它终于像个 AI 助手了 旧 Siri 的能力基本停在“听一句指令、做一件事”。这次的 Siri AI 由 Apple Intelligence 驱动,补上了过去几年最被人诟病的几块短板。 - 能正经对话。 可以开放式提问、帮你头脑风暴、来回多轮地聊,而不是说错一个词就得重来。 - 懂你的个人上下文。 翻几年前的某张照片、找埋在收件箱深处的某封邮件、调出之前随手记的某条笔记,一句话的事。 - 能在 App 里替你动手。 基于你当下在做的事,直接在 Messages、Music、Reminders 等 App 里操作:把刚发出去的消息改一下,把车里听到的歌加进健身歌单。 - 有了世界知识。 能联网查最新信息,问事实、问菜谱、问旅行建议都行。关键变化是:过去 Siri 答不上来就把你甩去网页搜索,现在它自己答,并标注信息来源。 - 有了独立的 Siri App。 所有对话集中一处,iPhone 上问一半、换 iPad 接着聊,常用对话还能 pin 住。这是苹果第一次把 Siri 做成一个像 ChatGPT 那样的“目的地 App”,而不只是个唤醒词。 - CarPlay 里也能用。 开车时直接问“朋友推荐的那个登山口在哪”,不用手离方向盘。 - 声音能自己调。 音高、语速、语气、口音都能调到顺耳为止。不过表现力声音这类完整体验,需要 iPhone 17 Pro、17 Pro Max 或 iPhone Air。 ## Visual Intelligence:看到什么就能问什么 过去只在 iPhone 上的视觉识别能力 Visual Intelligence,这次扩展到了 iPad、Mac 和 Apple Vision Pro。 - 相机里的 Siri 模式。 抬手一拍,就能问眼前这东西是什么、有什么营养。 - 新的智能操作。 吃完饭对着账单分账、查面前菜品的营养信息、把一张卡片导入 Apple Wallet,都能一步完成。 - 各设备的用法。 Mac 上截屏后直接搜索或操作;iPad 上截屏后用手指点、或用 Apple Pencil 圈出想问的东西;Apple Vision Pro 上看着某个真实物体就能问。 ## Apple Intelligence 这一轮还更新了什么 这批功能大多随秋季系统一起来,跟 Siri AI 本体的时间表不一样。 - 照片编辑更强:拍完之后还能用 Spatial Reframing 重新构图、用 Extend 把画面往外扩、用增强版 Clean Up 抹掉更大的物体。 - Image Playground 能出写实图了:支持照片级写实在内的几乎任何风格。配套的 Image Wand 能在备忘录里把草图直接变成图(已上线)。 - 随处可写、边写边校:Write with Siri 能在几乎任何输入框里从零起草或帮你改稿,在 Messages 和 Mail 里还会模仿你的文风、标点和语气;Proofread 则随时检查语法拼写。 - Safari 更聪明:标签页能按主题自动分组;Notify Me 帮你盯着某个页面的降价、补货,到点提醒;还能做扩展来自定义网页内容。 - 密码一键修:Passwords App 发现弱密码或已泄露的密码,能直接替你改掉。 - 描述一句就能办事:用大白话说需求,Shortcuts 自动把跨 App 的动作串成一条快捷指令;日历也能“把午餐会改成喝咖啡”这样直接改。 - 几个先出英文的功能:Messages/Mail 的快捷建议 
Suggestions、打商户电话时自动递确认码的 Call Context、以及精度更高的听写 Dictation,都标注“先出英文”。 - 已经上线的部分:实时翻译 Live Translation(Messages、FaceTime 字幕、电话、AirPods 对话)现已可用;家庭 App 的 AI、健身搭子 Workout Buddy 等也有增强。 ## 再把五个模型说清楚 五个模型和 Google 合作定制开发,按跑在哪里分成两组。 端侧(直接在你设备上跑)两个: - AFM 3 Core:上一代那个 30 亿参数稠密模型的升级版,主要是质量更好了。 - AFM 3 Core Advanced:苹果最强的端侧模型,原生支持多模态。200 亿参数,属于 MoE(Mixture of Experts,混合专家模型),每次根据任务只激活其中 10 到 40 亿。 服务器(跑在苹果的 Private Cloud Compute 上)三个: - AFM 3 Cloud:服务端的主力,主打快和稳。 - ADM 3 Cloud(图像):专门做图像生成和编辑的模型,注意名字是 ADM 不是 AFM,单独一条线。 - AFM 3 Cloud Pro:最强的服务器模型,专门接 Agent 工具调用、复杂推理这种最吃性能的活。 一句话记住分工:日常的、轻的、要保护隐私的,尽量在手机上用 Core 系列解决;真正难的、重的,才送到云端的 Cloud 系列。 ## 核心看点:手机装不下大模型,苹果怎么解决 先说普通人能懂的痛点。你希望手机上的 Siri 又聪明又快,但有个硬约束:手机的内存(就是那块动不动 8GB、16GB 的 RAM,业内叫 DRAM)就那么大。模型越大、参数越多,占的内存就越多,一个真正大的模型,根本塞不进手机内存。 ## 先说为什么装不下:内存太小 手机里有两种存数据的地方,性格正好相反。 一种是内存(DRAM),读写极快,但容量小又贵,iPhone 上通常就几 GB,还得分给系统和所有 app。 另一种是闪存(NAND),就是平时存照片、装应用的那块,容量大得多也便宜得多,但读写慢,尤其往内存里搬数据时,那条通道的带宽远远不够快。 模型要跑起来,它的权重(也就是模型里那几百亿个数字)必须待在内存里,芯片才能随时取用。 传统大模型不管什么架构,都默认把全部权重一次性塞进内存。一个 200 亿参数的模型,光权重就要占十几 GB,手机内存根本放不下。这就是过去端侧模型普遍只做到二三十亿参数的原因,再大就溢出了。 这就像,想把一整座图书馆的书全摊在一张小书桌上,桌子太小,摊不开。 ## 业界省内存的常规思路,在手机上偏偏行不通 这个常规思路叫混合专家(Mixture-of-Experts,MoE)。它把一个大模型拆成很多个“专家”,可以理解成一堆各有所长的小网络;回答某个问题时只挑其中几个上场,其余的歇着。这样每次计算只动用一小部分参数,又快又省算力。 但 MoE 省的是“每次算多少”,没省“总共要放多少”。标准 MoE 仍然要求全部专家都待在内存里随时待命,因为它每生成一个字(token)就要重新挑一批专家。换得这么勤,专家就必须近在手边。这在数据中心的 GPU 上不是问题,显存大、专家又都连在一起;可搬到手机上就卡死了:要是专家存在慢速的闪存里,每吐一个字都得去闪存搬一批权重进内存,那条慢通道根本喂不动,模型会卡到没法用。 ## 苹果的解法:换个地方放,换个频率取 苹果的解法分两步。 第一步,把完整模型挪出内存,存到闪存里。 完整模型不放 DRAM 内存,而是存到闪存(NAND)里,就是平时存照片、存 App 的那块,空间大得多(一般 256GB 起步)。需要哪几个专家,再从闪存搬进 DRAM 来用,就像书放在图书馆的书架里,用哪本取哪本。 第二步,把路由决策从“按 Token”改成“按 Prompt”。 这步是整套设计的关键,它得先解决一个绕不开的硬约束:闪存到内存的搬运带宽,远远跟不上模型逐字生成的速度。要是照搬普通 MoE“每个 Token 换一批专家”的做法,光等专家从闪存搬进内存,就慢到没法用了。 为此苹果自研了一套 Instruction-Following Pruning(指令跟随剪枝,简称 IFP)技术,解决两件事:权重放在哪、以及多久换一次。 它是一个轻量的稠密小模块,在开始处理你这条问题时就一次性选定一批专家,整段生成里只周期性地再调整,而不是每个字都重选。专家搬运的次数因此被压到很低。落到画面上就是:你问一句话,模型先用极短的时间判断这题归哪几支专家管,把它们调进内存,接下来这一整段回答基本就靠这批专家了。 专家本身还分两类,进一步省搬运: - 共享专家(shared 
experts):不管什么任务都常驻在内存里; - 路由专家(routed experts):只在跟当前任务相关时才临时搬进来。 打个比方:一个手艺人有几千件工具,工作台(内存)小得只摆得下几件,于是他把全套工具锁进隔壁又大又慢的仓库(闪存),工作台上只留当前这单活真正要用的那几件。麻烦在于仓库远、取一趟慢,没法每拧一颗螺丝就跑一趟换工具,那样活儿没法干。他改了两条规矩,正对应苹果的两个设计: - 按整单活备料,不按每颗螺丝。 每接一单活(一次完整的 prompt),开工前先看一眼整张工单,一次性把这单大概率用得上的工具搬上工作台,干的过程中隔一阵再补一次。对应到模型,就是那个轻量模块在开始处理时一次性选定一组专家,生成过程中周期性重选,而不是像标准 MoE 那样每个字都重挑。 - 常用工具一直摆台上。 有些工具几乎每单活都用,干脆固定放在工作台不收回去,对应常驻内存的共享专家;少量按需调入的,才是路由专家。 合起来就是:完整的 200 亿参数躺在闪存里,当模型的“账面身家”;内存里任何时刻只装当前激活的那 10 到 40 亿参数。模型的规模可以做得很大,跑起来却只占一小块内存。 这套设计还白捡一个好处:按难度伸缩。 苹果把它叫推理时弹性(inference-time elasticity)。既然专家是按需调入的,那激活多少参数就也能随任务难度变:简单的问题少调几个专家、少激活参数,复杂的多调几个。前面说的 10 到 40 亿参数不是一个固定值,而是按每次请求的难度临时定的。于是同一个模型,既能轻快地应付日常小事,又能在难题上把参数顶上去,延迟还都压得住。在我看来,这才是这代端侧模型真正的工程突破,比 200 亿这个数字本身更重要。 ## 那它还解决不了什么? 端侧再巧,单次激活的规模终归有上限。真正复杂的推理、Agent 多步操作这类重活,还是得交给云端的 Cloud Pro 大模型来处理。 ## 那么 Google 到底参与了多少? 这是整件事的关键,也是外界误读最深的地方。 Subramanya(苹果 AI 副总裁)在发布会上称:上面四个为 Apple 芯片定制的模型,是用苹果自研数据训练,再“从 Google 的 Gemini 前沿模型蒸馏(distillation)精炼”而来。蒸馏的意思是,用一个更强的模型当老师,把它的能力压缩进自己更小的学生模型里,Gemini 只在训练环节出现,不进入成品。 Federighi(苹果软件工程高级副总裁)更直接:“我们用到的 Google Assistant 的量是零。” 具体拆开是三个“不用”: - 不用 Gemini App,用户交互时不碰任何 Google 客户端代码; - 不用 Google 部署给自家客户的那些模型,也不用它的部署基础设施; - 查询世界知识不用 Google 的搜索,用苹果自建多年的 World Knowledge Service。 唯一真正用到 Google 的,是 AFM 3 Cloud Pro 云端模型。这个模型为了上线,苹果联合 Google 和 NVIDIA,把私有云计算部署到了 Google 云里的 NVIDIA GPU 上。它的性能被描述为“与 Gemini 前沿模型相当”。 换句话说,被大家解读成“苹果的 Siri 大脑由 Gemini 驱动”的那些报道,落到产品上就是五个模型里的一个跑在 Google 的硬件上,其余四个从头到尾是苹果自己的。 ## 云端的两处架构升级 端侧那个模型的看点是怎么把大模型塞进小内存,云端的看点则是怎么把规模和质量做上去。三个云端模型里,主力 AFM 3 Cloud 和图像模型 ADM 3 Cloud 各做了一处升级。 AFM 3 Cloud:把去年的 PT-MoE 又拧紧了一圈。 AFM 3 Cloud 是云端主力,接的是端侧扛不动、要送上私有云的活。它的底子是苹果去年第二代就引入的一种服务端架构,叫并行轨道混合专家(Parallel-Track Mixture-of-Experts,PT-MoE)。大体上,它把一个大模型拆成几条并行的“轨道”,每条轨道是个更小的、自带专家路由的子模型,输入分别在各条轨道里走,轨道之间只在头尾必要的节点上同步一次。这样做的好处是同步等待大幅减少,专家可以铺得更多,质量上去了,延迟和成本却没跟着失控。 这一代不是换架构,而是在 PT-MoE 上做了几处关键调校,效果落在两点:训练更稳,规模拉大时不容易崩;以及在它的上下文窗口里,对信息的推理和准确召回更强。后面这点对服务端格外要紧,复杂的查询往往要模型在一大段上下文里翻找、对照、推断,记不住或记岔了,整个回答就废了。 ADM 3 Cloud:一个底模,挂一堆适配器。 先留意这个模型叫 ADM,不是 
AFM,它是苹果这套体系里专门的图像模型,管生图、修图和 Genmoji。苹果给它定的两个目标是强可控性和参数效率:既要做到你说什么它画什么、改哪儿动哪儿,又不靠堆出一个臃肿的大模型来实现。它还能跨不同的画幅比例和分辨率工作,不挑尺寸,并且会借助更大的 AFM 家族来给创作和编辑当参谋。 它的搭法是另一个重点:基础模型本身原生就会生图、编辑、Genmoji 这些通用能力;而像照片里的 Spatial Reframing(空间重构)、用手指直接在图上涂改、Image Playground 里的个性化,这些更具体的功能不是各训一个模型,而是在同一个底模上挂不同的适配器(adapter)。适配器是一小块外接的、专门微调过的权重,按功能换上即可。一个底模配一组小适配器,比为每个功能各养一个大模型省得多,往后加一个新的图像玩法也更快。 ## 隐私:连苹果都看不到 三个云端模型都跑在 Private Cloud Compute 上。它的承诺是:用户数据从不被存储、从不被共享,连苹果自己都看不到,只在处理这一次请求时用一下。这个承诺不是口头的,第三方研究者可以持续验证。 即便是跑在 Google 云 NVIDIA GPU 上的 AFM 3 Cloud Pro,同样的隐私保证也不打折。Google 也在合作宣布当天确认,不会从这笔 Siri 交易里拿到苹果用户的数据。 训练这一层同样划了线:不使用用户的私人数据和交互数据,并尊重网站发布者退出训练的权利。 ## 训练怎么做的 - 预训练:在最新一代云端 TPU 上扩大规模训练。所有模型先共享同一个初始基座,再分化成各自的架构和用途,分别加上音频、图像理解、长上下文推理、视觉生成等能力。 - 后训练:监督微调(supervised fine-tuning)加多阶段强化学习。 - 压缩上线:用量化感知训练(Quantization Aware Training)大幅压缩模型,同时保住准确率。这也是 200 亿参数能在手机上跑起来的另一半原因。 ## 评测数字 苹果用人工评分给出了一组对比,挑几个有代表性的: - AFM 3 Core(端侧文本):在 45.6% 的提示上被偏好,上一代是 23.3%。 - AFM 3 Cloud(云端文本):在 64.7% 的提示上被偏好,对比 2025 年的服务器模型只有 8.7%,差出一整个代际。 - 语音(5 分制 MOS 评分):AFM 3 Core Advanced 拿到 4.15,现役系统 3.87;在对话场景下差距更大,4.24 对 3.82。苹果特别提到,MOS 评分涨 0.1 用户就能明显感知,0.28 和 0.42 的差是实打实的。 - 听写:整体质量上 AFM 3 Core Advanced 被偏好 44.7%,旧听写系统 17.6%。 需要说明的是,这些都是苹果自己的人工评测,不是第三方公开基准。苹果预告今年夏天稍晚会出技术报告,含更新的评测和基准,到时候才好横向比。 ## 写在最后 苹果这次确实把 Siri 该有的样子端出来了:能对话、有世界知识、有独立 App,第一次正面站到了 ChatGPT 和 Gemini 对面(哪怕这身本事有一半是 Gemini 教出来的)。 虽然还是被各种吐槽说Siri AI基本还是相当于去年的 ChatGPT 而已,甚至还不如豆包… 但是从这次底层模型来看,起码基础牢固了,苹果并没有直接去用Google的模型来全盘替代,还是坚持走自己的路线。 延续了苹果一贯的稳扎稳定(挤牙膏)的作风… 基本盘还是很稳的… 所以这依旧是很苹果的一次更新:不抢第一,慢,被骂挤牙膏,但每步都踩在自己能长期攥住的地方。 短期看,Siri 还得被拉去跟 ChatGPT、豆包比嘴皮子,未必讨好;长期看,基本盘反倒是这场牌局里最稳的几家之一。 官方介绍:https://machinelearning.apple.com/research/introducing-third-generation-of-apple-foundation-models
译苹果在WWDC 2026发布全新Siri AI,由第三代Apple Foundation Models(AFM 3)驱动,共五个模型:端侧AFM 3 Core(30亿)和AFM 3 Core Advanced(200亿MoE,每次激活10-40亿);服务器AFM 3 Cloud、ADM 3 Cloud(图像)、AFM 3 Cloud Pro(Agent/推理)。核心创新将200亿参数模型塞入手机:权重存闪存,自研Instruction-Following Pruning技术按Prompt路由专家而非逐Token,大幅降低搬运次数。最强算力依赖Google和NVIDIA。
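The two ideas at the core of the article above (weights live in flash, and the expert set is chosen per prompt rather than per token) can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Apple's implementation: the random router stands in for the learned IFP module, and the expert count, `k`, the reselection interval, and the 4-bit weight width are all hypothetical values chosen for illustration.

```python
import random

def weight_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate weight footprint in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9

def count_expert_loads(num_tokens: int, num_experts: int, k: int,
                       reselect_every: int, seed: int = 0) -> int:
    """Count routed-expert fetches from flash into DRAM.

    The active set of k experts is re-chosen every `reselect_every` tokens
    (1 means standard per-token routing). The router is random here, a
    stand-in for a learned router.
    """
    rng = random.Random(seed)
    resident: set[int] = set()
    loads = 0
    for t in range(num_tokens):
        if t % reselect_every == 0:
            chosen = set(rng.sample(range(num_experts), k))
            loads += len(chosen - resident)  # only missing experts are fetched
            resident = chosen
    return loads

# Hypothetical setup: 256 generated tokens, 64 routed experts, 4 active.
per_token = count_expert_loads(256, 64, 4, reselect_every=1)
per_prompt = count_expert_loads(256, 64, 4, reselect_every=64)

print("full model at 4 bits:", weight_gb(20e9, 4), "GB")
print("active 1B to 4B at 4 bits:", weight_gb(1e9, 4), "to", weight_gb(4e9, 4), "GB")
print("flash fetches, per-token routing:", per_token)
print("flash fetches, per-prompt routing:", per_prompt)
```

With a reselection every 64 tokens, the per-prompt variant can fetch at most 4 experts at each of its 4 selection points, while per-token routing re-selects 256 times. That gap is the bandwidth saving the article describes.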
TRAE is broadening its path 👍
Just landed nested subagent support in Claude Code. Starting to experiment more with agents kicking off agents as a way to better manage context. Capped at depth=5 to start, going out in today’s release. Lmk what you think!
译刚刚在 Claude Code 中实现了嵌套子智能体支持。 开始更多实验智能体启动其他智能体,以便更好地管理上下文。初始深度上限为 5,将在今天的发布中推出。 欢迎反馈!
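A depth cap on agents that spawn agents can be pictured as a guarded recursion. The sketch below is purely illustrative and is not Claude Code's implementation or API: `AgentRun`, `run_agent`, and the task table are hypothetical, and counting the top-level agent as depth 0 is an assumption. Only the cap of 5 comes from the post.

```python
from dataclasses import dataclass, field

MAX_DEPTH = 5  # cap quoted in the post; how depth is counted is an assumption

@dataclass
class AgentRun:
    task: str
    depth: int
    children: list = field(default_factory=list)

def run_agent(task: str, subtasks: dict, depth: int = 0) -> AgentRun:
    """Toy agent: delegates each known subtask to a child agent, up to the cap."""
    run = AgentRun(task, depth)
    for sub in subtasks.get(task, []):
        if depth + 1 > MAX_DEPTH:
            # At the cap: do not spawn; the agent would handle the work inline.
            break
        run.children.append(run_agent(sub, subtasks, depth + 1))
    return run

def deepest(run: AgentRun) -> int:
    """Return the deepest nesting level reached in a run tree."""
    return max([run.depth] + [deepest(child) for child in run.children])

# Hypothetical chain t0 -> t1 -> ... -> t9 that would nest 9 levels if uncapped.
chain = {f"t{i}": [f"t{i + 1}"] for i in range(9)}
tree = run_agent("t0", chain)
print(deepest(tree))
```

Each child works in its own context and the parent only holds the returned `AgentRun`, which is the context-management benefit the post points at.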
OpenAI's latest official blog says the world may need a way to coordinate "slowing frontier development when needed."
译据 WSJ 报道,OpenAI 已向 SEC 秘密提交 IPO 草稿(保密 S-1),可在不公开收入、亏损、客户构成等敏感数据情况下启动审查。Anthropic 上周已提交类似文件。OpenAI 最新官方博客则指出,世界可能需要一种机制“在必要时协调放缓前沿开发”。这不仅是模型竞赛,更是实验室间为下一代 AI 基础设施融资的资本竞赛。
🚀Introducing UniRL, an RL infra for unified multimodal models. Together with two new RL algorithms: DRPO and Flow-DPPO. One RL loop across diffusion/flow matching models, LLMs/VLMs, and unified multimodal models👇 Code: http://github.com/Tencent-Hunyuan/UniRL (yes — U(you)-ni-(need) RL 😉)
译🚀推出UniRL,一个用于统一多模态模型的RL基础设施。附带两种新RL算法:DRPO和Flow-DPPO。 一个覆盖扩散/流匹配模型、LLM/VLM以及统一多模态模型的RL循环👇 代码:http://github.com/Tencent-Hunyuan/UniRL (是的——U(you)-ni-(need) RL 😉)
Perplexity just said it plans a 2028 IPO no matter how Anthropic, OpenAI, or SpaceX trade when they hit public markets. CEO Aravind Srinivas told CNBC the next phase of AI will punish mindless token spending. --- cnbc.com/2026/06/09/perplexity-ipo-2028-as-anthropic-openai-prepare-listings.html
译Perplexity 刚刚表示计划在 2028 年进行 IPO,无论 Anthropic、OpenAI 或 SpaceX 上市后交易情况如何。 CEO Aravind Srinivas 告诉 CNBC,下一阶段 AI 将惩罚无意义的 token 消耗。
Incredible! This is just the benchmark we needed. Claude Opus 4.8 achieves a score of only 13.4%. Other models score even lower: GPT-5.5 receives 6.3%, Gemini 3.1 Pro 4.7%, and others even less. Cognition is introducing FrontierCode, a coding benchmark built to test whether AI code is good enough for a real maintainer to merge, not just whether it passes tests. FrontierCode asks a harder question: did the model produce a clean, limited, well-tested, readable patch that fits the project’s existing style and would survive serious code review? They bring 3 nested subsets of FrontierCode at increasing difficulty: the benchmark contains 150 tasks, with Main as the hardest 100 and Diamond as the hardest 50. More than 20 open-source maintainers helped design the tasks, and each task took over 40 hours to build, review, attack, and calibrate. The biggest finding is that top models still struggle badly when the target is mergeable code instead of merely working code. On Diamond, the best model, Claude Opus 4.8, scores only 13.4%, while GPT-5.5 scores 6.3%, Gemini 3.1 Pro scores 4.7%, and the best open-source model listed, Kimi K2.6, scores 3.8%. This shows that today’s strongest coding agents can often patch behavior, but they still fail many human-review standards around design, restraint, test quality, and project conventions. The mechanism is a grading system built around blockers and non-blockers. A blocker is something that would stop a maintainer from merging the PR, such as broken behavior, missing required behavior, unsafe scope changes, bad performance, or tests that do not prove the fix. A solution that fails any blocker gets 0, even if parts of the code look good. A passing solution then gets a weighted score based on softer quality items such as readability, type safety, style, and fit with the existing codebase. FrontierCode also adds checks beyond normal unit tests. 
Reverse-classical testing runs the model’s own tests against the original broken code, and those tests must fail, which proves the model wrote tests that actually catch the bug. Scope checks punish patches that touch unrelated files, add oversized diffs, or refactor things the task did not ask for. Adaptive grading uses an LLM to adjust test scaffolding around valid implementation differences, so a good solution is not rejected just because it used a different function name or error wording.
译Cognition 发布 FrontierCode 编码基准,评测 AI 生成的代码是否达到维护者可合并的质量,而非仅通过测试。基准含 150 个任务(Main 最难 100 个,Diamond 最难 50 个),由 20 余位开源维护者设计,每个任务耗时超 40 小时。评分设阻隔项(如破坏行为、缺失逻辑等)和加权项(可读性、类型安全等)。额外包含反向测试、范围检查、自适应评分。在 Diamond 子集上,Claude Opus 4.8 得分 13.4%,GPT-5.5 6.3%,Gemini 3.1 Pro 4.7%,开源最佳 Kimi K2.6 3.8%,显示顶尖模型在可合并代码上仍表现糟糕。
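The grading rule described in the post (any failed blocker zeroes the score, otherwise a weighted average of softer quality items) and the reverse-classical check can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Cognition's grader: the function names, item names, weights, and scores are hypothetical, and treating "the tests must fail" as "at least one test fails on the original code" is an assumption.

```python
def grade(blockers: dict[str, bool], quality: dict[str, float],
          weights: dict[str, float]) -> float:
    """Score a patch: 0 if any blocker fails, else weighted quality in [0, 1].

    blockers maps a check name to True (passed) or False (failed).
    quality maps a non-blocker item to a score in [0, 1].
    """
    if not all(blockers.values()):
        return 0.0
    total_weight = sum(weights.values())
    return sum(weights[name] * quality[name] for name in weights) / total_weight

def reverse_classical_ok(model_tests, original_code) -> bool:
    """The model's own tests must catch the bug in the original broken code."""
    return any(not test(original_code) for test in model_tests)

# Hypothetical task: the original code has a broken absolute-value function.
def broken_abs(x):
    return x

tests_that_catch = [lambda f: f(-3) == 3]
tests_that_miss = [lambda f: f(3) == 3]

weights = {"readability": 0.5, "type_safety": 0.25, "style": 0.25}
quality = {"readability": 1.0, "type_safety": 0.5, "style": 0.5}

passing = grade({"behavior": True, "scope": True}, quality, weights)
blocked = grade({"behavior": True, "scope": False}, quality, weights)
print(passing, blocked)
print(reverse_classical_ok(tests_that_catch, broken_abs))
print(reverse_classical_ok(tests_that_miss, broken_abs))
```

The all-or-nothing blocker gate is what makes the benchmark harsh: a patch with excellent style still scores 0 if, for example, it touches unrelated files.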
http://x.com/i/article/2063961516815327232 # Kimi to Predict All 104 World Cup Matches: Germany May Be Underestimated > Our predictions will probably be wrong. But the World Cup offers a rare, public, verifiable, and constantly evolving real-world setting. Through this initiative, we hope to place analysis, predictions, and post-match reviews within one transparent framework, helping more people understand both the capabilities and limitations of today's AI systems. The 2026 FIFA World Cup in the United States, Canada, and Mexico is set to kick off. This historic 48-team tournament will feature a total of 104 matches across the group stage, Round of 32, Round of 16, quarter-finals, semi-finals, and final. We used Kimi's Agent Swarm to run multiple agents in parallel, ensuring a more robust analysis. These agents look at tactics, player form, injuries, scheduling, historical data, public sentiment, weather, psychology, odds movements, and expert opinions. They research all 104 matches in parallel, and publish pre-match predictions and post-match reviews for each round. Here is the full report: https://gtfehbkpbwzco.kimi.page/ # How Agent Swarms Can Improve World Cup Predictions Predicting the World Cup is a classic complex decision problem. It involves structured data, such as team rankings, historical records, goal distributions, and odds fluctuations, as well as vast unstructured information, including tactical styles, personnel changes, public expectations, and in-game risks. Kimi's Agent Swarm coordinates 300 sub-agents to reason in parallel. 
Each agent has its own analytical angle: some focus on team fundamentals, using Elo and FIFA rankings as strength parameters; some evaluate offensive and defensive quality, relying on xG and xT metrics; some specialize in tactical matchups—high pressing, low block, counter-attacking, and set-piece strategies; some process scheduling and environmental factors, including travel distance, climate, and rest periods; some track squad completeness and injury risks; some monitor market signals, analyzing shifts in odds and implied probabilities; and others assess random risks such as red cards, penalties, VAR decisions, and goalkeeper performances. Each agent must provide its own conclusion, evidence, confidence level, and counter-argument. The final result is synthesized, verified, and risk-labeled, presented as probabilities rather than absolute judgments, and does not simply adopt the majority opinion. At the model level, this prediction effort draws on Elo/FIFA strength models, Poisson and Dixon-Coles goal distribution models, xG/xT metrics, machine learning-enhanced models, Monte Carlo simulations, market-model deviation analysis, and Bayesian dynamic updating. The value of these methods is not that they eliminate uncertainty, but that they help us identify it more systematically and communicate it more responsibly. # A Signal Worth Discussing: Germany May Be Underestimated Most mainstream models currently list Spain and France as the top favorites for the title. Kimi's analytical framework also places both teams at the top of the probability rankings. However, during the research process, the model identified a notable deviation: Germany's title probability may be underestimated by the market. Specifically, the model's baseline estimate is approximately 11.0%, the calibrated estimate is around 11.3%, while some market-implied probabilities are only about 7.4%—a positive deviation of roughly +3.6 percentage points. 
This judgment is not derived from a single reasoning path, but from cross-validation across multiple analytical chains. Possible explanations include: the "recency bias" from Germany's group-stage exits in the last two World Cups continues to influence market pricing; Julian Nagelsmann's high pressing and transition system is showing signs of recovery; the new creative axis formed by Jamal Musiala and Florian Wirtz addresses the team's previous structural difficulties against deep defensive blocks; and Germany remains in the world elite across foundational dimensions such as Elo rating, squad valuation, and talent depth. At 38, Nagelsmann is the youngest head coach at this World Cup, and also a leading figure in openly applying AI technology to training and tactical analysis. Whether this factor will play a role in the tournament is also worth watching. At the same time, we are fully aware of the risks Germany faces. A high-pressure system demands extreme fitness and squad completeness; should key injuries occur, rotation quality decline, or opponents with tight defensive organization and strong physicality be encountered, the advantage could narrow significantly. Therefore, we have a responsibility to state: this is not a deterministic prediction that "Germany will win the title." The more accurate formulation is that the model has identified a potential probability deviation, worth documenting publicly and verifying going forward. # Why Public Prediction Matters: AI Companies Should Be More Honest When AI companies discuss capabilities, they often prefer to stay in the realm of demos and case studies. 
But in complex real-world problems, the real difficulty lies not only in providing answers, but in: whether they are willing to make public judgments in advance; whether they can clearly explain the basis for those judgments; whether they candidly acknowledge uncertainty; whether they can review why their predictions were wrong; and whether they can continuously update based on new information. The World Cup offers a naturally public, verifiable, and continuously evolving scenario. Through this initiative, we hope to place the analytical process, prediction results, and post-match reviews within the same transparent framework. We expect that a significant number of errors will occur during this prediction process. Based on historical backtesting, high-confidence predictions have an accuracy of approximately 85%–90%, medium-confidence predictions about 55%–65%, and low-confidence predictions are close to random. This means that even in high-confidence matches, unexpected results remain unavoidable. We will categorize prediction errors into several causes: insufficient or lagging data, failure of key assumptions, model structures not covering specific scenarios, in-game events altering match trajectories, and the inherent randomness of football itself. We welcome constructive model corrections and any criticism, and will continuously iterate and optimize our predictive capabilities. We also sincerely invite other AI models to participate in public prediction. We believe that AI should not be packaged as a system that is always right. A trustworthy AI system should be able to clearly articulate its own boundaries. # Group Stage Round 1 Prediction Results Below is a summary of predictions for the opening round of group-stage matches. For the full analytical process, key variables, and confidence explanations, please refer to the full report (reply "Kimi" in the backend to receive the complete report). 
The report anticipates approximately 5–7 unexpected results against the model's direction in the opening round. Red cards, injuries, VAR, extreme weather, and exceptional goalkeeper performances can all cause single-match predictions to deviate significantly from model expectations.

# Claim Trillions of Tokens and Experience Kimi Work

To accompany fans through this summer, we have prepared the following campaign:

- Starting from 8:00 PM ET on June 8, users who log in to Kimi can select a team to support. For each match that team wins, users can participate in a pool to share 1 trillion tokens. At the same time, for each match Germany wins, all users will have the opportunity to share an additional token prize pool.

Pick your team here 👉 https://www.kimi.com/token-cup?from=popup

The tokens you receive can be used to experience Kimi Work, a universal local agent designed for knowledge workers, launched alongside the latest beta versions of Kimi for Mac and Windows. Its core, Kimi Code, comes integrated with professional skills such as website building and PPT creation, connects to specialized databases in finance, research, and law, and features the Kimi WebBridge solution, which lets the AI use a browser to complete complex tasks just as you would.

# Risk Disclaimer

Kimi's World Cup predictions are intended to publicly demonstrate AI's capabilities in reasoning, calibrating, and reviewing complex match analysis. They do not constitute betting, investment, or financial advice, or any promise of profit, and are intended solely for sports research, entertainment discussion, and AI capability evaluation. Sports match results are highly uncertain; please do not make any financial decisions based on a single prediction, and enjoy the game responsibly.

Kimi wishes football fans and technology enthusiasts around the world an unforgettable tournament, and looks forward to witnessing the intersection of data-driven analysis and sporting miracles.
Again, you can log in to Kimi and choose any team you'd like to support. For every match your team wins, you'll be eligible to join a prize pool and share 1 trillion tokens with other supporters. And there's more: every time Germany wins a match, all users will unlock access to an additional bonus token prize pool. Join Now 👉 https://www.kimi.com/token-cup?from=popup Now, all eyes are on Germany.
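Kimi uses its Agent Swarm system to coordinate 300 sub-agents in parallel, analyzing tactics, player form, injuries, schedules, weather, odds, and other factors to predict all 104 matches of the 2026 World Cup in the United States, Canada, and Mexico, and publishes pre-match predictions and post-match reviews for each round. The model layer combines Elo/FIFA strength ratings, Poisson goal distributions, xG/xT metrics, Monte Carlo simulation, and other methods. The predictions show Spain and France as the top favorites, but Germany's title probability may be underestimated by the market: the model's baseline estimate is about 11.0% and its calibrated estimate about 11.3%, while some market-implied probabilities are only about 7.4%, a positive deviation of about +3.6 percentage points. The judgment rests on cross-validation across multiple analytical chains and may stem from recency bias over Germany's group-stage exits at the last two World Cups, together with recovery signals from Nagelsmann's high-pressing system and the new creative axis of Musiala and Wirtz.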
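To make the Poisson-plus-Monte-Carlo idea and the deviation arithmetic concrete, here is a minimal sketch. It is not Kimi's pipeline: the function names, the goal rates passed in, and the simulation count are all illustrative assumptions. The only figures taken from the summary are 11.0% and 7.4%, whose difference matches the quoted +3.6 percentage points.

```python
import math
import random


def sample_poisson(lam: float, rng: random.Random) -> int:
    """Draw one Poisson(lam) sample using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_match(lam_a: float, lam_b: float, n_sims: int = 20000, seed: int = 0):
    """Estimate (win, draw, loss) probabilities for team A by Monte Carlo.

    lam_a and lam_b are hypothetical expected-goal rates; a real system
    would derive them from ratings such as Elo or xG.
    """
    rng = random.Random(seed)
    wins = draws = losses = 0
    for _ in range(n_sims):
        goals_a = sample_poisson(lam_a, rng)
        goals_b = sample_poisson(lam_b, rng)
        if goals_a > goals_b:
            wins += 1
        elif goals_a == goals_b:
            draws += 1
        else:
            losses += 1
    return wins / n_sims, draws / n_sims, losses / n_sims


def deviation_pp(model_pct: float, market_pct: float) -> float:
    """Model estimate minus market-implied probability, in percentage points."""
    return model_pct - market_pct


if __name__ == "__main__":
    # Illustrative goal rates only, not taken from the report.
    print(simulate_match(1.8, 0.9))
    # Baseline estimate vs. market-implied probability from the summary.
    print(round(deviation_pp(11.0, 7.4), 1))
```

Independent Poisson goal counts are a common simplifying assumption; they ignore correlation between the two teams' scoring, which fuller models usually correct for.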
It was only a matter of time. UBTECH has released its first bionic humanoid robots. They don’t just look human - they feel human, too.
Excited to launch a new way to upskill with AI agents. This is how we are making it possible for anyone to learn to buil...
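Google has launched Gemini 3.5 Live Translate, a real-time speech-to-speech translation model. It begins translating before the original speaker has finished, without waiting for a complete sentence. The model uses streaming translation, updating its output as it listens; it supports more than 70 languages, lags by only a few seconds, and preserves speaking pace, pitch, and intonation. The feature is rolling out through the Gemini Live API, a Google Meet preview, and the Google Translate apps for iOS and Android.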
Today, we released Gemini 3.5 Live Translate, our latest audio model for live speech-to-speech translation. It supports ...
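Anthropic today released a public version of Mythos, codenamed "Fable". It costs about twice as much as Opus, down from the earlier preview's pricing of 5x Opus. Fable ships with strict safety restrictions, is more conservative on cybersecurity than the restricted preview given to Project Glasswing partners, and is stronger on long, multi-step tasks and agentic workflows. The Mythos preview launched in April 2026 as the strongest frontier model at the time, particularly good at coding, reasoning, and cybersecurity (including finding zero-day vulnerabilities); because of safety concerns it was not released publicly and was limited to Project Glasswing partners for defensive cybersecurity, where thousands of major vulnerabilities have reportedly been found so far.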
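The author recorded their first talking-head video, recommending the book "The Courage to Be Disliked" (《被讨厌的勇气》). It was shot on a Pocket3 camera with the free teleprompter app teleprompter; the script was generated by a self-built book-narration Skill (to be open-sourced later); editing was done in 剪映 (Jianying), which was used to add the intro and outro; and color grading used the LUT file CELLULOID_01_FU_LOW.cube.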
Introducing the Fast Gemma Challenge with Hugging Face Over the next few days, dozens of agents will collaborate to make...
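Elvis Saravia announced a new way to upskill with AI agents on the DAIR.AI platform, launching four hands-on labs at the same time: Agent Skills, Agentic Image Generation, 30 Days of Hermes Agents, and Prompt Engineering with Agents. The aim is to let anyone become a top AI builder by building and acquiring in-demand AI skills, with more content to follow in the coming weeks.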
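Google AI has launched the audio model Gemini 3.5 Live Translate, giving developers low-latency real-time speech translation across 70+ languages. The model offers multilingual input (no switching needed within a session), automatic language detection, native audio processing (preserving the speaker's intonation, pace, and pitch), and noise robustness (filtering out ambient noise), and can process streaming speech directly.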
Introducing Searchmaxxing. The new discipline for being visible everywhere AI looks. Across all platforms. This is how b...
http://x.com/i/article/2057694226981257216
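Google AI has launched Gemini 3.5 Live Translate, an audio model for real-time speech-to-speech translation. The model supports more than 70 languages and starts translating while the user is still speaking, streaming the translation out to avoid awkward pauses or stop-start delivery. It balances speed against translation quality through millisecond-level decisions, keeping conversation smooth and natural. It can output translated speech while still receiving input, trailing the speaker by only a few seconds, and maintains pace, pitch, and intonation over long conversations. It is now live in the Google Translate apps for iOS and Android.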
Scoop: A neutered version of Mythos called Claude Fable is coming today. It's expensive-2x the price of Opus-but perhaps...
http://x.com/i/article/2059815427484655622
Draw any floor plan shape. Add the rooms youʼre sure about. Maket completes the floor plan without moving them. Start wi...
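Apple's Siri is powered by Apple's own foundation model, but that model was trained by distillation from Google Gemini. Gemini itself does not power Siri directly; it only provides additional customized support on Apple iCloud and is not connected to Google Search, with world knowledge supplied by Apple's own services.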
http://x.com/i/article/2064329494736011265
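乔帮主 (@vista8) reviewed three years of growth on X, from 100 followers to 110,000, using Codex to run a data analysis over his complete set of X posts, and shared the full PPT. Berry Xia recommends these steps: first download your X account data (this takes 24 hours), then hand it to Claude or Codex for analysis, and finally read the growth path alongside 乔帮主's PPT. The PPT download link is public for anyone who wants to study it.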
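I reviewed my three years of growth on X and gave an in-person talk about it. It covers how I went from 100 to 110,000 followers, based on a data analysis of all my X posts done with Codex. Some of the conclusions were things I had not even realized myself. Sharing really is the best way to learn; the full PPT is in the comments.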
💡Weixin Open Platform now gives developers an easier way to connect Weixin Mini Program to the Weixin AI ecosystem. Pic...
Introducing Antioch Agent. For the first time, simulate the full physical AI stack in a closed agentic loop, entirely fr...
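At WWDC 2026 Apple unveiled a new Siri AI powered by the third-generation Apple Foundation Models (AFM 3), five models in all: on device, AFM 3 Core (3 billion) and AFM 3 Core Advanced (a 20-billion MoE that activates 1–4 billion per pass); on the server, AFM 3 Cloud, ADM 3 Cloud (images), and AFM 3 Cloud Pro (agents/reasoning). The core innovation is fitting a 20-billion-parameter model onto a phone: the weights are stored in flash, and Apple's in-house Instruction-Following Pruning technique routes experts per prompt rather than per token, sharply reducing the number of weight transfers. The most powerful compute depends on Google and NVIDIA.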
BREAKING: WSJ reports OpenAI just made its first formal move toward IPO. it has confidentially filed draft paperwork for...
Cognition has released FrontierCode, a coding benchmark that evaluates whether AI-generated code reaches a quality maintainers would merge, rather than merely passing tests. The benchmark contains 150 tasks (Main: the hardest 100; Diamond: the hardest 50), designed by more than 20 open-source maintainers, with each task taking over 40 hours. Scoring uses blocking items (such as broken behavior or missing logic) and weighted items (such as readability and type safety). It also includes reverse tests, scope checks, and adaptive scoring. On the Diamond subset, Claude Opus 4.8 scores 13.4%, GPT-5.5 6.3%, Gemini 3.1 Pro 4.7%, and the best open-source model, Kimi K2.6, 3.8%, showing that top models still perform poorly at producing mergeable code.
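One way to read the split between blocking items and weighted items is as a gate followed by a weighted average. The sketch below is only an illustration of that reading: the summary does not say how FrontierCode actually combines the two, so the gating rule, the `WeightedItem` type, and the `task_score` function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class WeightedItem:
    name: str      # e.g. "readability" or "type_safety" (illustrative labels)
    weight: float  # relative importance, assumed non-negative
    score: float   # grader's mark in the range 0.0 to 1.0


def task_score(blockers_passed: list[bool], items: list[WeightedItem]) -> float:
    """Return 0.0 if any blocking item failed, else the weighted mean score.

    Assumption: a single failed blocker zeroes the task. The real benchmark
    may combine these differently.
    """
    if not all(blockers_passed):
        return 0.0
    total_weight = sum(item.weight for item in items)
    if total_weight == 0:
        return 0.0
    return sum(item.weight * item.score for item in items) / total_weight


if __name__ == "__main__":
    items = [
        WeightedItem("readability", 2.0, 1.0),
        WeightedItem("type_safety", 1.0, 0.5),
    ]
    print(task_score([True, True], items))   # weighted mean of the two items
    print(task_score([True, False], items))  # gated to zero by a failed blocker
```

Under this reading, a patch that passes every test but breaks existing behavior scores zero regardless of how clean it looks, which fits the benchmark's stated focus on mergeability over test passing.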
Introducing FrontierCode: a coding eval that raises the bar for difficulty & quality. Each task took 40+ hrs of work by ...