AIHOT

🚨 AI News | TestingCatalog@testingcatalog · 6 days ago · 68

WWDC 🔥: New Apple Intelligence is built on top of Apple Foundation and Gemini models! Let's see what's inside 👀

Chubby♨️@kimmonismus · 6 days ago · 38

Apple Intelligence:
- Personal understanding in apps
- Browsing tools for the web
- On-screen awareness
- In-app usage

Chubby♨️@kimmonismus · 6 days ago · 52

Siri update:
- Image understanding
- More conversational
- Reworked voice tone and sound; it sounds like a real human

Yuchen Jin@Yuchenj_UW · 6 days ago · 57

On the whole: “You shouldn’t be prompting coding agents anymore. You should be designing loops that prompt your agents.” Loops are the temporary workaround: today’s LLMs have poor judgment. They struggle to know when to keep going, when to stop, or when to call a tool. Loops force agents to work longer. Loops are incredibly powerful for verifiable goals for now, as AutoResearch shows.
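
The "loop that prompts the agent" idea can be sketched roughly as below. This is a minimal illustration, not how AutoResearch or any particular coding agent works: `run_until_verified`, `make_toy_agent`, and `toy_verify` are hypothetical names, and the "agent" is a stand-in function so the sketch runs without a model API. The loop only makes sense when the goal is verifiable, which is why it takes an explicit `verify` callback.

```python
from typing import Callable, Optional


def run_until_verified(
    agent: Callable[[str], str],
    verify: Callable[[str], Optional[str]],
    goal: str,
    max_iterations: int = 5,
) -> tuple[Optional[str], int]:
    """Re-prompt `agent` until `verify` accepts its output or the budget runs out.

    `verify` returns None on success, or a short failure message that is
    fed back into the next prompt.
    """
    prompt = goal
    for attempt in range(1, max_iterations + 1):
        output = agent(prompt)
        failure = verify(output)
        if failure is None:
            return output, attempt
        # The loop, not the model, decides to keep going.
        prompt = f"{goal}\n\nPrevious attempt failed verification: {failure}\nTry again."
    return None, max_iterations


def make_toy_agent() -> Callable[[str], str]:
    """Hypothetical stand-in for a model: wrong twice, right on the third call."""
    calls = {"n": 0}

    def agent(prompt: str) -> str:
        calls["n"] += 1
        if calls["n"] >= 3:
            return "def add(a, b): return a + b"
        return "def add(a, b): return a - b"

    return agent


def toy_verify(code: str) -> Optional[str]:
    """Verifiable goal: the generated add() must return 5 for (2, 3)."""
    namespace: dict = {}
    exec(code, namespace)
    return None if namespace["add"](2, 3) == 5 else "add(2, 3) did not return 5"


result, attempts = run_until_verified(make_toy_agent(), toy_verify, "Write add(a, b).")
print(attempts)  # 3
```

The iteration cap matters as much as the verifier: without it, an agent that never passes verification would loop forever.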

Rohan Paul@rohanpaul_ai · 6 days ago · 63

This paper proposes a new test to see whether AI agents truly get better as they gain experience, and finds they mostly still confuse memory with learning. It shows that simple full-context learning beats the more specialized memory systems, with Claude Sonnet 4.6 using plain context getting the best overall score.
That distinction matters because the next wave of AI is not supposed to answer isolated prompts. It is supposed to live inside codebases, databases, markets, sensors, clinics, and workflows where yesterday's mistake should make tomorrow's action sharper.
The authors build CL-BENCH, a benchmark where an agent works through connected tasks in 6 domains: coding, databases, forecasting, radio signals, poker, and disease studies. Each task hides a pattern the agent can learn over time, like a database layout, a codebase structure, or an opponent's strategy, so better performance should come from experience rather than pretraining.
They test frontier LLM systems with simple full-context memory, scratchpad notes, retrieval memory, playbook-style memory, and coding-agent setups. The key finding is that current memory-heavy AI agents are not reliably better learners than just keeping the full conversation in context. That means long-running AI agents still need better ways to remember useful lessons, forget stale ones, and adapt when the environment changes.
----
Link: arxiv.org/abs/2606.05661
Title: "Continual Learning Bench: Evaluating Frontier AI Systems in Real-World Stateful Environments"

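Summary: A new paper builds the CL-BENCH benchmark to evaluate AI agents' continual learning across 6 domains: coding, databases, forecasting, radio signals, poker, and disease studies. Each task hides a pattern that can be learned over time, testing whether agents can go beyond pretrained knowledge. Frontier LLM systems were tested with full-context memory, scratchpad notes, retrieval memory, playbook-style memory, and coding-agent setups; current memory-heavy AI agents were not reliably better than simply keeping the full conversation in context. Claude Sonnet 4.6 with plain context achieved the best overall score. The paper notes that agents still need better ways to remember useful lessons, forget stale information, and adapt to environment changes.

Rohan Paul@rohanpaul_ai · 6 days ago · 49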

AheadForm Elf V1 humanoid robot is slowly revealing its body design: its bionic skin uses magnetic attachment, making it easy to take off. This exposes the servo motors that power its facial expressions through 30 artificial facial muscles.

Google Gemini@GeminiApp · 7 days ago · 54

We asked Gemini 3.5 Flash to bring back the classic early-2000s PC drawing experience, and it delivered in one shot. What are you building first with Canvas?

ClaudeDevs@ClaudeDevs · 7 days ago · 74

Claude Code's first demo got two Slack reactions. One year after GA, @bcherny and @_catwu look back: verification best practices, why we built auto mode, routines and loops, and what's next. https://www.youtube.com/watch?v=Hth_tLaC2j8

Chubby♨️@kimmonismus · 7 days ago · 33

Apple Intelligence last. Let the fun begin!

Summary: Tim Cook is still hosting the WWDC opening. Apple Intelligence comes last, and the fun is about to begin!

Yuchen Jin@Yuchenj_UW · 7 days ago · 57

“You should design loops that prompt your agents.” Loops are the temporary workaround: today’s LLMs have poor judgment. They struggle to know when to keep going, when to stop, or when to call a tool. For verifiable goals, loops are incredibly powerful, as AutoResearch shows.

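宝玉@dotey · 7 days ago · 61

WeChat still lacks vision. It keeps wanting everyone to come and farm its own little plot, and it still imagines that WeChat will remain the super entry point that everyone uses, so all it needs to do is let AI operate mini programs. In reality, WeChat's role as an entry point will keep shrinking. Young people in the future won't open WeChat anymore; they'll just ask their own Agent: summarize yesterday's group chats for me, or send my mom a message saying I won't be home for dinner tonight. And the Agent that takes on the super-entry-point role most likely won't be WeChat AI.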

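Summary: WeChat published its "Guide for Developers to Connect to the WeChat AI Ecosystem" (《开发者接入微信 AI 生态的指引》), steering mini program developers to integrate with WeChat AI so that the AI can control mini programs. 宝玉 comments that WeChat is trying to preserve its super-entry-point status by having AI operate mini programs, but that young people in the future won't open WeChat themselves and will instead instruct their own Agent directly (e.g. "summarize my group chats" or "send my mom a message"). The one taking on the super-entry-point role is likely not WeChat AI.

Perplexity@perplexity_ai · 7 days ago · 76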

We published new research with Harvard on the shift from chat interfaces to autonomous agents like Computer. Over 3 months, findings show workers using Computer finish tasks in 87% less time at 94% lower cost than Search alone, with higher satisfaction. https://research.perplexity.ai/articles/how-ai-agents-reshape-knowledge-work

Odyssey@odysseyml · 7 days ago · 20

Learning From Experience

jason@jxnlco · 7 days ago · 4

14 weddings happened this weekend and here I am working on my @aiDotEngineer talk

NotebookLM@NotebookLM · 7 days ago · 67

Forget about our users? Who? Us??? Please. These updates are rolling out globally on the web starting with Google AI Ultra and all Workspace business customers with AI Ultra Access and AI Expanded Access, however we *absolutely* plan to expand to others over time!

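Summary: NotebookLM gets a major update, adding agentic capabilities in chat, more advanced reasoning, and several new output formats, aimed at simplifying complex multi-step research. The update rolls out first to Google AI Ultra subscribers and Workspace business customers with AI Ultra Access and AI Expanded Access, with plans to expand to more users later.

🚨 AI News | TestingCatalog@testingcatalog · 7 days ago · 48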

GOOGLE 🔥: @NotebookLM now supports advanced agentic reasoning in chat and new output formats, including Excel sheets and images. Only Ultra subscribers 👀

🚨 AI News | TestingCatalog@testingcatalog · 7 days ago · 66

OPENAI 🔥: Users can now generate interactive charts from data and comparisons in @ChatGPTapp for web and mobile. Testing time 👀

SemiAnalysis@SemiAnalysis_ · 7 days ago · 63

China's Unitree Will Dominate Global Robotics
The Fastest Iteration Cycle In Next-Gen Robotics Should See Unprecedented Acceleration
https://newsletter.semianalysis.com/p/chinas-unitree-will-dominate-global

Chubby♨️@kimmonismus · 7 days ago · 63

What many misunderstand: Apple doesn't actually need the best model in the world. It's similar to Meta. Their model only needs to be good enough for 99% of everyday use cases. They don't even want to compete with Frontier Labs, but primarily reach the consumer market. And Apple actually has a good chance there. Because a well-adapted Gemini model, based on (3.1/3.5?) and well integrated into the OS, could achieve exactly the use case that many need: AI that simplifies their daily work.

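Summary: At WWDC 2026, Apple acknowledged it could not build frontier AI on its own and turned to a partnership with Google. The new Siri will be based on a custom 1.2T-parameter Gemini model (possibly version 3.1/3.5), costing about $1B per year (Gurman). Siri becomes a standalone app with iMessage-style chat, a Dynamic Island pop-up, an Extensions system, and mail/calendar/web queries; it runs on Private Cloud Compute, and Google will not train on query data. Apple's strategy resembles Meta's: the model only needs to cover 99% of everyday use cases. iOS 27 is positioned as a "Snow Leopard"-style cleanup update, drops support for iPhone 11 and SE2, and may let users choose the AI engine (Gemini or Claude).

AYi@AYi_AInotes · 7 days ago · 37

What did the arrival of Liang Wenfeng's DeepSeek successfully overturn?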

gabriel@gabriel1 · 7 days ago · 48

people thought agi is a blank textbox where we enter our intentions and get what we want
but ask your smartest friend to "clean up my inbox", you'll realize that for him to do it perfectly you'd need to write down 5 pages of instructions, and these instructions change daily

ChatGPT@ChatGPTapp · 7 days ago · 67

Turn data and comparisons into charts, directly in ChatGPT. Available now on mobile and web.

jason@jxnlco · 7 days ago · 17

What artifacts do you create in codex outside code?

NotebookLM@NotebookLM · 7 days ago · 72

Introducing a more powerful NotebookLM 🚀 Massive upgrades deliver agentic capabilities in chat, more advanced reasoning, and a suite of new output formats. Tackling complex, multi-step research problems has never been easier. Rolling out now to Google AI Ultra subscribers.

Runway@runwayml · 7 days ago · 79

One video, now made for every feed and format. Upload your existing video, choose your desired aspect ratio and watch our editing model, Aleph 2.0, fill in the rest of the scene as if you made it that way from the start. Try it on our desktop web app at the link below.

OpenRouter@OpenRouter · 7 days ago · 66

This month is, unsurprisingly, Cost Reduction Month. In our data from the last 3 yrs, we commonly see major cost crunches right after the latest breakthrough. We'll ship major features to help you cut inference costs at least once a week, starting with today. Running list 👇

🚨 AI News | TestingCatalog@testingcatalog · 7 days ago · 42

NotebookLM updates soon 👀 We are expecting Gemini 3.5 Flash and Gemini Omni upgrades, alongside a bunch of new features. Which ones do you want the most?

elvis@omarsar0 · 7 days ago · 65

Great tips. In practice, this is roughly what it looks like to run agents autonomously for hours or days: /goal or /loop to keep it going. Verification is crucial here.

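Summary: @bcherny shares 5 tips: 1) turn on auto permission mode to skip manual confirmations; 2) use dynamic workflows, letting Opus coordinate hundreds or thousands of agents; 3) use the /goal or /loop commands to keep execution going; 4) run Claude Code in the cloud so you can close your laptop; 5) make sure Opus can verify its own work end to end: a Chrome extension for web pages, an iOS/Android simulator MCP for mobile, and a full backend service spun up for backend checks. Elvis Saravia stresses that /goal, /loop, and verification are key.

Chubby♨️@kimmonismus · 7 days ago · 54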

WWDC 2026 - Apple rents Google's brain to fix Siri.
Apple's keynote today is a software reset built around one admission: it couldn't build frontier AI alone.
What to expect:
- Gemini-powered Siri: a rebuilt assistant on a custom 1.2T-parameter Google model, ~$1B/year (Gurman). Runs via Private Cloud Compute, no Google training on your queries.
- Siri as an app: standalone, iMessage-style chat with synced history, a "Search or Ask" Dynamic Island pop-up, and an Extensions system. Drafts emails, pulls from mail, calendar, contacts and the web.
- Six OS betas: iOS 27, iPadOS 27, macOS 27 ("Big Bear"), watchOS 27, tvOS 27, visionOS 27. iOS 27 is a "Snow Leopard" cleanup release. iPhone 11 and SE2 lose support.
- Liquid Glass 2.0: system-wide opacity slider, fixes for the shadow and transparency complaints.
- AI health coach: the watered-down "Health+", now fitness and wellness instead of an AI doctor (pretty cool!)
- Model choice (rumored): users may pick the engine behind Apple Intelligence, with Gemini and Claude floated.
- Hardware mostly later: M5 Macs, new iMac, foldable iPhone (~$2.5K, Sept), OLED touchscreen MacBook Pro, smart-home hub.
Sources: TechInsider, Bloomberg, Gamebezz

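Summary: At WWDC 2026, Apple acknowledges it cannot build frontier AI alone and partners with Google, rebuilding Siri on a 1.2T-parameter Gemini model at about $1B per year. The new Siri runs via Private Cloud Compute, and Google does not train on user data. Siri becomes a standalone app with chat, synced history, a Dynamic Island pop-up, and an Extensions system; it can draft emails and pull in information. Six OS betas are released; iOS 27 is a Snow Leopard-style cleanup release, and iPhone 11/SE2 lose support. Liquid Glass 2.0 adds an opacity slider; Health+ is refocused on fitness; users may be able to pick the AI engine (Gemini or Claude). Hardware comes later.

Deedy@deedydas · 7 days ago · 64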

Meta AI has shockingly grown 2.5x in the last 2 months and is poised to be the #3 AI consumer app in the world behind Gemini and ChatGPT. Sadly, this growth is very likely inorganic given it has the worst retention by a mile: only 4.5% of users are still there after 30 days.

OpenRouter@OpenRouter · 7 days ago · 72

New server tool: Advisor
Let smaller models consult a higher-intelligence "advisor" model. Helps them escape doom loops, and helps you migrate to cheaper models! 🧵
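
As a rough illustration of the advisor pattern described here, the sketch below lets a small model ask a stronger one for a hint once it starts repeating itself. It is not OpenRouter's API: `solve_with_advisor`, `toy_small`, and `toy_advisor` are hypothetical names, both models are plain stand-in functions, and "same answer twice in a row" is only an assumed, crude doom-loop signal.

```python
from typing import Callable, Optional


def solve_with_advisor(
    small_model: Callable[[str], str],
    advisor: Callable[[str], str],
    task: str,
    is_done: Callable[[str], bool],
    max_steps: int = 6,
) -> tuple[str, int]:
    """Run `small_model`; when it repeats an answer, ask `advisor` for a hint.

    Returns the last answer and the number of advisor calls made.
    """
    prompt = task
    last: Optional[str] = None
    advisor_calls = 0
    answer = ""
    for _ in range(max_steps):
        answer = small_model(prompt)
        if is_done(answer):
            break
        if answer == last:
            # Repeated output: escalate to the stronger (more expensive) model.
            hint = advisor(f"Task: {task}\nStuck on: {answer}")
            advisor_calls += 1
            prompt = f"{task}\nAdvisor hint: {hint}"
        last = answer
    return answer, advisor_calls


def toy_small(prompt: str) -> str:
    """Hypothetical small model: only succeeds once a hint is in the prompt."""
    return "42" if "hint" in prompt else "I don't know"


def toy_advisor(prompt: str) -> str:
    """Hypothetical advisor model."""
    return "the answer is a number"


answer, calls = solve_with_advisor(
    toy_small, toy_advisor, "What is 6*7?", lambda a: a == "42"
)
print(answer, calls)  # 42 1
```

The cost argument comes from the call counts: the cheap model handles every step, and the advisor is billed only for the one step where the cheap model was stuck.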

jason@jxnlco · 7 days ago · 57

It can also do handoffs

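Summary: Codex can now autonomously start new chats for you and show them in the sidebar, which is handy when you spot an issue while working on another task. It can also do task handoffs.

歸藏(guizang.ai)@op7418 · 7 days ago · 5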

Lately my Skills recommendation has been showing up across various media outlets and accounts. Thanks to 量子位 (QbitAI).

[Quoting @op7418]: http://x.com/i/article/2053655813877870592

Chubby♨️@kimmonismus · 7 days ago · 78

New from Hivemind: continual learning for AI coding agents, available to everyone starting today.
It takes the traces from every agent your team runs (Claude Code, Codex, Cursor, Hermes, Pi) and turns them into reusable skills, then pushes those skills across all of them, all on your own cloud!
With the new SkillOpt built in, those skills get trained as they accumulate:
- +19.1 points of accuracy in Claude Code
- +24.8 in Codex
- best or tied on all 52 setups tested
Agents that learn on the job and share what they learn. Really exciting.

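Summary: Hivemind releases continual learning for AI coding agents, available starting today. The tool collects traces from every agent a team runs (Claude Code, Codex, Cursor, Hermes, Pi), turns them into reusable skills, and pushes them to all agents, with data stored in the user's own cloud storage. The built-in SkillOpt keeps training the skills: +19.1 points of accuracy in Claude Code, +24.8 in Codex, best or tied on all 52 setups tested. Open source, with a one-line install.

Rohan Paul@rohanpaul_ai · 7 days ago · 63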

Coinbase CEO Brian Armstrong thinks AI demand is almost limitless, but he expects 80% of workloads to shift to models that are 99% cheaper within 12-18 months.

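Summary: Coinbase CEO Brian Armstrong predicts that demand for intelligence is nearly unlimited, but that 80% of workloads will move to models that are 99% cheaper within 12-18 months, with only 20% continuing to run on the latest models chasing the highest IQ (e.g. scientific breakthroughs, advanced orchestrating AI agents). He draws an analogy to the share of high-end MacBook/gaming PC configurations, but notes that model prices are falling far faster than Moore's law. Armstrong thinks the future bottleneck is energy and compute, not better models. Coinbase is routing user prompts to cheaper models; in some cases token usage has grown exponentially while cost stayed roughly flat.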
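
To make the 80% / 99% figures concrete, here is a back-of-the-envelope calculation. It is only a sketch under simplifying assumptions that are not in the post: every workload costs the same on the frontier model, and total volume is held fixed. `blended_cost` is a hypothetical helper name.

```python
def blended_cost(cheap_share: float, cheap_discount: float) -> float:
    """Relative spend after moving `cheap_share` of workloads to a model
    that is `cheap_discount` cheaper (both given as fractions of 1.0).

    Assumes equal cost per workload and fixed volume; baseline spend is 1.0.
    """
    frontier_share = 1.0 - cheap_share
    return frontier_share * 1.0 + cheap_share * (1.0 - cheap_discount)


relative = blended_cost(cheap_share=0.80, cheap_discount=0.99)
print(f"{relative:.3f}")      # 0.208
print(f"{1 / relative:.1f}")  # 4.8
```

Under these assumptions spend falls to about 21% of the baseline, so volume could grow roughly 4.8x before the bill returns to where it started, which is one way to read "token usage grows while cost stays roughly flat".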

SenseTime@SenseTime_AI · 7 days ago · 56

🙌 Shoutout to @FahdMirza for demoing 𝗦𝗲𝗻𝘀𝗲𝗡𝗼𝘃𝗮 𝗨𝟭’𝘀 𝘁𝗲𝘅𝘁-𝗶𝗺𝗮𝗴𝗲 𝗶𝗻𝘁𝗲𝗿𝗹𝗲𝗮𝘃𝗲𝗱 𝗴𝗲𝗻𝗲𝗿𝗮𝘁𝗶𝗼𝗻 — showing the step-by-step process of formulating a custom perfume 🧴✨ It doesn't just see images. It thinks in them — and outputs in vivid visuals 🎥 https://youtu.be/-uedweS3_w0 Explore prompt examples in SenseTime Studio's Gallery and build your own 👇 🎛️ SenseNova Studio: https://unify.light-ai.top/ (Try infographics; also join Discord for text-image interleaved gen) 🤗 https://huggingface.co/collections/sensenova/sensenova-u1 🛠️ https://github.com/OpenSenseNova/SenseNova-U1 👾 Discord: https://discord.com/invite/BuTXPHmQub

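Summary: SenseTime demonstrates SenseNova U1's text-image interleaved generation through a step-by-step custom perfume demo, showing that the model can not only recognize images but also think in images and output vivid visuals. Related examples, the Gallery, Hugging Face models, GitHub code, and Discord community links are available.

gabriel@gabriel1 · 7 days ago · 40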

every job will turn into explaining your intentions to ai
explaining what you want to ai is surprisingly time consuming, coders already spend 80% of their time doing it, and this will be true for everyone

🚨 AI News | TestingCatalog@testingcatalog · 7 days ago · 69

KIMI 🔥: A new "Kimi for Work" AI Agent has been released with support for Native Agent Swarm, Browser Use, and more!
> The app is available on both macOS and Windows.
> Users can spawn up to 300 agents locally.
> Browser Use is working as part of the earlier-released WebBridge.
> Kimi for Work is powered by its own Memory System.

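Summary: The Kimi for Work AI Agent has been released, with support for Native Agent Swarm (multi-agent swarms), Browser Use (via WebBridge), and its own memory system. The app runs on macOS and Windows, and users can launch up to 300 agents locally. The company says this is only the beginning and that more data sources, tools, and Agent capabilities will be added.

OpenBMB@OpenBMB · 7 days ago · 75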

🚀 VoxCPM2 Technical Report is now available on arXiv!
VoxCPM2 is the latest speech generation model in the VoxCPM family. Built with 2B parameters and trained on over 2 million hours of multilingual speech data, it supports 30 languages and 9 Chinese dialects, along with natural-language voice design, controllable voice cloning, and high-fidelity continuation-based voice cloning.
In this technical report, we provide a comprehensive overview of:
🔹 The VoxCPM2 architecture
🔹 A unified sequence formulation for speech generation and control
🔹 The design of AudioVAE for high-fidelity speech reconstruction
🔹 Large-scale multilingual training and evaluation
🔹 Benchmark results across zero-shot and instruction-following TTS tasks
With 16kHz semantic encoding and 48kHz waveform reconstruction, VoxCPM2 delivers high-quality speech generation and achieves SOTA or highly competitive performance on public TTS benchmarks.
To support open research and development, we have open-sourced the model weights, fine-tuning code, and inference toolkit under the Apache 2.0 license.
📄 Paper: https://arxiv.org/abs/2606.06928
💻 GitHub: https://github.com/OpenBMB/VoxCPM
We hope VoxCPM2 helps advance the open-source multilingual speech ecosystem. Feedback, experiments, and contributions are always welcome! 🔥
#AI #OpenSource #TTS #SpeechAI #VoiceAI #GenerativeAI #MachineLearning

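Summary: OpenBMB (面壁智能) releases the VoxCPM2 technical report. VoxCPM2 is its latest speech generation model, with 2B parameters, trained on over 2 million hours of multilingual speech data, supporting 30 languages and 9 Chinese dialects. It offers natural-language voice design plus controllable and high-fidelity continuation-based voice cloning. The report covers the architecture, the unified sequence formulation, AudioVAE high-fidelity speech reconstruction, large-scale training and evaluation, and zero-shot and instruction-following TTS benchmark results. With 16kHz semantic encoding and 48kHz waveform reconstruction, it reaches SOTA or highly competitive results on public TTS benchmarks. Model weights, fine-tuning code, and the inference toolkit are open-sourced under Apache 2.0.

Xiaomi MiMo@XiaomiMiMo · 7 days ago · 82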

🚀 1,000+ TOKENS/S ON A 1T MODEL! 🚀
We are thrilled to release Xiaomi MiMo-V2.5-Pro-UltraSpeed in collaboration with @TileRT_AI, breaking the 1,000 tokens/s output speed on a 1 Trillion parameter model for the FIRST TIME!
Not wafer-scale integration like Cerebras. Not pure on-chip SRAM chips like Groq. We achieve 1,000 tps on a 1T MoE model using just a SINGLE, STANDARD 8-GPGPU NODE.
Read the full technical deep dive: https://mimo.xiaomi.com/blog/mimo-tilert-1000tps
Want to experience the future of real-time AI?
👉 Apply for UltraSpeed now: https://platform.xiaomimimo.com/ultraspeed
⏳ Limited-Time Access: Application-based · Jun 8 – Jun 23 (PDT)
💬 Chat Experience: Completely FREE for a limited time; try the blazing-fast web chat now.
⚡ UltraSpeed API: Just 3x the price for a ~10x boost in output experience.
🤝 Enterprise & Large-Scale Needs: business-mimo@xiaomi.com

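Summary: Xiaomi MiMo, together with TileRT_AI, releases MiMo-V2.5-Pro-UltraSpeed, for the first time exceeding 1,000 tokens/s output speed on a 1-trillion-parameter MoE model, using only a single standard 8-GPGPU node (not a Cerebras or Groq approach). A free chat experience is available for a limited time; the UltraSpeed API costs 3x the price for a roughly 10x boost in output experience. Applications are open Jun 8 to Jun 23 (PDT), and enterprises can email business-mimo@xiaomi.com.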