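Apple has released a new family of foundation models. The highlight is AFM 3 Core Advanced: 20 billion parameters, running entirely on-device on the iPhone 17 Pro. By keeping the full model in flash and loading only 1-4B expert parameters into active memory at a time, it neatly sidesteps the DRAM bottleneck and enables more expressive voices and more accurate dictation on the device. The family has five models, built in collaboration with Google, spanning on-device models up to cloud models on Private Cloud Compute, with the highest-performing cloud model running on NVIDIA GPUs.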
Apple's new foundation models are genuinely exciting. The standout is AFM 3 Core Advanced, a 20-billion (!) parameter model that runs entirely on-device.
Read that again. 20 billion parameters, on-device, on an iPhone 17 Pro.
It pulls this off by keeping the full model in flash memory and loading only a small slice of "experts" into active memory for each prompt, just 1 to 4 billion parameters at a time. That's a clever way to get around the usual DRAM wall, and it's what unlocks things like expressive voices and much sharper dictation right on the device.
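To make the flash-plus-experts idea concrete, here is a toy sketch in Python. It is not Apple's implementation, and nothing here comes from an Apple API: the file layout, the `FlashExpertStore` class, the `select_experts` router, and the least-recently-used eviction policy are all assumptions made up for illustration. An ordinary file stands in for flash, and a small in-memory cache stands in for the slice of experts held in DRAM.

```python
import os
import tempfile
from array import array
from collections import OrderedDict


def write_toy_model(path, num_experts, params_per_expert):
    """Write a fake 'full model' to disk: expert e is filled with the value e."""
    with open(path, "wb") as f:
        for e in range(num_experts):
            array("f", [float(e)] * params_per_expert).tofile(f)


def select_experts(scores, k):
    """Hypothetical router: return the indices of the k highest scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


class FlashExpertStore:
    """Keeps all experts in a file and only a few of them resident in memory."""

    def __init__(self, path, num_experts, params_per_expert, max_resident):
        self.path = path
        self.num_experts = num_experts
        self.params_per_expert = params_per_expert
        self.max_resident = max_resident
        self.resident = OrderedDict()  # expert_id -> weights, oldest first
        self.loads = 0                 # number of reads from the file

    def get(self, expert_id):
        # Cache hit: mark the expert as most recently used and return it.
        if expert_id in self.resident:
            self.resident.move_to_end(expert_id)
            return self.resident[expert_id]
        # Cache miss: seek to this expert's slice and read only that slice.
        weights = array("f")
        with open(self.path, "rb") as f:
            f.seek(expert_id * self.params_per_expert * weights.itemsize)
            weights.fromfile(f, self.params_per_expert)
        self.loads += 1
        self.resident[expert_id] = weights
        # Assumed policy: evict the least recently used expert when over budget.
        if len(self.resident) > self.max_resident:
            self.resident.popitem(last=False)
        return weights


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "toy_model.bin")
        write_toy_model(path, num_experts=8, params_per_expert=16)
        store = FlashExpertStore(path, 8, 16, max_resident=2)
        # Made-up router scores for one prompt; the top 2 experts get loaded.
        for expert_id in select_experts([0.1, 0.9, 0.2, 0.8, 0, 0, 0, 0], k=2):
            store.get(expert_id)
        print("resident experts:", list(store.resident), "loads:", store.loads)
```

The point of the sketch is the ratio: the file holds every expert, while memory holds at most `max_resident` of them at once, which mirrors the 20-billion-on-flash versus 1-to-4-billion-in-memory split described above. How the real model routes prompts, lays out weights, or decides what to evict is not something this post covers.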
The whole family of five models was built in collaboration with Google. It spans these on-device models all the way up to server-based ones running on Private Cloud Compute, with the most demanding cloud model running on NVIDIA GPUs.
Kudos, Apple!