Gemma 4 QAT models released: on-device memory requirements as low as 1GB

Chubby♨️@kimmonismus

2026-06-06 02:03 · 9 days ago

AI summary

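Google DeepMind has released Gemma 4 QAT (Quantization-Aware Training) models optimized for local, on-device use. Quantization-aware training reduces the memory footprint while preserving more quality than standard post-training quantization. The release supports the Q4_0 format and a new mobile-specialized quantization format. Gemma 4 E2B can run in about 1GB of memory, and the text-only version in under 1GB, which makes local AI on phones, laptops, edge devices, and consumer GPUs more practical.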

Google DeepMind released new Gemma 4 QAT models that make the model family much more efficient for local, on-device use.

Using Quantization-Aware Training, the models are trained with compression in mind, which reduces memory needs while preserving more quality than standard post-training quantization. The release includes support for the popular Q4_0 format and a new mobile-specialized quantization format.
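To make "trained with compression in mind" concrete, here is a minimal sketch of the quantize-then-dequantize ("fake quantization") step that quantization-aware training typically applies to weights in the forward pass. The post does not describe how the Gemma 4 QAT models were actually trained, so this is a generic illustration only. The block size of 32 mirrors the Q4_0 layout used by ggml/llama.cpp, but the symmetric rounding scheme and the function names are simplifying assumptions, not the exact Q4_0 encoding.

```python
BLOCK_SIZE = 32  # Q4_0 groups weights into blocks of 32 that share one scale


def quantize_block(weights):
    """Map one block of floats to a shared scale plus 4-bit integer codes.

    Simplified symmetric scheme (an assumption for illustration):
    codes are clamped to the signed 4-bit range [-8, 7].
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    codes = [max(-8, min(7, round(w / scale))) for w in weights]
    return scale, codes


def dequantize_block(scale, codes):
    """Recover approximate float weights from the scale and integer codes."""
    return [scale * c for c in codes]


def fake_quantize(weights):
    """Quantize then dequantize, block by block.

    During QAT the forward pass would see these rounded weights, so the
    model can learn to tolerate the rounding error before deployment.
    """
    out = []
    for start in range(0, len(weights), BLOCK_SIZE):
        block = weights[start:start + BLOCK_SIZE]
        scale, codes = quantize_block(block)
        out.extend(dequantize_block(scale, codes))
    return out


if __name__ == "__main__":
    weights = [(i % 17 - 8) / 10 for i in range(64)]
    rounded = fake_quantize(weights)
    worst = max(abs(a - b) for a, b in zip(weights, rounded))
    print(f"largest rounding error: {worst:.4f}")
```

Post-training quantization applies the same rounding only after training has finished, so the model never gets a chance to adapt to it. In a real QAT setup the rounding step is usually paired with a straight-through gradient estimator, which this sketch leaves out.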

Gemma 4 E2B can now run with around 1GB of memory (!), and the text-only version can even require less than 1GB (!). That makes local AI on phones, laptops, edge devices, and consumer GPUs far more practical.
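A rough back-of-the-envelope sketch shows why 4-bit formats land in this range. In ggml/llama.cpp, a Q4_0 block stores 32 four-bit weights plus one 16-bit scale, which works out to 4.5 bits per weight. The parameter count of 2 billion below is an assumption read off the "E2B" name; the post gives neither the exact parameter count nor which tensors are quantized, and the estimate ignores activations, the KV cache, and runtime overhead.

```python
def q4_0_bits_per_weight(block_size=32, scale_bits=16):
    """Bits per weight for a block of 4-bit codes sharing one scale."""
    return (block_size * 4 + scale_bits) / block_size


def estimate_weight_memory_gb(num_params, bits_per_weight):
    """Weight storage only, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    assumed_params = 2e9  # hypothetical: "E2B" read as 2 billion parameters
    for label, bits in [("fp16", 16), ("Q4_0", q4_0_bits_per_weight())]:
        gb = estimate_weight_memory_gb(assumed_params, bits)
        print(f"{label}: {gb:.3f} GB")
```

Under these assumptions the 4-bit weights take a bit over 1GB versus 4GB at 16-bit precision. The mobile-specialized format mentioned in the post is not specified there, so it is not modeled here.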

Really cool to see.

DeepMind · Google · Model release · On-device

View the original post on X
