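Google DeepMind releases Gemma 4 12B: an open-source multimodal model that runs on a laptop with 16 GB of RAM
Gemma 4 12B is an open-source model from Google DeepMind that natively processes text, images, and audio and runs on a laptop with just 16 GB of RAM. In benchmarks it nearly matches the 26B model, which has about twice as many parameters. It is released under the Apache 2.0 license, which permits commercial use.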
Google DeepMind's Gemma 4 12B squeezes multimodal AI onto a laptop with just 16 GB of RAM
Google DeepMind has released Gemma 4 12B, an open AI model that brings multimodal capabilities to everyday laptops. It processes text, images, and audio natively without separate encoders, which Google says cuts processing time, memory use, and latency. The model runs locally with just 16 GB of RAM and, according to Google, nearly matches the 26B model, which is roughly twice its size, across benchmarks. It is also the first mid-sized Gemma model with native audio processing.
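To put the 16 GB figure in context, here is a back-of-the-envelope estimate of how much memory the weights alone would need at different numeric precisions. This is only a sketch: it assumes "12B" means roughly 12 billion parameters, the helper `weight_memory_gb` is hypothetical, and the article does not say which precision or quantization the 16 GB claim relies on. It also ignores activations, context cache, and the operating system.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Assumption: "12B" is taken to mean about 12 billion parameters.
PARAMS = 12e9

# Illustrative precisions only; the article does not name the one actually used.
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: ~{weight_memory_gb(PARAMS, bits):.0f} GB")
```

Under these assumptions, 16-bit weights alone would exceed 16 GB, so a laptop-sized footprint would presumably depend on a lower-precision format.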
Gemma 4 12B handles speech recognition, code generation, and video analysis. Per the Developer Guide, it can parse multi-minute video clips by analyzing frames and audio together. In one demo, it chewed through a five-minute Google I/O keynote clip: 313 frames at one per second, plus audio.
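The frame count in that demo follows from simple arithmetic: at one frame per second, 313 frames correspond to a clip of about 313 seconds, or a little over five minutes. The sketch below illustrates fixed-rate sampling in general. The function `sample_timestamps` is a hypothetical helper, and the article does not describe how the demo actually extracts frames.

```python
def sample_timestamps(duration_s: float, fps: float = 1.0) -> list[float]:
    """Timestamps (in seconds) of frames sampled at a fixed rate, starting at t=0."""
    count = int(duration_s * fps)  # number of whole sampling intervals in the clip
    return [i / fps for i in range(count)]


# Assumption: a clip of about 5 min 13 s (313 seconds), sampled once per second.
frames = sample_timestamps(313)
print(len(frames))
```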
The model is available on Hugging Face, Ollama, LM Studio, and other platforms, licensed under Apache 2.0 for commercial use.