Google Launches Gemma 4 12B, an Encoder-Free Multimodal Model

Google AI Developers (@googleaidevs) · X

2026-06-04 00:07 · 13 days ago

AI Summary

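Google has released Gemma 4 12B, a unified, encoder-free multimodal model that feeds vision and audio inputs directly into the LLM backbone, with no traditional multimodal encoders. The model fills the gap between the mobile E4B model and the 26B MoE model, packaging frontier-class reasoning and native audio capabilities under an Apache 2.0 license. With 16GB of VRAM it can run complex, multi-step agentic workflows locally, with performance close to that of the 26B model.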

We're launching Gemma 4 12B: Our unified, encoder-free model that brings powerful multimodal intelligence straight to your laptop 🚀

The model bridges the gap between our mobile E4B model and larger 26B MoE models, packaging frontier-class reasoning and native audio into a highly optimized footprint, all under a permissive Apache 2.0 license.

Here's what makes it unique:

+ Encoder-Less Architecture: We removed the multimodal encoders. The vision and audio inputs flow directly into the LLM backbone.
+ Agentic Performance (16GB VRAM): Run complex, multi-step workflows locally, with performance nearing our 26B model.
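
To put the 16GB VRAM figure in context, here is a rough back-of-the-envelope sketch of how much memory the weights alone might need at different precisions. It assumes that "12B" means roughly 12 billion parameters, which the post does not state explicitly, and it ignores activations, the KV cache, and runtime overhead, so actual usage would be higher. The function name `weight_memory_gib` is made up for illustration.

```python
def weight_memory_gib(num_params: float, bits_per_param: int) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 2**30


PARAMS = 12e9   # assumption: "12B" means about 12 billion parameters
VRAM_GIB = 16   # the VRAM budget mentioned in the post

# Weight footprint at 16-bit, 8-bit, and 4-bit precision.
estimates = {bits: weight_memory_gib(PARAMS, bits) for bits in (16, 8, 4)}

for bits, gib in estimates.items():
    verdict = "fits" if gib < VRAM_GIB else "does not fit"
    print(f"{bits}-bit weights: {gib:.1f} GiB ({verdict} in {VRAM_GIB} GiB)")
```

Under these assumptions, 16-bit weights would exceed a 16 GiB budget, while 8-bit or 4-bit weights would leave headroom for everything else. The post does not say which precision or quantization the 16GB figure refers to.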

Tags: Google · Multimodal · Open source · Ecosystem · Model release · On-device
