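Google AI has released Gemini 3.5 Live Translate, an audio model that gives developers low-latency, real-time speech translation across 70+ languages. The model offers multilingual input (no switching within a session), automatic language detection, native audio processing (preserving speakers' intonation, pacing, and pitch), and noise robustness (filtering out ambient noise), and it can process streaming speech directly.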
Our latest audio model, Gemini 3.5 Live Translate, takes real-time speech translation to the next level for developers by delivering low-latency translation across 70+ languages.
By processing speech as it streams in near real time, the model enables devs to build low-latency audio experiences with:
- Multilingual input: Understands multiple languages in a single session without needing to adjust settings.
- Auto-detection: Identifies the spoken language and begins translation instantly.
- Native audio processing: Generates more natural-sounding speech that preserves speakers' intonation, pacing, and pitch.
- Noise robustness: Filters out ambient noise for clearer conversation in loud environments.
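
To make the "processing speech as it streams" idea concrete, here is a minimal Python sketch of the client-side pattern: slicing raw audio into short chunks and handing each one to a translator as soon as it is available. It does not call the real model. The `send_chunk` callback, the `fake_translator` placeholder, the 16 kHz 16-bit mono PCM format, and the 100 ms chunk size are all assumptions made for illustration, not details from the announcement.

```python
from typing import Callable, Iterator, List

SAMPLE_RATE_HZ = 16_000   # assumed input rate; not stated in the announcement
BYTES_PER_SAMPLE = 2      # assumed 16-bit mono PCM
CHUNK_MS = 100            # assumed chunk duration


def chunk_pcm(pcm: bytes, chunk_ms: int = CHUNK_MS) -> Iterator[bytes]:
    """Yield fixed-duration slices of raw PCM; the last slice may be shorter."""
    chunk_bytes = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * chunk_ms // 1000
    for start in range(0, len(pcm), chunk_bytes):
        yield pcm[start:start + chunk_bytes]


def stream_translate(pcm: bytes, send_chunk: Callable[[bytes], str]) -> List[str]:
    """Feed audio to a translator callback one chunk at a time.

    `send_chunk` is a hypothetical stand-in for a real streaming client call.
    """
    return [send_chunk(chunk) for chunk in chunk_pcm(pcm)]


def fake_translator(chunk: bytes) -> str:
    # Placeholder: reports the chunk size instead of translating anything.
    return f"<{len(chunk)} bytes>"


one_second_of_silence = bytes(SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)
partials = stream_translate(one_second_of_silence, fake_translator)
```

Smaller chunks reduce the delay before the first audio reaches the translator but mean more round trips; the right size for a real integration would depend on the actual API's limits.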