Google Gemini Omni Redefines Video Generation

AYi @AYi_AInotes · X

2026-05-20 02:16 · 26 days ago

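AI Summary

Google has launched Gemini Omni, the first consumer-facing world model. Through natural-language interaction, it combines Gemini's intelligence with generative media systems to achieve a deep understanding of the world, including physics, history, and biology. Users can edit a video with a single-sentence instruction, much as they would edit text in ChatGPT, with support for character consistency, style transfer, camera-angle adjustment, and more. Rather than merely generating pixels, it simulates a coherent physical and semantic world, marking the leap of AI video generation from a splicing tool to an intelligent creation system.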

Damn! Google has really gone wild this time. Gemini Omni is about to blow the roof off video generation 🤯

Making videos used to be like building with Lego blocks: piece by piece, slowly. Now you get a magic Lego factory that can actually think. You chat in natural language, and it understands real-world physics, history, biology, and culture, then directly generates or edits any video.

The five most mind-blowing abilities you can use right now:

1. Understands real physics: glass marbles collide, turn, and bounce in ways that match reality.
2. Faces never get distorted: define a character once, then put them in any scene, doing any action.
3. Edit videos like you edit ChatGPT text: change backgrounds, swap people, or add effects with a single sentence.
4. Upload an image and apply any style: make claymation, visualize protein folding, whatever you imagine.
5. Video isn't a dead file anymore: change angles, lighting, objects, even storylines just by chatting.

This isn't a competitor to Sora. This is the first time a world model has truly entered a consumer-facing product. It's not just generating pixels; it's simulating a coherent physical and semantic world.

Open the Gemini app right now and try Omni Flash. You'll thank me later.

Google DeepMind: "We're dropping Gemini Omni: our first step towards a model that can create anything from anything - starting with video. It combines Gemini's intelligence with ..."

Tags: DeepMind · Google · Image generation · Multimodal · Model release · Video
