NVIDIA在Computex上发布了Nemotron 3 Ultra,总参数达550B(激活参数55B),是目前最大的Nemotron 3模型。该模型在美国开放权重模型中智能性最强,在Artificial Analysis Intelligence Index评测中得分为48,超越了Gemma 4 31B(39分),但仍落后于月之暗面(Kimi)的K2.6(54分)。在推理速度方面,其在预发布端点上超过了300 tokens/s,远高于同级别中国模型通常的50-100 tokens/s。该模型将提供BF16权重及NVFP4量化版本以提升推理性能。
NVIDIA just announced the release of Nemotron 3 Ultra in Jensen Huang's Computex keynote: at 550B parameters (55B active), this is the largest Nemotron 3 model to date, and it is the most intelligent US open weights model
We partnered with @nvidia to evaluate this model for intelligence and speed - these figures use the model's BF16 weights, but as with Nemotron 3 Super the model will be made available in NVFP4 quantization as well for higher inference performance.
➤ New leader for US open weights intelligence: Nemotron 3 Ultra scores 48 on the Artificial Analysis Intelligence Index. This is well ahead of the next strongest US open weights models, Gemma 4 31B (39), Nemotron 3 Super (36) and gpt-oss-120b (33), but behind the Chinese-led open weights frontier (Kimi K2.6 at 54).
➤ Leading speed for its intelligence: on a pre-release @DeepInfra endpoint, Nemotron 3 Ultra served over 300 tokens per second. Peer models in its size class from China-based labs such as DeepSeek and Moonshot (Kimi) are generally served at speeds of 50-100 tokens per second in the market today. gpt-oss-120b is served at speeds similar to this level, but with significantly lower intelligence.
➤ Largest Nemotron 3 model so far: at approximately 550 billion total parameters and 90% sparsity, Nemotron 3 Ultra is significantly larger than its siblings and is the largest recent US open weights model release
We'll be sharing additional analysis and full benchmarks at release.