ZAYA1-8B: a small model with strong reasoning ability, built on a full AMD stack

Chubby♨️@kimmonismus

2026-05-07 19:17 · 39 days ago

AI summary

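Zyphra has released the ZAYA1-8B model, which uses fewer than 1 billion active parameters yet rivals larger open-source and proprietary systems on math, coding, and reasoning benchmarks. The highlight is not only its small size but its full-stack approach: it was trained entirely on AMD infrastructure, with new architectural choices and large-scale reinforcement learning. The model also applies a test-time compute method called Markovian RSA, which significantly improves its ability to solve hard math problems through parallel reasoning and recursive aggregation.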

Zyphra: under 1B active parameters, AMD-trained, big evals. Looks strong?

Zyphra says its new ZAYA1-8B model delivers unusually high reasoning power for its size, using under 1 billion (!) active parameters while competing with much larger open-weight and proprietary systems on math, coding, and reasoning benchmarks.

The interesting part is not just the model's size, but its full-stack bet: AMD-only training infrastructure (!), new architectural choices, large-scale RL, and a test-time compute method called Markovian RSA that appears to boost hard math performance through parallel reasoning and recursive aggregation.

Reasoning · Model release · On-device

View the original post on X
