15.6× faster decoding at 1M tokens 🔥
Thanks @FireworksAI_HQ for powering the inference behind M3.
Try it now 👇
MiniMax M3 arrives with MiniMax Sparse Attention (MSA), 15.6x faster decoding at 1M tokens. We're partnering with @MiniMax_AI to power the inference behind this...