ELF: Embedded Language Flows
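The research team proposes Embedded Language Flows (ELF), a diffusion language model that operates in continuous embedding space and is based on continuous-time Flow Matching. Unlike mainstream discrete diffusion models, ELF stays in the continuous space for nearly all of the sampling process and maps to discrete tokens only at the final step, using a shared-weight network. This design lets it directly borrow established techniques from image diffusion models, such as classifier-free guidance. Experiments show that ELF substantially outperforms current leading discrete and continuous diffusion language models in generation quality and achieves better performance with fewer sampling steps, offering a new path toward effective continuous diffusion language models.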
Diffusion and flow-based models have become the de facto approaches for generating continuous data such as images and videos. Their success has attracted growing interest in applying them to language modeling. Unlike their image-domain counterparts, today's leading diffusion language models (DLMs) primarily operate over discrete tokens. In this paper, we show that continuous DLMs can be made effective with minimal adaptation to the discrete domain. We propose Embedded Language Flows (ELF), a class of diffusion models in continuous embedding space based on continuous-time Flow Matching. Unlike existing DLMs, ELF predominantly stays within the continuous embedding space until the final time step, where it maps to discrete tokens using a shared-weight network. This formulation makes it straightforward to adapt established techniques from image-domain diffusion models, e.g., classifier-free guidance (CFG). Experiments show that ELF substantially outperforms leading discrete and continuous DLMs, achieving better generation quality with fewer sampling steps. These results suggest that ELF offers a promising path toward effective continuous DLMs.
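To make the sampling idea concrete, the following is a minimal toy sketch, not the paper's implementation. It assumes the common linear Flow Matching path x_t = (1 - t) x_0 + t x_1, replaces the learned network with a hand-written oracle velocity field, and decodes by dot-product scores against the embedding table as a stand-in for the shared-weight network. The names `EMBED`, `velocity_toward`, `guided_velocity`, `decode`, and `sample` are all hypothetical. It illustrates three points from the abstract: the state stays a continuous vector throughout Euler integration, CFG combines conditional and unconditional velocities, and discretization happens only after the final step.

```python
import random

# Toy embedding table (assumption): 4 hypothetical tokens in a 3-dim space.
EMBED = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [-1.0, -1.0, -1.0],
]
DIM = 3
# Mean embedding, used here as the "unconditional" endpoint (assumption).
MEAN = [sum(row[d] for row in EMBED) / len(EMBED) for d in range(DIM)]


def velocity_toward(x, t, target):
    """Oracle velocity for the linear path x_t = (1 - t) * x_0 + t * x_1.

    On that path v = x_1 - x_0 = (x_1 - x_t) / (1 - t); a trained model
    would predict this quantity instead of being handed the target.
    """
    return [(target[d] - x[d]) / (1.0 - t) for d in range(DIM)]


def guided_velocity(x, t, token, scale):
    """Classifier-free guidance: v = v_uncond + scale * (v_cond - v_uncond)."""
    v_cond = velocity_toward(x, t, EMBED[token])
    v_uncond = velocity_toward(x, t, MEAN)
    return [u + scale * (c - u) for c, u in zip(v_cond, v_uncond)]


def decode(x):
    """Map a continuous vector to a token id by highest dot-product score."""
    scores = [sum(a * b for a, b in zip(x, row)) for row in EMBED]
    return max(range(len(EMBED)), key=scores.__getitem__)


def sample(token, steps=8, scale=1.5, seed=0):
    """Euler-integrate from Gaussian noise at t=0 to t=1, then discretize once."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(DIM)]
    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt  # t never reaches 1, so (1 - t) stays positive
        v = guided_velocity(x, t, token, scale)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return decode(x), x  # the only discrete operation is this final decode


if __name__ == "__main__":
    for tok in range(len(EMBED)):
        print(tok, sample(tok)[0])
```

In this toy setup a guidance scale of 1 recovers the purely conditional velocity, while larger scales push the endpoint further from the unconditional mean along the conditional direction. How ELF parameterizes its network and its final decoding step is not specified in the abstract, so those parts are placeholders.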