
\begin{abstract}

We propose a new generative model of text, \fullmodel{} (\model{}), that does not rely on autoregressive models. Like denoising diffusion techniques, \model{} is applied repeatedly to a sequence of tokens, starting from random inputs and improving them at each step until convergence. We present a simple new improvement operator that converges in fewer iterations than diffusion methods while producing qualitatively better samples on natural language datasets. \model{} achieves state-of-the-art results among non-autoregressive methods on the WMT'14 English-to-German translation task, and good qualitative results on unconditional language modeling on the Colossal Clean Crawled Corpus (C4) and \github{}. Because \model{} is non-autoregressive, it opens up possibilities beyond left-to-right prompted generation, such as filling in arbitrary blank patterns in a template.

\end{abstract}
