How to generate text: using different decoding methods for language generation with Transformers
Note: Edited on July 2023 with up-to-date references and examples. Introduction In recent years, there has been an increasing interest in open-ended language generation thanks to the rise of large transformer-based language models trained on millions of webpages, including OpenAI’s ChatGPT and Meta’s LLaMA. The results on conditioned open-ended language generation are impressive, having shown to generalize to new tasks, handle code, or take non-text data as input. Besides the improved transformer architecture and massive unsupervised training data, better decoding […]
Read more