Transformers-based Encoder-Decoder Models
!pip install transformers==4.2.1
!pip install sentencepiece==0.1.95

The transformer-based encoder-decoder model was introduced by Vaswani et al. in the famous "Attention Is All You Need" paper and is today the de facto standard encoder-decoder architecture in natural language processing (NLP). Recently, there has been a lot of research on different pre-training objectives for transformer-based encoder-decoder models, e.g. T5, Bart, Pegasus, ProphetNet, and Marge, but the model architecture has stayed largely the same. The goal of the blog post is to give an […]