Leveraging Pre-trained Language Model Checkpoints for Encoder-Decoder Models
Transformer-based encoder-decoder models were proposed in Vaswani et al. (2017) and have recently experienced a surge of interest, e.g. Lewis et al. (2019), Raffel et al. (2019), Zhang et al. (2020), Zaheer et al. (2020), Yan et al. (2020).
Similar