Issue #46 – Augmenting Self-attention with Persistent Memory

18 Jul 2019
Author: Dr. Rohit Gupta, Sr. Machine Translation Scientist @ Iconic

In Issue #32 we introduced the Transformer model as the new state of the art in Neural Machine Translation. Subsequently, in Issue #41, we looked at some approaches aiming to improve upon it. In this post, we take a look at a significant change to the Transformer model, proposed by Sukhbaatar et al. (2019), which further improves its performance. Each Transformer layer consists of two types […]
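The excerpt cuts off before the details, so here is a minimal, hypothetical PyTorch sketch of the core idea in Sukhbaatar et al. (2019): learned "persistent" vectors concatenated to the keys and values of a self-attention layer, so that every token can attend to these shared slots alongside the actual context (in the paper, this lets the feed-forward sublayer be dropped). Names such as PersistentMemoryAttention and n_persistent are illustrative placeholders, not taken from the paper or this post, and the single-head formulation below omits the paper's normalization and multi-head details.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PersistentMemoryAttention(nn.Module):
    """Single-head self-attention augmented with learned persistent
    key/value vectors (a sketch, not the paper's exact formulation)."""

    def __init__(self, d_model: int, n_persistent: int):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        # Learned persistent key/value slots, shared across all positions.
        self.pers_k = nn.Parameter(torch.randn(n_persistent, d_model))
        self.pers_v = nn.Parameter(torch.randn(n_persistent, d_model))
        self.scale = d_model ** -0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        batch = x.size(0)
        q = self.q_proj(x)
        # Concatenate the persistent vectors to the projected keys and
        # values, so each query attends over context + persistent slots.
        k = torch.cat([self.k_proj(x), self.pers_k.expand(batch, -1, -1)], dim=1)
        v = torch.cat([self.v_proj(x), self.pers_v.expand(batch, -1, -1)], dim=1)
        attn = F.softmax((q @ k.transpose(1, 2)) * self.scale, dim=-1)
        return attn @ v  # (batch, seq_len, d_model)

# Example usage with arbitrary sizes:
layer = PersistentMemoryAttention(d_model=512, n_persistent=16)
out = layer(torch.randn(2, 10, 512))  # -> (2, 10, 512)
```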
