Implementing the Transformer Decoder from Scratch in TensorFlow and Keras

The Transformer encoder and decoder share many components: both rely on multi-head attention and layer normalization, and both use a fully connected feed-forward network as their final sub-layer. Having implemented the Transformer encoder, we will now apply that knowledge to implementing the Transformer decoder as a further step toward building the complete Transformer model. Your end goal remains to apply the complete model to Natural Language Processing (NLP).
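To make this structure concrete: the decoder differs from the encoder in two ways. Its self-attention is masked so that each position can attend only to earlier positions in the target sequence, and it adds a second attention sub-layer that attends over the encoder's output. Below is a minimal sketch of one decoder layer built on Keras' off-the-shelf `MultiHeadAttention` rather than the from-scratch components this tutorial develops. It assumes TensorFlow 2.10 or later (for the `use_causal_mask` argument), the class and parameter names (`DecoderLayer`, `d_ff`, and so on) are illustrative only, and padding masks are omitted for brevity.

```python
import tensorflow as tf
from tensorflow.keras.layers import (Layer, Dense, Dropout,
                                     LayerNormalization, MultiHeadAttention)


class DecoderLayer(Layer):
    """One decoder layer: masked self-attention, encoder-decoder
    attention, and a position-wise feed-forward network, each wrapped
    in a residual connection followed by layer normalization."""

    def __init__(self, d_model, num_heads, d_ff, dropout_rate=0.1, **kwargs):
        super().__init__(**kwargs)
        self.self_attention = MultiHeadAttention(num_heads=num_heads,
                                                 key_dim=d_model // num_heads)
        self.cross_attention = MultiHeadAttention(num_heads=num_heads,
                                                  key_dim=d_model // num_heads)
        # Position-wise fully connected feed-forward network
        self.ffn = tf.keras.Sequential([Dense(d_ff, activation="relu"),
                                        Dense(d_model)])
        self.norm1, self.norm2, self.norm3 = (LayerNormalization() for _ in range(3))
        self.drop1, self.drop2, self.drop3 = (Dropout(dropout_rate) for _ in range(3))

    def call(self, x, encoder_output, training=False):
        # Sub-layer 1: masked self-attention; the causal mask stops each
        # position from attending to later positions in the target.
        attn = self.self_attention(query=x, value=x, key=x, use_causal_mask=True)
        x = self.norm1(x + self.drop1(attn, training=training))

        # Sub-layer 2: attend over the encoder's output (cross-attention).
        attn = self.cross_attention(query=x, value=encoder_output, key=encoder_output)
        x = self.norm2(x + self.drop2(attn, training=training))

        # Sub-layer 3: position-wise feed-forward network.
        return self.norm3(x + self.drop3(self.ffn(x), training=training))
```

Under those assumptions, a quick shape check confirms the layer preserves the model dimension:

```python
layer = DecoderLayer(d_model=512, num_heads=8, d_ff=2048)
tgt = tf.random.uniform((64, 20, 512))  # (batch, target length, d_model)
enc = tf.random.uniform((64, 25, 512))  # encoder output
print(layer(tgt, enc).shape)            # (64, 20, 512)
```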

In this tutorial, you will discover how to implement the Transformer decoder from scratch in TensorFlow and Keras. 

After completing this tutorial, you will know:

  • The layers that form part of the Transformer decoder
  • How to implement the Transformer decoder from scratch (a sketch of the full decoder stack follows this list)
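As a preview of where those pieces lead, here is a hedged sketch of how such layers stack into a full decoder: target tokens are embedded, scaled, combined with sinusoidal positional encodings, and passed through a stack of decoder layers. It reuses the `DecoderLayer` class (and the imports) from the sketch above; the `positional_encoding` helper and all hyperparameter names are illustrative, not the tutorial's own.

```python
import numpy as np
from tensorflow.keras.layers import Embedding


def positional_encoding(length, d_model):
    # Sinusoidal positional encodings from "Attention Is All You Need".
    pos = np.arange(length)[:, np.newaxis]
    i = np.arange(d_model)[np.newaxis, :]
    angles = pos / np.power(10000.0, (2 * (i // 2)) / np.float32(d_model))
    angles[:, 0::2] = np.sin(angles[:, 0::2])  # even indices: sine
    angles[:, 1::2] = np.cos(angles[:, 1::2])  # odd indices: cosine
    return tf.cast(angles[np.newaxis, ...], tf.float32)


class Decoder(Layer):
    """Embeds target tokens, adds positional information, then applies
    a stack of num_layers DecoderLayer instances (sketched above)."""

    def __init__(self, vocab_size, max_seq_len, num_layers, d_model,
                 num_heads, d_ff, dropout_rate=0.1, **kwargs):
        super().__init__(**kwargs)
        self.d_model = d_model
        self.embedding = Embedding(vocab_size, d_model)
        self.pos_encoding = positional_encoding(max_seq_len, d_model)
        self.dropout = Dropout(dropout_rate)
        self.decoder_layers = [DecoderLayer(d_model, num_heads, d_ff, dropout_rate)
                               for _ in range(num_layers)]

    def call(self, x, encoder_output, training=False):
        seq_len = tf.shape(x)[1]
        # Scale embeddings by sqrt(d_model) and add positional encodings.
        x = self.embedding(x) * tf.math.sqrt(tf.cast(self.d_model, tf.float32))
        x = x + self.pos_encoding[:, :seq_len, :]
        x = self.dropout(x, training=training)
        for layer in self.decoder_layers:
            x = layer(x, encoder_output, training=training)
        return x
```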
