The Attention Mechanism from Scratch

The attention mechanism was introduced to improve the performance of the encoder-decoder model for machine translation. The idea behind the attention mechanism was to permit the decoder to utilize the most relevant parts of the input sequence in a flexible manner, by a weighted combination of all the encoded input vectors, with the most relevant vectors being attributed the highest weights. 

In this tutorial, you will discover the attention mechanism and its implementation. 

After completing this tutorial, you will know:

  • How the attention mechanism uses a weighted sum of all the encoder hidden states to flexibly focus the attention of the decoder on the most relevant parts of the input sequence
  • How the attention mechanism

     

     

    To finish reading, please visit source site