A Gentle Introduction to Global Attention for Encoder-Decoder Recurrent Neural Networks

Last Updated on August 14, 2019

The encoder-decoder model provides a pattern for using recurrent neural networks to address challenging sequence-to-sequence prediction problems such as machine translation.

Attention is an extension to the encoder-decoder model that improves its performance on longer input sequences. Global attention is a simplification of the attention mechanism that may be easier to implement in declarative deep learning libraries like Keras, and may achieve better results than the classic attention mechanism.
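To make the idea concrete before the full discussion below, here is a minimal NumPy sketch of the core global attention computation using dot-product scoring: the current decoder hidden state is scored against every encoder hidden state, the scores are normalized with a softmax into alignment weights, and those weights produce a context vector as a weighted sum of the encoder states. The array shapes and variable names here are illustrative assumptions for this sketch, not code from the book or from Keras.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array
    e = np.exp(x - np.max(x))
    return e / e.sum()

# Toy dimensions, assumed for illustration: 4 encoder time steps, hidden size 3
encoder_states = np.random.rand(4, 3)  # one hidden state per source time step
decoder_state = np.random.rand(3)      # decoder hidden state at the current step

# Score every encoder state against the decoder state (dot-product scoring)
scores = encoder_states @ decoder_state   # shape: (4,)

# Normalize the scores into alignment weights that sum to 1
weights = softmax(scores)                 # shape: (4,)

# The context vector is the attention-weighted sum of all encoder states
context = weights @ encoder_states        # shape: (3,)
print(context)
```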

In this post, you will discover the global attention mechanism for encoder-decoder recurrent neural network models.

After reading this post, you will know:

  • The encoder-decoder model for sequence-to-sequence prediction problems such as machine translation.
  • The attention mechanism that improves the performance of encoder-decoder models on long sequences.
  • The global attention mechanism, a simplification of attention that may achieve better results.

Kick-start your project with my new book Long Short-Term Memory Networks With Python, including step-by-step tutorials and the Python source code files for all examples.

Let’s get started.