Implementation of the Attention Mechanism for Caption Generation with Transformers Using TensorFlow

Overview

  • Learn about the Transformer, a state-of-the-art model architecture.
  • Understand how to implement the Transformer for the image captioning problem (covered previously) using TensorFlow.
  • Compare the results of the Transformer with those of attention-based models.

 

Introduction

In the previous article, we saw that attention mechanisms have become an integral part of compelling sequence modeling and transduction models for various tasks (such as image captioning), allowing dependencies to be modeled without regard to their distance in the input or output sequences.

[Figure: attention mechanism for image captioning]
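As a quick refresher, a minimal sketch of the Bahdanau-style (additive) attention commonly used for image captioning is shown below. The layer name and the units parameter here are illustrative, not code from a specific implementation:

```python
import tensorflow as tf

# Illustrative sketch of additive (Bahdanau) attention for image captioning.
class BahdanauAttention(tf.keras.layers.Layer):
    def __init__(self, units):
        super().__init__()
        self.W1 = tf.keras.layers.Dense(units)  # projects image features
        self.W2 = tf.keras.layers.Dense(units)  # projects decoder hidden state
        self.V = tf.keras.layers.Dense(1)       # scores each spatial location

    def call(self, features, hidden):
        # features: (batch, num_locations, feature_dim) image feature map
        # hidden:   (batch, hidden_dim) current decoder state
        hidden_with_time = tf.expand_dims(hidden, 1)
        scores = self.V(tf.nn.tanh(self.W1(features) + self.W2(hidden_with_time)))
        attention_weights = tf.nn.softmax(scores, axis=1)  # over spatial locations
        context_vector = tf.reduce_sum(attention_weights * features, axis=1)
        return context_vector, attention_weights
```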

The Transformer is a model architecture that eschews recurrence and instead relies entirely on an attention mechanism to draw global dependencies between input and output. This architecture allows for significantly more parallelization and has reached new state-of-the-art results in translation quality.
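At the core of the Transformer is scaled dot-product attention. A rough sketch of the idea in TensorFlow (not the exact code used later in this article) might look like this:

```python
import tensorflow as tf

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (..., seq_len, depth); mask broadcasts to (..., seq_len_q, seq_len_k)
    matmul_qk = tf.matmul(q, k, transpose_b=True)     # raw similarity scores
    dk = tf.cast(tf.shape(k)[-1], tf.float32)
    scaled_logits = matmul_qk / tf.math.sqrt(dk)      # scale by sqrt(d_k)
    if mask is not None:
        scaled_logits += (mask * -1e9)                # suppress masked positions
    weights = tf.nn.softmax(scaled_logits, axis=-1)   # attention distribution
    return tf.matmul(weights, v), weights             # weighted sum of values
```

Because every position attends to every other position in a single matrix multiplication, there is no sequential recurrence, which is what makes the architecture so parallelizable.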

In this article, let’s see how you can implement the Attention Mechanism for Caption Generation with Transformers using TensorFlow.

Prerequisites

This article assumes familiarity with the attention mechanism for image captioning covered in the previous article, as well as a working knowledge of TensorFlow.

