Issue #56 – Scalable Adaptation for Neural Machine Translation

17 October 2019


Author: Raj Patel, Machine Translation Scientist @ Iconic

Although current research has explored numerous approaches for adapting Neural MT engines to different languages and domains, fine-tuning remains the most common one. In fine-tuning, the parameters of a pre-trained model are updated for the target language or domain in question. However, fine-tuning requires training and maintaining a separate model for each target task (i.e. a separate MT engine for every domain, client, language, etc.). In addition to the growing number of models, fine-tuning requires very careful tuning of hyper-parameters (e.g. learning rate, regularisation) during adaptation and is prone to rapid overfitting, a sensitivity that only worsens for high-capacity (larger) models. In this post, we will discuss a simple yet efficient approach to handling multiple domains and languages proposed by Bapna et al., 2019.
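To make the scalability problem concrete, here is a minimal, illustrative sketch (not code from the paper) of what per-task fine-tuning implies in practice: every new domain or language gets a full copy of the pre-trained model whose parameters are all updated. The model interface and hyper-parameter values below are assumptions for illustration only.

```python
import copy
import torch

def fine_tune_per_task(pretrained_model, task_loaders, lr=1e-5, steps=1000):
    """Naive fine-tuning: one full model copy per target task.

    Storage and maintenance grow linearly with the number of
    domains/clients/languages, and the learning rate / number of steps
    must be tuned carefully for each task to avoid rapid overfitting.
    """
    task_models = {}
    for task, loader in task_loaders.items():
        model = copy.deepcopy(pretrained_model)                # full copy per task
        optimizer = torch.optim.Adam(model.parameters(), lr=lr)
        model.train()
        for _, (src, tgt) in zip(range(steps), loader):
            loss = model(src, tgt)                             # assumes the model returns a loss
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        task_models[task] = model                              # a separate engine per task
    return task_models
```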

[Figure 1: Scalable Adaptation for NMT]

Approach

The proposed approach consists of two phases: 

  1. Training a generic base model 
  2. Adapting it to new tasks (a sketch of the adaptation idea follows below)
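In the adaptation phase, Bapna et al., 2019 keep the generic base model frozen and train only small, task-specific residual adapter modules injected into the network. The sketch below shows one such bottleneck adapter; the class names, dimensions and helper function are assumptions for illustration, not the authors' code.

```python
import torch
import torch.nn as nn

class ResidualAdapter(nn.Module):
    """Lightweight bottleneck adapter in the spirit of Bapna et al., 2019:
    layer norm -> down-projection -> ReLU -> up-projection, with a residual
    connection back to the input."""

    def __init__(self, d_model=512, d_bottleneck=64):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.down = nn.Linear(d_model, d_bottleneck)
        self.up = nn.Linear(d_bottleneck, d_model)

    def forward(self, x):
        return x + self.up(torch.relu(self.down(self.norm(x))))

def adapter_parameters(base_model, adapters):
    """Freeze the shared base model and return only the per-task adapter
    parameters, which are the only weights passed to the optimiser."""
    for p in base_model.parameters():
        p.requires_grad = False
    return [p for adapter in adapters for p in adapter.parameters()]
```

Because only the adapters are updated, a single shared base model can serve many domains and languages, with each new task adding just a small set of parameters rather than a full fine-tuned engine.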
