Articles About Natural Language Processing

Issue #82 – Constrained Decoding using Levenshtein Transformer

14 May 2020 | Author: Raj Patel, Machine Translation Scientist @ Iconic
In constrained decoding, we force in-domain terminology to appear in the final translation. We have discussed constrained decoding in earlier blog posts (#7, #9, #79). In this blog post, we will discuss a simple and effective algorithm for incorporating lexical constraints in Neural Machine Translation (NMT), proposed by Susanto et al. (2020), and try to understand how it is better than […]
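As we understand the approach, the constraint tokens are placed into the Levenshtein Transformer's initial target sequence and are then protected from the model's deletion operation during iterative refinement. The snippet below is a simplified, illustrative sketch of that idea only, not the authors' implementation; all helper names are invented and the deletion probabilities would normally come from the model's deletion classifier.

```python
# Simplified sketch: seed decoding with the lexical constraints and never delete them.

def init_with_constraints(constraint_phrases):
    """Start refinement from the constraint tokens rather than an empty sequence."""
    tokens = [tok for phrase in constraint_phrases for tok in phrase]
    seq = ["<s>"] + tokens + ["</s>"]
    protected = [False] + [True] * len(tokens) + [False]  # constraint positions
    return seq, protected

def apply_deletion(seq, protected, delete_probs, threshold=0.5):
    """Drop tokens the deletion step marks for removal, except protected constraint tokens."""
    keep = [prot or p < threshold for prot, p in zip(protected, delete_probs)]
    seq = [tok for tok, k in zip(seq, keep) if k]
    protected = [prot for prot, k in zip(protected, keep) if k]
    return seq, protected

seq, protected = init_with_constraints([["machine", "translation"]])
print(seq)  # ['<s>', 'machine', 'translation', '</s>']
# delete_probs is a stand-in for the model's deletion scores at one refinement step
seq, protected = apply_deletion(seq, protected, delete_probs=[0.1, 0.9, 0.8, 0.1])
print(seq)  # the constraint tokens survive even with high deletion scores
```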

Read more

Issue #81 – Evaluating Human-Machine Parity in Language Translation: part 2

07 May 2020 | Author: Dr. Sheila Castilho, Post-Doctoral Researcher @ ADAPT Research Centre
This is the second in a 2-part post addressing machine translation quality evaluation – an overarching topic regardless of the underlying algorithms. Following our own summary last week, this week we are delighted to have one of the paper’s authors, Dr. Sheila Castilho, give her take on the paper, their motivations for writing it, and where we […]

Read more

Issue #79 – Merging Terminology into Neural Machine Translation

23 Apr 2020 | Author: Dr. Patrik Lambert, Machine Translation Scientist @ Iconic
After several years as the state of the art in Machine Translation, neural MT still doesn’t have a convenient way to enforce the translation of custom terms according to a glossary. In issue #7, we reviewed several approaches to handling terminology in neural MT. Just adding the glossary to the training data is not effective. Replacing the source term by a […]
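The excerpt mentions replacing the source term with a placeholder, the classic workaround for glossary terms. Here is a minimal sketch of that approach, with an invented one-entry glossary and no real MT system attached: the masked sentence would be sent to the translator, and the glossary target term substituted back into the output afterwards.

```python
# Placeholder approach: mask glossary terms before translation, restore them after.
import re

GLOSSARY = {"neural network": "red neuronal"}  # illustrative EN->ES entry

def mask_terms(source, glossary):
    mapping = {}
    for i, (src_term, tgt_term) in enumerate(glossary.items()):
        placeholder = f"__TERM{i}__"
        if re.search(re.escape(src_term), source, flags=re.IGNORECASE):
            source = re.sub(re.escape(src_term), placeholder, source, flags=re.IGNORECASE)
            mapping[placeholder] = tgt_term
    return source, mapping

def unmask_terms(translation, mapping):
    for placeholder, tgt_term in mapping.items():
        translation = translation.replace(placeholder, tgt_term)
    return translation

masked, mapping = mask_terms("The neural network converged.", GLOSSARY)
print(masked)    # "The __TERM0__ converged." goes to the MT system
print(unmask_terms("El __TERM0__ convergió.", mapping))  # placeholder restored in the output
```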

Read more

Issue #78 – Balancing Training data for Multilingual Neural MT

16 Apr 2020 | Author: Raj Patel, Machine Translation Scientist @ Iconic
Multilingual Neural MT (MNMT) can translate to/from multiple languages, but in model training we are faced with imbalanced training sets. This means that some languages have much more training data than others. In general, we up-sample the low-resource languages to balance the representation. However, the degree of up-sampling has a large effect on the overall performance of the model. […]
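One common way to control the degree of up-sampling is temperature-based sampling, where a language is drawn with probability proportional to its data size raised to 1/T: T=1 follows the data distribution, and larger T flattens it towards uniform. A small sketch with made-up corpus sizes:

```python
# Temperature-based sampling over per-language corpus sizes (sizes are invented).
corpus_sizes = {"fr": 10_000_000, "hi": 500_000, "ga": 50_000}

def sampling_probs(sizes, temperature):
    """p(lang) proportional to size ** (1/temperature)."""
    weights = {lang: n ** (1.0 / temperature) for lang, n in sizes.items()}
    total = sum(weights.values())
    return {lang: w / total for lang, w in weights.items()}

print(sampling_probs(corpus_sizes, temperature=1.0))  # proportional to data size
print(sampling_probs(corpus_sizes, temperature=5.0))  # low-resource languages up-sampled
```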

Read more

Issue #77 – Neural MT with Subword Units Using BPE-Dropout

09 Apr 2020 | Author: Dr. Chao-Hong Liu, Machine Translation Scientist @ Iconic
The ability to translate subword units enables machine translation (MT) systems to translate rare words that might not appear in the training data used to build MT models. Ideally, we don’t want to find these subword units (and their corresponding translated “segments”) as a preprocessing procedure; it would be much easier if we could recognise them directly, and automatically, […]
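To make the BPE-dropout idea concrete, here is a toy sketch with an invented merge table: when a word is segmented with the learned merges, each eligible merge is skipped with probability p, so the same word can receive different subword segmentations from one training step to the next. This is a simplification of the real algorithm, for illustration only.

```python
# Toy BPE segmentation with merge dropout.
import random

MERGES = [("l", "o"), ("lo", "w"), ("e", "r"), ("low", "er")]  # invented merge table

def bpe_segment(word, merges, dropout=0.0):
    """Apply merges in table order, randomly skipping each merge with probability `dropout`."""
    symbols = list(word)
    for left, right in merges:
        i = 0
        while i < len(symbols) - 1:
            if symbols[i] == left and symbols[i + 1] == right and random.random() >= dropout:
                symbols[i:i + 2] = [left + right]  # apply the merge
            else:
                i += 1
    return symbols

random.seed(0)
print(bpe_segment("lower", MERGES, dropout=0.0))  # deterministic BPE: ['lower']
print(bpe_segment("lower", MERGES, dropout=0.5))  # stochastic segmentation, varies per call
```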

Read more

Issue #74 – Transfer Learning for Neural Machine Translation

20 Mar 2020 | Author: Dr. Chao-Hong Liu, Machine Translation Scientist @ Iconic
Building machine translation (MT) for low-resource languages is a challenging task. This is especially true when training using neural MT (NMT) methods that require a comparatively larger corpus of parallel data. In this post, we review the work done by Zoph et al. (2016) on training NMT systems for low-resource languages using transfer learning. The idea of transfer […]
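The parent-child recipe reviewed in that post can be sketched as follows: train a parent model on a high-resource language pair, copy its parameters into a child model, and fine-tune the child on the low-resource pair. The code below is a hedged illustration using a tiny stand-in model, not a real NMT system; parameters whose shapes differ (here, the source embeddings) are left randomly initialised.

```python
# Parent-child weight transfer, illustrated with a toy model in PyTorch.
import torch.nn as nn

class TinyNMT(nn.Module):
    """A stand-in for a real NMT model, just to show weight transfer."""
    def __init__(self, src_vocab_size, hidden=256):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab_size, hidden)
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)

parent = TinyNMT(src_vocab_size=32000)  # high-resource parent (e.g. French-English)
child = TinyNMT(src_vocab_size=8000)    # low-resource child (e.g. Uzbek-English)

# Copy every parent parameter whose shape matches the child's.
child_state = child.state_dict()
transferred = {k: v for k, v in parent.state_dict().items()
               if k in child_state and v.shape == child_state[k].shape}
child_state.update(transferred)
child.load_state_dict(child_state)
print(sorted(transferred))  # encoder/decoder weights transferred, src_embed left random
# ...the child would then be fine-tuned on the low-resource parallel data.
```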

Read more

Issue #73 – Mixed Multi-Head Self-Attention for Neural MT

12 Mar 2020 | Author: Dr. Patrik Lambert, Machine Translation Scientist @ Iconic
Self-attention is a key component of the Transformer, a state-of-the-art neural machine translation architecture. In the Transformer, self-attention is divided into multiple heads to allow the system to independently attend to information from different representation subspaces. Recently it has been shown that some redundancy occurs in the multiple heads. In this post, we take a look at approaches which ensure […]
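For readers who want to see what "dividing self-attention into multiple heads" looks like in code, here is a minimal sketch of vanilla multi-head self-attention (the standard Transformer formulation, not the mixed-attention variants discussed in the post); the weights are random, for shape-checking only.

```python
# Minimal multi-head self-attention: project, split into heads, scaled dot product, recombine.
import torch
import torch.nn.functional as F

def multi_head_self_attention(x, w_q, w_k, w_v, w_o, num_heads):
    batch, seq_len, d_model = x.shape
    d_head = d_model // num_heads

    def split(w):  # project and reshape to (batch, heads, seq_len, d_head)
        return (x @ w).view(batch, seq_len, num_heads, d_head).transpose(1, 2)

    q, k, v = split(w_q), split(w_k), split(w_v)
    scores = q @ k.transpose(-2, -1) / d_head ** 0.5   # scaled dot-product scores per head
    attn = F.softmax(scores, dim=-1)
    out = (attn @ v).transpose(1, 2).reshape(batch, seq_len, d_model)  # concat heads
    return out @ w_o

d_model, heads = 512, 8
x = torch.randn(2, 10, d_model)
w = [torch.randn(d_model, d_model) / d_model ** 0.5 for _ in range(4)]
print(multi_head_self_attention(x, *w, num_heads=heads).shape)  # torch.Size([2, 10, 512])
```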

Read more

Issue #69 – Using Paraphrases in Multilingual Neural MT

13 Feb 2020 | Author: Dr. Chao-Hong Liu, Machine Translation Scientist @ Iconic
Paraphrasing is common in human languages: there are many possible sentences that can be used to express the same meaning. From an MT perspective, we wanted to train systems that could not only translate sentences that bear similar meaning in one language into a sentence in another language, […]

Read more

Issue #68 – Incorporating BERT in Neural MT

07 Feb 2020 | Author: Raj Patel, Machine Translation Scientist @ Iconic
BERT (Bidirectional Encoder Representations from Transformers) has shown impressive results in various Natural Language Processing (NLP) tasks. However, how to effectively apply BERT in Neural MT has not been fully explored. In general, BERT is fine-tuned for downstream NLP tasks. For Neural MT, a pre-trained BERT model is used to initialise the encoder in an encoder-decoder architecture. In this post we […]
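The simple "BERT as encoder initialisation" setup the excerpt refers to can be sketched as loading a pre-trained BERT model and letting an NMT decoder attend to its hidden states. The snippet below is only an illustration of that wiring, assuming the Hugging Face transformers library and a multilingual BERT checkpoint; the decoder here is a generic placeholder, not a trained translation model.

```python
# Use a pre-trained BERT model as the NMT encoder; the decoder is a stand-in.
import torch.nn as nn
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-multilingual-cased")
encoder = BertModel.from_pretrained("bert-base-multilingual-cased")

decoder = nn.TransformerDecoder(  # placeholder decoder matching BERT's hidden size
    nn.TransformerDecoderLayer(d_model=768, nhead=8, batch_first=True),
    num_layers=6,
)

inputs = tokenizer("Neural machine translation works well.", return_tensors="pt")
encoder_states = encoder(**inputs).last_hidden_state  # (1, seq_len, 768)
# During training, the decoder attends to encoder_states via cross-attention;
# the BERT encoder can be kept frozen or fine-tuned along with the decoder.
```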

Read more

Issue #67 – Unsupervised Adaptation of Neural MT with Iterative Back-Translation

30 Jan 2020 | Author: Dr. Patrik Lambert, Machine Translation Scientist @ Iconic
The most popular domain adaptation approach, when some in-domain data are available, is to fine-tune the training of the generic model with the in-domain corpus. When no parallel in-domain data are available, the most popular approach is back-translation, which consists of translating monolingual target in-domain data into the source language and using it as a training corpus. In this […]
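The iterative version of back-translation alternates between the two translation directions: each model back-translates monolingual in-domain data for the other, the synthetic pairs are used for fine-tuning, and the loop repeats. The sketch below is purely schematic; translate() and train_on() are hypothetical placeholders for whatever NMT toolkit is in use, not a real API.

```python
# Schematic iterative back-translation loop (placeholder model interface).
def iterative_back_translation(src2tgt, tgt2src, mono_src, mono_tgt, rounds=3):
    for _ in range(rounds):
        # Back-translate target-side in-domain monolingual data into the source language.
        synthetic_src = [tgt2src.translate(t) for t in mono_tgt]
        src2tgt.train_on(list(zip(synthetic_src, mono_tgt)))   # fine-tune src->tgt

        # Symmetrically, back-translate source-side monolingual data for the reverse model.
        synthetic_tgt = [src2tgt.translate(s) for s in mono_src]
        tgt2src.train_on(list(zip(synthetic_tgt, mono_src)))   # fine-tune tgt->src
    return src2tgt, tgt2src
```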

Read more