Issue #92 – The Importance of References in Evaluating MT Output

30 Jul 2020 | Author: Dr. Carla Parra Escartín, Global Program Manager @ Iconic

Over the years, BLEU has become the “de facto standard” for automatic Machine Translation evaluation. However, despite being the metric referenced in virtually all MT research papers, it is equally criticized for not providing a reliable evaluation of the MT output. In today’s blog post we look at the work done by Freitag et al. […]
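For readers who have never computed BLEU themselves, here is a minimal sketch using the sacrebleu library. It is purely illustrative and not taken from the work discussed in the post; the toy hypothesis and reference sentences are invented for the example.

    # Minimal corpus-level BLEU sketch with sacrebleu (toy, invented data).
    import sacrebleu

    hypotheses = [
        "the cat sat on the mat",
        "machine translation is improving quickly",
    ]
    # sacrebleu expects a list of reference streams: each inner list holds
    # one reference per hypothesis, in the same order as the hypotheses.
    references = [
        ["the cat sat on the mat",
         "machine translation is improving rapidly"],
    ]

    bleu = sacrebleu.corpus_bleu(hypotheses, references)
    print(f"BLEU = {bleu.score:.2f}")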


Issue #91 – Translating Translationese: A Two-Step Approach to Unsupervised Machine Translation

23 Jul 2020 | Author: Dr. Chao-Hong Liu, Machine Translation Scientist @ Iconic

Unsupervised Machine Translation (MT) is the technology we use to train MT engines when parallel data is not used, at least not directly. We have discussed some interesting approaches to unsupervised MT in several previous posts (Issues #11 and #28), along with some related topics (Issues #6, #25 and #66). Training MT engines requires the existence […]


Issue #90 – Tangled up in BLEU: Reevaluating how we evaluate automatic metrics in Machine Translation

16 Jul 2020 | Author: Dr. Karin Sim, Machine Translation Scientist @ Iconic

Automatic metrics play a crucial role in Machine Translation (MT). They are used to tune MT systems during the development phase, to determine which model is best, and subsequently to determine the accuracy of the final translations. Currently, the performance of these automatic metrics is judged by seeing how well they […]
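To make the "evaluating the evaluators" idea concrete, here is a toy sketch of the conventional procedure of correlating a metric's system-level scores with human judgments, using scipy. The system scores below are invented for illustration, not results from the paper.

    # Toy sketch: judge an automatic metric by its correlation with human scores.
    from scipy.stats import pearsonr, kendalltau

    # One score per MT system, from the metric and from human evaluation (invented).
    metric_scores = [28.4, 31.2, 25.9, 33.0, 30.1]
    human_scores = [67.0, 71.5, 62.3, 74.2, 70.8]

    r, _ = pearsonr(metric_scores, human_scores)
    tau, _ = kendalltau(metric_scores, human_scores)
    print(f"Pearson r = {r:.3f}, Kendall tau = {tau:.3f}")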


Issue #89 – Norm-Based Curriculum Learning for Neural Machine Translation

09 Jul 2020 | Author: Dr. Patrik Lambert, Senior Machine Translation Scientist @ Iconic

Neural machine translation (NMT) models benefit from large amounts of data. However, in high-resource conditions, training these models is computationally expensive. In this post we take a look at a paper from Liu et al. (2020) that aims to improve training efficiency by introducing a curriculum learning method based on the word embedding norm. The […]
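As a rough, simplified sketch of the general idea (not the authors' implementation), one can score each training sentence by the norms of its word embeddings and present lower-norm, "easier" sentences first. The embedding table and toy corpus below are invented, and averaging the norms is just one possible difficulty proxy.

    # Rough sketch of a norm-based curriculum: order sentences easy-to-hard
    # by the average L2 norm of their word embeddings (toy, invented data).
    import numpy as np

    rng = np.random.default_rng(0)
    vocab = {"the": 0, "cat": 1, "sat": 2, "on": 3, "mat": 4, "translation": 5}
    embeddings = rng.normal(size=(len(vocab), 8))  # stand-in for trained embeddings

    def difficulty(sentence: str) -> float:
        """Average embedding norm over the sentence's known words."""
        ids = [vocab[w] for w in sentence.split() if w in vocab]
        if not ids:
            return 0.0
        return float(np.mean(np.linalg.norm(embeddings[ids], axis=1)))

    corpus = ["the cat sat on the mat", "translation the cat", "the the the"]
    curriculum = sorted(corpus, key=difficulty)  # easy-to-hard ordering
    for s in curriculum:
        print(f"{difficulty(s):.3f}  {s}")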


Issue #88 – Multilingual Denoising Pre-training for Neural Machine Translation

02 Jul 2020 | Author: Dr. Chao-Hong Liu, Machine Translation Scientist @ Iconic

Pre-training has been used in many natural language processing (NLP) tasks with significant improvements in performance. In neural machine translation (NMT), pre-training is mostly applied to building blocks of the whole system, e.g. the encoder or decoder. In a previous post (Issue #70), we compared several approaches using pre-training with masked language models. In this post, we take a closer […]
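To illustrate what a denoising objective looks like in its simplest form, here is a toy sketch that corrupts a sentence by masking one random span; a model would then be trained to reconstruct the original from the corrupted input. This is a simplified, assumption-based illustration, not the exact noising recipe of the paper discussed in the post.

    # Toy denoising sketch: mask a random span, keep the original as the target.
    import random

    MASK = "<mask>"

    def mask_span(tokens, max_span=3, rng=random):
        """Replace one random contiguous span of tokens with a single mask token."""
        if not tokens:
            return tokens
        span_len = rng.randint(1, min(max_span, len(tokens)))
        start = rng.randint(0, len(tokens) - span_len)
        return tokens[:start] + [MASK] + tokens[start + span_len:]

    random.seed(0)
    source = "the quick brown fox jumps over the lazy dog".split()
    noisy = mask_span(source)
    print("input :", " ".join(noisy))   # corrupted sentence fed to the encoder
    print("target:", " ".join(source))  # original sentence the decoder must produce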
