Issue #87 – YiSi – A Unified Semantic MT Quality Evaluation and Estimation Metric

25 Jun 2020
Author: Dr. Karin Sim, Machine Translation Scientist @ Iconic

Introduction: Automatic evaluation is an issue that has long troubled machine translation (MT): how do we evaluate how good the MT output is? Traditionally, BLEU has been the “go to”, as it is simple to use across language pairs. However, it is overly simplistic, evaluating string matches to a single reference translation. More sophisticated metrics […]

Issue #86 – Neural MT with Levenshtein Transformer

18 Jun 2020
Author: Dr. Patrik Lambert, Senior Machine Translation Scientist @ Iconic

Introduction: The standard Transformer model is autoregressive, meaning that the prediction of each target word is based on the predictions for the previous words. The output is generated from left to right, with no chance to revise a past decision and without considering future predictions of the words on the right of the current word. In a recent post (#82), […]

Issue #85 – Applying Terminology Constraints in Neural MT

11 Jun 2020
Author: Dr. Chao-Hong Liu, Machine Translation Scientist @ Iconic

Introduction: Maintaining consistency of terminology translation in Neural Machine Translation (NMT) is a more challenging task than in Statistical MT (SMT). In this post, we review a method proposed by Dinu et al. (2019) to train NMT to use custom terminology.

Translation with Terminology Constraints: Applying terminology constraints to translation may appear to be an easy task. It is a […]

Issue #84 – Are Neural Machine Translation Systems Good Estimators of Quality?

04 Jun 2020
Author: Prof. Lucia Specia, Professor of Natural Language Processing, Imperial College London (also affiliated with ADAPT/Dublin City University and the University of Sheffield)

This week, we are delighted to have a guest post from Prof. Lucia Specia of Imperial College London, and latterly the University of Sheffield and our own alma mater, Dublin City University. Prof. Specia is one of the world’s preeminent experts on the topic of […]

Issue #83 – Selective Attention for Context-aware Neural Machine Translation

21 May 2020
Author: Dr. Karin Sim, Machine Translation Scientist @ Iconic

Introduction: One of the next frontiers for Neural Machine Translation (NMT) is moving beyond the sentence-by-sentence translation that is currently the norm, to context-aware, document-level translation. Including extra-sentential context means that discourse elements (such as expressions referring back to previously mentioned entities) can be integrated, resulting in better translation of references, for example. Currently the engine has no […]

Issue #82 – Constrained Decoding using Levenshtein Transformer

14 May 2020
Author: Raj Patel, Machine Translation Scientist @ Iconic

Introduction: In constrained decoding, we force in-domain terminology to appear in the final translation. We have discussed constrained decoding in earlier blog posts (#7, #9, #79). In this blog post, we will discuss a simple and effective algorithm for incorporating lexical constraints in Neural Machine Translation (NMT) proposed by Susanto et al. (2020) and try to understand how it is better than […]

Issue #81 – Evaluating Human-Machine Parity in Language Translation: part 2

07 May 2020
Author: Dr. Sheila Castilho, Post-Doctoral Researcher @ ADAPT Research Centre

This is the second in a 2-part post addressing machine translation quality evaluation – an overarching topic regardless of the underlying algorithms. Following our own summary last week, this week we are delighted to have one of the paper’s authors, Dr. Sheila Castilho, give her take on the paper, their motivations for writing it, and where we […]

Issue #79 – Merging Terminology into Neural Machine Translation

23 Apr 2020
Author: Dr. Patrik Lambert, Machine Translation Scientist @ Iconic

After several years as the state of the art in Machine Translation, neural MT still doesn’t have a convenient way to enforce the translation of custom terms according to a glossary. In issue #7, we reviewed several approaches to handle terminology in neural MT. Just adding the glossary to the training data is not effective. Replacing the source term by a […]

Issue #78 – Balancing Training data for Multilingual Neural MT

16 Apr 2020
Author: Raj Patel, Machine Translation Scientist @ Iconic

Multilingual Neural MT (MNMT) can translate to/from multiple languages, but in model training we are faced with imbalanced training sets: some languages have much more training data than others. In general, we up-sample the low-resource languages to balance the representation. However, the degree of up-sampling has a large effect on the overall performance of the model. […]

Issue #77 – Neural MT with Subword Units Using BPE-Dropout

09 Apr 2020
Author: Dr. Chao-Hong Liu, Machine Translation Scientist @ Iconic

The ability to translate subword units enables machine translation (MT) systems to translate rare words that might not appear in the training data used to build MT models. Ideally, we don’t want to find these subword units (and their corresponding translated “segments”) as a preprocessing procedure; it would be much easier if we could recognise them directly, and automatically, […]
