Articles About Natural Language Processing

Issue #9 – Domain Adaptation for Neural MT

13 Sep 2018. Author: Raj Nath Patel, Machine Translation Scientist @ Iconic

While Neural MT has raised the bar in terms of the quality of general purpose machine translation, it is still limited when it comes to more intricate or technical use cases. That is where domain adaptation (the process of developing and adapting MT for specific industries, content types, and use cases) has a big part to play. In this […]

Issue #8 – Is Neural MT on par with human translation?

05 Sep 2018. Author: Dr. John Tinsley, CEO @ Iconic

The next few articles of the Neural MT Weekly will deal with the topic of quality and evaluation of machine translation. Since the advent of Neural MT, developments have moved fast, and we have seen quality expectation levels rise, in line with a number of striking proclamations about performance. Early claims of “bridging the gap between human and machine translation” […]

Issue #7 – Terminology in Neural MT

30 Aug 2018. Author: Dr. Patrik Lambert, Machine Translation Scientist @ Iconic

In many commercial MT use cases, being able to use custom terminology is a key requirement for translation accuracy. The ability to guarantee the translation of specific input words and phrases is conveniently handled in Statistical MT (SMT) frameworks such as Moses. Because SMT is performed as a sequence of distinct steps, we can interject and specify directly […]

Issue #6 – Zero-Shot Neural MT

22 Aug 2018. Author: Dr. Rohit Gupta, Sr. Machine Translation Scientist @ Iconic

As we covered in last week’s post, training a neural MT engine requires a lot of data, typically millions of sentences in both languages which are aligned at the sentence level, i.e. every sentence in the source (e.g. Spanish) has a corresponding target (e.g. English). During a typical training run, the system looks at these bilingual sentence pairs and learns from them. The […]

Issue #5 – Creating training data for Neural MT

15 Aug 2018. Author: Prof. Andy Way, Deputy Director, ADAPT Research Centre

This week, we have a guest post from Prof. Andy Way of the ADAPT Research Centre in Dublin. Andy leads a world-class team of researchers at ADAPT who are working at the very forefront of Neural MT. The post expands on the topic of training data – originally presented as one of the “6 Challenges in NMT” from Issue #4 […]

Issue #4 – Six Challenges in Neural MT

08 Aug 2018. Author: Dr. John Tinsley, CEO @ Iconic

A little over a year ago, Koehn and Knowles (2017) wrote a very appropriate paper entitled “Six Challenges in Neural Machine Translation” (in fact, there were 7 but only 6 were empirically tested). The paper set out a number of areas which, despite the rapid development of Neural MT, still needed to be addressed by its researchers and developers. The seven challenges posed at […]

Issue #3 – Improving vocabulary coverage

01 Aug 2018. Author: Raj Nath Patel, Machine Translation Scientist @ Iconic

Machine Translation typically operates with a fixed vocabulary, i.e. it knows how to translate a finite number of words. This is obviously an issue, because translation is an open vocabulary problem: we might want to translate any possible word! This is a particular issue for Neural MT, where the vocabulary needs to be limited at the beginning for technical reasons. The problem is […]

Issue #2 – Data Cleaning for Neural MT

25 Jul 2018. Author: Dr. Patrik Lambert, Machine Translation Scientist @ Iconic

“Garbage in, garbage out” – noisy data is a big problem for all machine learning tasks, and MT is no different. By noisy data, we mean bad alignments, poor translations, misspellings, and other inconsistencies in the data used to train the systems. Statistical MT systems are more robust, and can cope with up to 10% noise in the training data without […]

Issue #1 – Scaling Neural MT

18 Jul 2018. Author: Dr. Rohit Gupta, Sr. Machine Translation Scientist @ Iconic

Training a neural machine translation engine is a time-consuming task. It typically takes a number of days or even weeks, even when running on powerful GPUs. Reducing this time is a priority for any neural MT developer. In this post we explore recent work (Ott et al., 2018) in which, without compromising translation quality, the authors speed up training 4.9 times on […]
