Articles About Natural Language Processing

Issue #66 – Neural Machine Translation Strategies for Low-Resource Languages

23 Jan 2020 – This week we are pleased to welcome the newest member of our scientific team, Dr. Chao-Hong Liu. In this, his first post with us, he gives his views on two specific MT strategies, namely pivot MT and zero-shot MT. While we have covered these topics in previous ‘Neural MT Weekly’ blog posts (Issue #54, Issue #40), these are topics that Chao-Hong worked on recently, prior to joining […]
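Pivot MT chains two better-resourced engines through an intermediate language, while zero-shot MT asks a single multilingual model to translate a pair it never saw in training. Below is a minimal sketch of the pivot idea; the dummy model class and the Irish→English→Swahili pairing are illustrative assumptions, not details from the post.

```python
# Hedged sketch of pivot translation for a low-resource pair: chain two
# higher-resource models through a pivot language (commonly English).
# `DummyModel` is a stand-in for a real NMT engine.

class DummyModel:
    def __init__(self, src, tgt):
        self.src, self.tgt = src, tgt

    def translate(self, text):
        # A real engine would decode here; we just tag the text.
        return f"[{self.src}->{self.tgt}] {text}"

def pivot_translate(text, src="ga", pivot="en", tgt="sw"):
    first = DummyModel(src, pivot).translate(text)   # source -> pivot
    return DummyModel(pivot, tgt).translate(first)   # pivot -> target

print(pivot_translate("Dia dhuit"))
# [en->sw] [ga->en] Dia dhuit
```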

Read more

Issue #64 – Neural Machine Translation with Byte-Level Subwords

13 Dec 2019 – Author: Dr. Patrik Lambert, Machine Translation Scientist @ Iconic. In order to limit vocabulary size, most neural machine translation engines are based on subwords. In some settings, character-based systems are even better (see Issue #60). However, rare characters in noisy data or in character-based languages can unnecessarily take up vocabulary slots and limit the vocabulary's compactness. In this post we take a look at an alternative, proposed by Wang et al. (2019), […]
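The core idea is to operate on UTF-8 bytes rather than characters, so the base vocabulary never exceeds 256 symbols and a rare character costs at most a few common byte tokens. A tiny sketch of that base representation (the BPE merge learning itself is omitted):

```python
# Sketch of the idea behind byte-level subwords: operate on UTF-8
# bytes, so the base vocabulary is at most 256 symbols and rare
# characters cannot claim dedicated vocabulary slots. BPE merges
# would then be learned over these byte sequences.

def to_byte_tokens(text: str) -> list[int]:
    """Map text to its UTF-8 byte IDs (0-255)."""
    return list(text.encode("utf-8"))

print(to_byte_tokens("hi"))   # [104, 105]
print(to_byte_tokens("日"))   # [230, 151, 165] -- one rare char, three common bytes
```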

Read more

Issue #63 – Neuron Interaction Based Representation Composition for Neural Machine Translation

05 Dec 2019 – Author: Dr. Rohit Gupta, Sr. Machine Translation Scientist @ Iconic. Transformer models are the state of the art in Neural Machine Translation. In this blog post, we will take a look at a recently proposed approach by Li et al. (2019) which further improves upon the Transformer model by modelling more neuron interactions. Li et al. (2019) claim that their approach produces better encoder representations and captures semantic […]
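As a rough, hedged illustration of second-order composition (a generic low-rank bilinear form, not the paper's exact model), suppose we stack the layer outputs for one position and augment their average with neuron-interaction features:

```python
import numpy as np

# Hedged sketch of bilinear pooling over stacked layer outputs, the
# kind of second-order composition this line of work builds on.
# All sizes are toy values chosen for illustration.

rng = np.random.default_rng(0)
L, d, k = 6, 8, 16                 # layers, hidden size, pooling rank

layers = rng.normal(size=(L, d))   # one position's representation per layer
summary = layers.mean(axis=0)      # first-order composition (averaging)

U = rng.normal(size=(d, k))
V = rng.normal(size=(d, k))
# Second-order term: pairwise neuron interactions captured by a
# low-rank bilinear product of two projections.
bilinear = (summary @ U) * (summary @ V)   # shape (k,)

composed = np.concatenate([summary, bilinear])
print(composed.shape)   # (d + k,) -> fed onward in place of a single layer
```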

Read more

Issue #62 – Domain Differential Adaptation for Neural MT

28 Nov 2019 – Author: Raj Patel, Machine Translation Scientist @ Iconic. Neural MT models are data-hungry and domain-sensitive, and it is nearly impossible to obtain a good amount (>1M segments) of training data for every domain we are interested in. One common strategy is to align the statistics of the source and target domains, but the drawback of this approach is that the statistics of the different domains are inherently […]
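One way to picture the "differential" intuition: rather than forcing domains to share statistics, shift the MT model's next-token distribution by the difference between an in-domain and an out-of-domain language model. A toy sketch with made-up probabilities, not the paper's exact formulation:

```python
import numpy as np

# Hedged sketch: adapt an MT distribution with the *difference*
# between an in-domain and an out-of-domain LM, in log space.
# All probabilities below are invented toy numbers.

vocab = ["the", "patient", "contract"]
log_p_nmt = np.log(np.array([0.5, 0.2, 0.3]))   # general MT model
log_p_in  = np.log(np.array([0.3, 0.6, 0.1]))   # in-domain LM (medical)
log_p_out = np.log(np.array([0.4, 0.1, 0.5]))   # out-of-domain LM

lam = 1.0
adapted = log_p_nmt + lam * (log_p_in - log_p_out)
adapted -= np.log(np.exp(adapted).sum())         # renormalise

for w, lp in zip(vocab, adapted):
    print(f"{w:10s} {np.exp(lp):.3f}")            # "patient" is boosted
```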

Read more

Issue #61 – Context-Aware Monolingual Repair for Neural Machine Translation

21 Nov 2019 – Author: Dr. Rohit Gupta, Sr. Machine Translation Scientist @ Iconic. In Issue #15 and Issue #39 we looked at various approaches to document-level translation. In this blog post, we will look at another approach, proposed by Voita et al. (2019a), to capture context information. This approach is unique in that it utilizes only target monolingual data to improve discourse phenomena (deixis, ellipsis, lexical cohesion, ambiguity, […]
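In outline, the pipeline translates sentences independently and then has a monolingual "repair" model rewrite each group of consecutive sentences for consistency. A hedged sketch with hypothetical model objects:

```python
# Hedged sketch of a repair-style pipeline in the spirit of Voita et
# al. (2019a): translate sentences independently, then let a
# target-side monolingual "repair" model rewrite each group so that
# deixis, ellipsis and lexical choice agree across sentences.
# `sent_mt` and `repair_model` are hypothetical stand-ins for trained
# models with `.translate()` / `.rewrite()` methods.

def translate_document(sentences, sent_mt, repair_model, group_size=4):
    # 1) Context-agnostic sentence-level translation.
    drafts = [sent_mt.translate(s) for s in sentences]

    # 2) Monolingual repair over consecutive sentence groups. The
    #    repair model needs only target monolingual data to train
    #    (inconsistent inputs can be simulated by round-trip MT).
    repaired = []
    for i in range(0, len(drafts), group_size):
        group = " _eos ".join(drafts[i:i + group_size])
        repaired.extend(repair_model.rewrite(group).split(" _eos "))
    return repaired
```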

Read more

Issue #60 – Character-based Neural Machine Translation with Transformers

14 Nov 2019 – Author: Dr. Patrik Lambert, Machine Translation Scientist @ Iconic. We saw in Issue #12 of this blog how character-based recurrent neural networks (RNNs) could outperform (sub)word-based models if the network is deep enough. However, character sequences are much longer than subword ones, which is not easy to deal with in RNNs. In this post, we discuss how the Transformer architecture changes the situation for character-based models. We take a […]
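To see why length matters, compare the same sentence as subwords and as characters; the subword split below is illustrative, not a trained BPE segmentation:

```python
# Quick illustration of the length blow-up at character level: the
# same sentence is several times longer as a character sequence, and
# self-attention cost grows quadratically with sequence length.

sentence = "translation quality improves with depth"
subwords = ["trans@@", "lation", "quality", "improves", "with", "depth"]
chars = list(sentence)

print(len(subwords), "subword tokens")   # 6
print(len(chars), "character tokens")    # 39
print(f"attention cost ratio ~ {(len(chars) / len(subwords)) ** 2:.0f}x")
```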

Read more

Issue #58 – Quantisation of Neural Machine Translation models

31 Oct 2019 – Author: Dr. Patrik Lambert, Machine Translation Scientist @ Iconic. When large amounts of training data are available, the quality of Neural MT engines increases with the size of the model. However, larger models imply decoding with more parameters, which makes the engine slower at test time. Improving the trade-off between model compactness and translation quality is an active research topic. One of the ways to achieve more compact models […]
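Quantisation is one such route: store weights as 8-bit integers plus a scale factor. A minimal post-training sketch (real engines typically quantise per layer or per row, and handle activations as well):

```python
import numpy as np

# Minimal sketch of post-training 8-bit quantisation of a weight
# matrix: store int8 values plus one float scale, and dequantise on
# the fly at decoding time.

rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=(4, 4)).astype(np.float32)

scale = np.abs(w).max() / 127.0          # map max |weight| to int8 range
q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
w_hat = q.astype(np.float32) * scale     # dequantised weights

print("memory: 4 bytes -> 1 byte per weight (plus one scale)")
print("max abs error:", np.abs(w - w_hat).max())
```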

Read more

Issue #57 – Simple and Effective Noisy Channel Modeling for Neural MT

24 Oct 2019 – Author: Dr. Rohit Gupta, Sr. Machine Translation Scientist @ Iconic. Neural MT is widely used today, and the results are undeniably better than those of the statistical machine translation (SMT) systems used earlier. One of the core components of an SMT system was the language model. In this post, we will look at how we can benefit from a language model in Neural MT, too. In particular, we will […]
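In the noisy channel formulation, an n-best list from the direct model p(y|x) is rescored with a channel model p(x|y) and a language model p(y). A toy reranking sketch with made-up scores:

```python
# Hedged sketch of noisy channel reranking for NMT: combine the direct
# model p(y|x) with a channel model p(x|y) and a language model p(y)
# to rescore an n-best list. All scores are invented toy numbers.

lam_ch, lam_lm = 1.0, 0.3

# (candidate, log p(y|x), log p(x|y), log p(y))
nbest = [
    ("he closed the bank account", -2.1, -2.5, -3.0),
    ("he closed the river bank",   -1.9, -4.0, -3.4),
]

def channel_score(direct, channel, lm):
    return direct + lam_ch * channel + lam_lm * lm

best = max(nbest, key=lambda c: channel_score(*c[1:]))
print(best[0])   # the channel + LM terms outvote the direct model
```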

Read more

Issue #56 – Scalable Adaptation for Neural Machine Translation

17 Oct 2019 – Author: Raj Patel, Machine Translation Scientist @ Iconic. Although current research has explored numerous approaches for adapting Neural MT engines to different languages and domains, fine-tuning remains the most common. In fine-tuning, the parameters of a pre-trained model are updated for the target language or domain in question. However, fine-tuning requires training and maintaining a separate model for each target task (i.e. a separate MT engine for every […]
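A lighter-weight alternative discussed in this line of work is to freeze the pre-trained model and inject small residual adapter layers per language or domain. A hedged PyTorch sketch; the dimensions and domain names are illustrative:

```python
import torch
import torch.nn as nn

# Hedged sketch of a residual adapter: a tiny bottleneck layer
# injected into a frozen pre-trained transformer, so each language or
# domain adds only a few parameters instead of a full fine-tuned copy.

class Adapter(nn.Module):
    def __init__(self, d_model=512, bottleneck=64):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.down = nn.Linear(d_model, bottleneck)   # project down
        self.up = nn.Linear(bottleneck, d_model)     # project back up

    def forward(self, x):
        return x + self.up(torch.relu(self.down(self.norm(x))))

# One small adapter per target domain; the shared base model stays frozen.
adapters = {"legal": Adapter(), "medical": Adapter()}
h = torch.randn(2, 10, 512)   # (batch, seq, hidden) from a frozen layer
out = adapters["legal"](h)    # domain-specific path
print(out.shape)              # torch.Size([2, 10, 512])
```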

Read more

Issue #55 – Word Alignment from Neural Machine Translation

10 Oct 2019 – Author: Dr. Patrik Lambert, Machine Translation Scientist @ Iconic. Word alignments were the cornerstone of all previous approaches to statistical MT: you take your parallel corpus, align the words, and build from there. In Neural MT, however, word alignment is no longer needed as an input to the system. That being said, research is coming back around to the idea that it remains useful in real-world practical scenarios for […]
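One simple way to recover alignments from a neural model is to read them off the cross-attention weights, aligning each target word to its highest-attention source word. A hand-made illustration; raw-attention argmax is a rough baseline, not the best-performing method:

```python
import numpy as np

# Minimal sketch of extracting word alignments from an NMT model's
# cross-attention. The attention matrix here is hand-made for
# illustration; in practice it would come from the trained model.

src = ["das", "haus", "ist", "klein"]
tgt = ["the", "house", "is", "small"]

attn = np.array([            # rows: target words, cols: source words
    [0.70, 0.10, 0.10, 0.10],
    [0.10, 0.80, 0.05, 0.05],
    [0.10, 0.10, 0.70, 0.10],
    [0.05, 0.05, 0.10, 0.80],
])

for t, row in zip(tgt, attn):
    print(f"{t} -> {src[row.argmax()]}")   # the -> das, house -> haus, ...
```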

Read more