Issue #136 – Neural Machine Translation without Embeddings

28 Jun 2021
Author: Dr. Jingyi Han, Machine Translation Scientist @ Language Weaver

Introduction

Nowadays, Byte Pair Encoding (BPE) has become one of the most commonly used tokenization strategies, thanks to its universality and its effectiveness in handling rare words. Although much previous work shows that subword models with embedding layers generally achieve more stable and competitive results in neural machine translation (NMT), character-based (see issue #60) and byte-based subword (see issue #64) […]
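The merge-learning procedure behind BPE is compact enough to sketch. The snippet below is a minimal illustration of the standard algorithm, not code from the issue; the toy corpus, the number of merges, and the function name are assumptions chosen for the example.

```python
# Minimal sketch of BPE merge learning (illustrative, not the issue's code).
from collections import Counter

def learn_bpe(corpus, num_merges):
    """Learn BPE merge rules from a whitespace-tokenized corpus."""
    # Represent each word as a tuple of characters plus an end-of-word marker.
    vocab = Counter()
    for word in corpus.split():
        vocab[tuple(word) + ("</w>",)] += 1

    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent symbol pair occurs across the corpus.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Merge the most frequent pair into a single symbol everywhere.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges

print(learn_bpe("low lower lowest new newer", 5))
```

Because the merges are learned purely from co-occurrence frequencies, the same procedure applies to any language or script, which is what the excerpt means by the universality of BPE; a byte-based variant simply starts from the 256 byte values instead of characters.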


Issue #7 – Terminology in Neural MT

30 Aug 2018
Author: Dr. Patrik Lambert, Machine Translation Scientist @ Iconic

In many commercial MT use cases, support for custom terminology is a key requirement for translation accuracy. The ability to guarantee the translation of specific input words and phrases is conveniently handled in Statistical MT (SMT) frameworks such as Moses. Because SMT is performed as a sequence of distinct steps, we can interject and specify directly […]
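For NMT, which lacks such distinct steps to interject into, one common workaround is placeholder substitution: replace the source term with a placeholder token before translation and substitute the required target term back afterwards. The sketch below assumes a model that copies placeholder tokens through unchanged; the term dictionary, placeholder format, and the stand-in model output are illustrative assumptions, not the method described in the issue.

```python
# Minimal sketch of placeholder-based terminology injection (illustrative).
import re

TERMS = {"bank": "banque"}  # source term -> required target translation

def inject_placeholders(src, terms):
    """Replace each dictionary term in the source with a numbered placeholder."""
    mapping = {}
    for i, (term, target) in enumerate(terms.items()):
        ph = f"__TERM{i}__"
        pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
        if pattern.search(src):
            src = pattern.sub(ph, src)
            mapping[ph] = target
    return src, mapping

def restore_placeholders(hyp, mapping):
    """Substitute the required target terms back into the MT output."""
    for ph, target in mapping.items():
        hyp = hyp.replace(ph, target)
    return hyp

src, mapping = inject_placeholders("The bank approved the loan.", TERMS)
# Stand-in for model output; a real NMT system would translate `src` here.
hyp = "La __TERM0__ a approuvé le prêt."
print(restore_placeholders(hyp, mapping))  # -> La banque a approuvé le prêt.
```

The design trade-off is that the model never sees the real term, so agreement and inflection around the placeholder can suffer; constrained decoding avoids this at the cost of slower inference.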
