Issue #15 – Document-Level Neural MT

01 Nov 2018

Author: Dr. Rohit Gupta, Sr. Machine Translation Scientist @ Iconic

In this week’s post, we take a look at document-level neural machine translation. Most, if not all, existing approaches to machine translation operate at the sentence level. That is to say, when a document is translated, it is actually split up into individual sentences or segments, which are processed independently of each other. With document-level Neural MT, as the name suggests, we go beyond sentence-level translation and take some of the surrounding sentences and context into account when translating any particular sentence.
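
To make the contrast concrete, here is a minimal sketch of the two ways of preparing input for a translation system. It is illustrative only: the function names, the naive regex-based sentence splitter, and the `<sep>` separator token are assumptions made for this example, and prefixing each sentence with its preceding sentences is just one simple way to supply context, not necessarily the approach any particular system uses.

```python
import re


def split_sentences(document):
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", document.strip()) if s]


def sentence_level_inputs(sentences):
    """Sentence-level MT: every sentence is a separate, context-free input."""
    return list(sentences)


def context_window_inputs(sentences, n_prev=1, sep=" <sep> "):
    """Context-aware input: prefix each sentence with up to n_prev
    preceding sentences, joined by a hypothetical separator token."""
    inputs = []
    for i, current in enumerate(sentences):
        context = sentences[max(0, i - n_prev):i]
        inputs.append(sep.join(context + [current]))
    return inputs


doc = "The cat sat on the mat. It was asleep. Nobody moved it."
sents = split_sentences(doc)

for line in sentence_level_inputs(sents):
    print("sentence-level:", line)

for line in context_window_inputs(sents, n_prev=1):
    print("with context:  ", line)
```

In the sentence-level case, "It was asleep." reaches the system with no indication of what "It" refers to; in the context-window case, the preceding sentence travels with it.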

Why do we need document-level MT?

The benefits of document-level translation are clear. If we can take the whole document into account, as human translators do, we get better coherence, cohesion, consistency and terminology selection. Currently, even when we send a whole document or paragraph to an MT system, it is internally split into individual sentences, which are translated independently. When we translate sentences without looking at the surrounding text, we lose contextual information and, with it, the ability to resolve ambiguous cases. For example, the same pronoun in the source language