Machine Translation Weekly 73: Non-autoregressive MT with Latent Codes

Today, I will comment on a paper on non-autoregressive machine translation that shows a neat trick for increasing output fluency. The paper, titled Non-Autoregressive Translation by Learning Target Categorical Codes, has authors from several Chinese private and public institutions and will appear at this year's NAACL conference.

Unlike standard, so-called autoregressive encoder-decoder architectures that decode the output sequentially (and in theory in linear time), non-autoregressive models generate all outputs in parallel (and in theory in constant time, regardless of the input length). This leads to significant speedups, but typically at the expense of output fluency and overall translation quality. The output tokens are modeled as conditionally independent, so the tokens on the right are not aware of what was decoded before them, which can lead to inconsistencies.
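
To make the difference concrete, here is the standard factorization of the two decoding schemes (my notation, not taken from the paper): an autoregressive decoder conditions each target token on all previously generated tokens, while a non-autoregressive decoder conditions each token only on the source sentence.

$$p_{\text{AR}}(y \mid x) = \prod_{t=1}^{T} p(y_t \mid y_{<t}, x), \qquad p_{\text{NAR}}(y \mid x) = \prod_{t=1}^{T} p(y_t \mid x)$$

Because the second factorization drops the $y_{<t}$ term, all $T$ tokens can be predicted in a single parallel step, but nothing in the model ties neighboring output tokens together, which is exactly where the fluency problems come from.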
