Highlights from Machine Translation and Multilinguality in February 2023

There were plenty of interesting pre-prints on arXiv in February. Here is a
brief summary of three that I think are cool but could get lost in the hundreds
of papers that went public.

The unreasonable effectiveness of few-shot learning for machine translation

Folks from Google experimented with few-shot MT based on language models.
Instead of using one of the cool huge language models we all know, they trained
their own smaller ones: specific bi- and tri-lingual LMs with 8B parameters
(for comparison, BERT has 110M, GPT-2 1.5B, GPT-3 175B). At inference time,
they retrieve 5 random examples from the training set and use them as a prompt
for the model. It works better than Google Translate and is comparable to the
best WMT submissions. However, it is hard to
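The prompting setup is simple enough to sketch: sample a handful of parallel
sentence pairs from the training data and prepend them to the source sentence.
A minimal sketch in Python, assuming an English-German pair and a plain
"English: … / German: …" template (the exact template the paper uses is an
assumption here, not taken from the pre-print):

```python
import random

def build_prompt(train_pairs, source_sentence, n_shots=5, seed=0):
    """Build a few-shot translation prompt from randomly sampled
    parallel examples. The prompt template below is a hypothetical
    illustration, not the paper's actual format."""
    rng = random.Random(seed)
    shots = rng.sample(train_pairs, n_shots)
    lines = []
    for src, tgt in shots:
        lines.append(f"English: {src}")
        lines.append(f"German: {tgt}")
    # The model is expected to continue after the final "German:" line.
    lines.append(f"English: {source_sentence}")
    lines.append("German:")
    return "\n".join(lines)

# Toy stand-in for the training set.
train_pairs = [
    ("Good morning.", "Guten Morgen."),
    ("Thank you.", "Danke."),
    ("How are you?", "Wie geht es dir?"),
    ("See you tomorrow.", "Bis morgen."),
    ("I like coffee.", "Ich mag Kaffee."),
    ("The weather is nice.", "Das Wetter ist schön."),
]
prompt = build_prompt(train_pairs, "Where is the station?")
print(prompt)
```

The prompt string would then be fed to the LM, which continues the pattern and
emits the translation after the last "German:" line.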
