Machine Translation Weekly 95: Minimum Bayes Risk Decoding – the Cooler the Metric, the Cooler it gets

This week I am returning to a topic that I follow with fascination (cf. MT Weekly #20, #61, #63, and #66) without actually doing any research myself – decoding in machine learning models. The preprint I will discuss today comes from Google Research and has the title Minimum Bayes Risk Decoding with Neural Metrics of Translation Quality. It shows that Minimum Bayes Risk (MBR) decoding can outperform beam search when done properly and that there might be some serious problems […]
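For readers who have not met MBR decoding before, here is a minimal sketch of the general idea (my own illustration, not the setup from the paper): instead of searching for the single highest-probability output, we sample a pool of candidate translations and pick the one with the highest expected utility, treating every other sample as a pseudo-reference. The snippet below uses sentence-level chrF from sacrebleu as a stand-in for the neural metrics the paper actually plugs in; the `mbr_decode` helper and the example sentences are purely hypothetical.

```python
# Minimal MBR decoding sketch (assumption: sacrebleu is installed and the
# candidate list was sampled from an MT model beforehand).
from sacrebleu.metrics import CHRF

chrf = CHRF()  # stand-in utility; the paper uses neural metrics instead

def mbr_decode(candidates):
    """Pick the candidate with the highest average utility, using every
    other sample as a pseudo-reference (uniform weights over the pool)."""
    best, best_score = None, float("-inf")
    for hyp in candidates:
        others = [ref for ref in candidates if ref is not hyp]
        score = sum(
            chrf.sentence_score(hyp, [ref]).score for ref in others
        ) / max(len(others), 1)
        if score > best_score:
            best, best_score = hyp, score
    return best

# Hypothetical sampled outputs for one source sentence.
samples = [
    "the cat sat on the mat",
    "a cat sits on the mat",
    "the cat is on a mat",
]
print(mbr_decode(samples))
```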

Read more

Machine Translation Weekly 94: Notes from WMT 2021

After the notes from EMNLP 2021, here is also an unsorted list of some observations from the Conference on Machine Translation. Facebook AI won in many translation directions (though by no means in all of them) in the news task with a multilingual system. At the panel discussion about MT evaluation, Hermann Ney expressed a controversial opinion: it does not matter what metric we use, the history of MT would be the same with any metric (that at least slightly correlates […]

Read more

Machine Translation Weekly 93: Notes from EMNLP 2021

Another big NLP conference is over and here are my notes about the papers that I liked the most. My general impression was sort of similar to what I got from ACL this year. It seems to me that the field is progressing towards some behavioral understanding of what the neural models do, which allows doing some cool tricks that were hardly possible to think of only a few years ago. Excellent examples are tricks with adapters or non-parametric […]

Read more

Machine Translation Weekly 92: Multilingual Machine Translation with Plug-and-Play Embeddings

Deep learning models are prone to so-called catastrophic forgetting when finetuned on slightly different data than they were originally trained on. Often, they also generalize badly when confronted with data that do not look exactly like the data they were trained on. On the other hand, there are more and more tricks for simply reusing a part of a model that was trained for something else, and it just works. I could hardly believe it when Tom Kocmi and Ondřej […]

Read more

Machine Translation Weekly 91: Zero-Shot Machine Translation with a Universal Encoder from Pre-trained Representations

How many times have you heard someone saying that multilingual BERT or similar models could be used as a universal encoder in machine translation? I heard that (and said that) many times, but never heard of anyone who actually did it, until now. Folks from The University of Hong Kong, Microsoft Research, Shanghai University, and Texas A&M University published their preprint on this topic last Thursday on arXiv. The title of the paper is Towards Making the Most of Multilingual […]

Read more

Machine Translation Weekly 90: The Surprising Multilinguality of Large Language Models

This week, I am going to share my amazement and doubts about what could be called the surprising multilinguality of large language models. By large language models, I mean the really large ones that I can hardly run myself, trained on huge, hardly curated data and thus harbouring the worst societal demons, but also having many fascinating properties. Here, I would like to feature three papers that make me think about the properties of the models. 1. Finetuning to other […]

Read more

Machine Translation Weekly 89: BPE and Memorization

Similar to last week, I will discuss a paper about input segmentation. The paper is not directly about machine translation or multilinguality, but it brings interesting insights for Transformer models in general. The title of the paper is How BPE Affects Memorization in Transformers; its authors are from Facebook AI, and the preprint appeared on arXiv on Thursday. The paper presents a series of experiments with Transformer models for natural language inference and different sizes of BPE-based vocabulary, by which they […]
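To make concrete what a "BPE-based vocabulary" of a given size means, here is a toy byte-pair-encoding trainer in the spirit of the classic Sennrich et al. recipe (my own sketch, not the tokenizer used in the paper; the word counts are made up): each merge joins the currently most frequent pair of adjacent symbols, so the number of merges directly controls the vocabulary size that the experiments vary.

```python
# Toy BPE trainer: learn merge operations from word frequencies.
from collections import Counter

def learn_bpe(word_freqs, num_merges):
    # Represent each word as a tuple of characters plus an end-of-word marker.
    vocab = {tuple(word) + ("</w>",): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Apply the winning merge everywhere before counting again.
        new_vocab = {}
        for symbols, freq in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] = freq
        vocab = new_vocab
    return merges

# More merges -> larger subword vocabulary (hypothetical word counts).
print(learn_bpe({"low": 5, "lower": 2, "newest": 6, "widest": 3}, num_merges=10))
```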

Read more

Machine Translation Weekly 88: Text Segmentation and Multilinguality

With the start of the semester, it is also time to revive MT Weekly. My new year’s resolution was to make it to 100 issues, so let’s see if I can keep it. Today, I will talk about a paper by my colleagues from LMU Munich that will appear in the Findings of EMNLP 2021 and deals with a perpetual problem of NLP – input text segmentation. The title of the paper is Wine is Not v i n. On the Compatibility […]

Read more

Machine Translation Weekly 87: Notes from ACL 2021

The story of the science fiction novel Roadside Picnic by Arkady and Boris Strugatsky (mostly known via Tarkovsky’s 1979 film Stalker) takes place after an extraterrestrial event called the Visitation. Some aliens stopped by, made a roadside picnic, and left behind plenty of weird and dangerous objects having features that contemporary science cannot explain. Although the UN tries to prevent people from entering the visitation zones before everything gets fully explored and explained, objects from the zone are traded on […]

Read more

Machine Translation Weekly 86: The Wisdom of the WMT Crowd

Most of the papers that I comment on and review here present novel and cool ideas on how to improve something in machine translation or multilingual NLP. The WMT submissions, on the other hand, are different. People want to get the best translation quality and they value efficiency and simplicity; novelty and prettiness of the ideas are secondary. WMT organizes annual competitions in machine translation quality (and other tasks related to machine translation) in which dozens of companies and universities participate. Each […]

Read more