Machine Translation Weekly 91: Zero-Shot Machine Translation with a Universal Encoder from Pre-trained Representations

How many times have you heard someone say that multilingual BERT or similar models could be used as a universal encoder in machine translation? I heard that (and said that) many times, but never heard about someone who actually did that, until now. Folks from The University of Hong Kong, Microsoft Research, Shanghai University, and Texas A&M University published their preprint on this topic last Thursday on arXiv. The title of the paper is Towards Making the Most of Multilingual […]

Read more

Machine Translation Weekly 90: The Surprising Multilinguality of Large Language Models

This week, I am going to share my amazement and doubts about what could be called the surprising multilinguality of large language models. By large language models, I mean the really large ones that I can hardly run myself, trained on huge, hardly curated data and thus harbouring the worst societal demons, but also having many fascinating properties. Here, I would like to feature three papers that make me think about the properties of the models. 1. Finetuning to other […]

Read more

Machine Translation Weekly 89: BPE and Memorization

Similar to last week, I will discuss a paper about input segmentation. The paper is not directly about machine translation or multilinguality but brings interesting insights for Transformer models in general. The title of the paper is How BPE affects memorization in Transformers; it has authors from Facebook AI, and the preprint appeared on Thursday on arXiv. The paper presents a series of experiments with Transformer models for natural language inference and different sizes of BPE-based vocabulary by which they […]

Read more

Machine Translation Weekly 88: Text Segmentation and Multilinguality

With the semester start, it is also time to renew MT Weekly. My new year’s resolution was to make it to 100 issues, so let’s see if I can keep it. Today, I will talk about a paper by my colleagues from LMU Munich that will appear in the Findings of EMNLP 2021 which deals with a perpetual problem of NLP – input text segmentation. The title of the paper is Wine is Not v i n. On the Compatibility […]

Read more

Machine Translation Weekly 87: Notes from ACL 2021

The story of the science fiction novel Roadside Picnic by Arkady and Boris Strugatsky (mostly known via Tarkovsky’s 1979 film Stalker) takes place after an extraterrestrial event called the Visitation. Some aliens stopped by, made a roadside picnic, and left behind plenty of weird and dangerous objects having features that contemporary science cannot explain. Although the UN tries to prevent people from entering the visitation zones before everything gets fully explored and explained, objects from the zone are traded on […]

Read more

Machine Translation Weekly 86: The Wisdom of the WMT Crowd

Most of the papers that I comment on and review here present novel and cool ideas on how to improve something in machine translation or multilingual NLP. The WMT submissions, on the other hand, are different. People want to get the best translation quality, and they value efficiency and simplicity. Novelty and prettiness of the ideas are secondary. WMT organizes annual competitions in machine translation quality (and other tasks related to machine translation) where dozens of companies and universities participate. Each […]

Read more

Machine Translation Weekly 85: The Incredibility of MT Evaluation

This week, I will comment on a paper that quantifies and exactly measures the dimensions of the elephant in the room of machine translation: the lack of empirical support of claims in research papers on machine translation. The title of the paper is Scientific Credibility of Machine Translation Research: A Meta-Evaluation of 769 Papers; it has authors from NICT in Japan and was awarded as an outstanding paper at ACL 2021. The authors manually annotated an […]

Read more

Machine Translation Weekly 84: Order Agnostic Cross-Entropy

I tend to be a little biased against autoregressive models. The way they operate (say exactly one subword, think for a while, and then say again exactly one subword) just does not sound natural to me. Moreover, with current models, a subword can be anything from a single character to a word as long as “Ausgußreiniger”. Non-autoregressive models generate everything in a single step. That does not seem to be really natural either, but at least they offer an interesting alternative. […]

Read more

Machine Translation Weekly 83: On Language Identity and Zero-Shot Transfer

This week I will comment on two papers on zero-shot cross-lingual model transfer which do not focus on the representation quality but on the transfer itself. The title of the first one is Language Embeddings for Typology and Cross-lingual Transfer Learning and has authors from UC Davis. The second is Syntax-augmented Multilingual BERT for Cross-lingual Transfer and has authors from UCLA and Facebook AI. Both papers will appear at this year’s ACL. Just a reminder, zero-shot model transfer means […]

Read more

Machine Translation Weekly 82: Multimodal Translation and the Visual Context

This week I am going to discuss (and criticize) a paper on multimodal machine translation that attempts to once again evaluate if and how the visual information could be useful in machine translation. The title of the paper is Good for Misconceived Reasons: An Empirical Revisiting on the Need for Visual Context in Multimodal Machine Translation, it has authors from several institutions in China and Hong Kong and will appear at this year’s ACL. Multimodal machine translation (also, a topic […]

Read more