Machine Translation Weekly 66: Means against ends of sentences

This week I am going to revisit the mystery of decoding in neural machine translation one more time. It has been more than a year since Felix Stahlberg and Bill Byrne discovered a very disturbing feature of neural machine translation models – that the most probable target sentence is an empty sequence and that it is a sort of luck that we decode good translations from the models (MT Weekly 20). The paper disproved the narrative of NMT […]
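To make the underlying length bias concrete, here is a tiny self-contained sketch with made-up numbers (my own toy illustration, not the paper's experiment): because every generated token multiplies in a probability smaller than one, an unnormalized sequence score can prefer stopping immediately over producing a perfectly reasonable translation.

```python
# Toy illustration (invented numbers): why an "empty" hypothesis can outscore
# a full translation under an unnormalized sequence log-probability.
import math

def sequence_logprob(token_probs):
    """Sum of per-token log-probabilities, including the end-of-sequence token."""
    return sum(math.log(p) for p in token_probs)

# A reasonably confident 10-token translation: 0.7 per token, then EOS with 0.8.
full_translation = [0.7] * 10 + [0.8]
# The empty hypothesis emits only EOS; 0.05 is a made-up but plausible value.
empty_hypothesis = [0.05]

print(sequence_logprob(full_translation))  # ≈ -3.79
print(sequence_logprob(empty_hypothesis))  # ≈ -3.00, i.e., the better score
```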

Read more

Machine Translation Weekly 65: Sequence-to-sequence models and substitution ciphers

Today, I am going to talk about a recent pre-print on sequence-to-sequence models for deciphering substitution ciphers. Doing such a thing had been somewhere at the bottom of my to-do list for a few years; I suggested it as a thesis topic to several master's students and no one wanted to do it, so I am glad that someone finally did the experiments. The title of the pre-print is Can Sequence-to-Sequence Models Crack Substitution Ciphers? and the authors are from the […]

Read more

Machine Translation Weekly 64: Non-autoregressive Models Strike Back

Half a year ago I featured here (MT Weekly 45) a paper that questioned the contribution of non-autoregressive models to computational efficiency. It showed that a model with a deep encoder (which can be parallelized) and a shallow decoder (which works sequentially) reaches the same speed as NAR models with much better translation quality. A pre-print by Facebook AI and CMU published on New Year’s Eve, Fully Non-autoregressive Neural Machine Translation: Tricks of the Trade, presents a new fully non-autoregressive […]

Read more

Machine Translation Weekly 63: Maximum A Posteriori vs. Minimum Bayes Risk decoding

This week I will have a look at the best paper from this year’s COLING, which brings an interesting perspective on inference in NMT models. The title of the paper is “Is MAP Decoding All You Need? The Inadequacy of the Mode in Neural Machine Translation” and its authors are from the University of Amsterdam. NMT models learn the conditional probability of the next word in a target sentence given the source sentence and the previous words in the target […]
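For readers unfamiliar with the alternative the paper advocates, here is a minimal sketch of Minimum Bayes Risk decoding under my own simplifying assumptions (a toy token-overlap utility instead of the BLEU- or METEOR-style utilities used in practice): rather than picking the single highest-scoring sequence, it picks the candidate that agrees best, on average, with samples drawn from the model.

```python
# Minimal MBR decoding sketch (not the paper's implementation): choose the
# candidate with the highest expected utility against model samples.
def mbr_decode(candidates, samples, utility):
    """candidates, samples: lists of token lists; utility: similarity in [0, 1]."""
    def expected_utility(cand):
        return sum(utility(cand, s) for s in samples) / len(samples)
    return max(candidates, key=expected_utility)

# Toy utility: token-set F1 overlap, a crude stand-in for sentence-level BLEU.
def token_f1(hyp, ref):
    overlap = len(set(hyp) & set(ref))
    if overlap == 0:
        return 0.0
    p, r = overlap / len(set(hyp)), overlap / len(set(ref))
    return 2 * p * r / (p + r)

# In practice the model samples themselves often serve as the candidate list.
samples = [["a", "small", "cat"], ["a", "small", "dog"], ["a", "tiny", "cat"]]
print(mbr_decode(samples, samples, token_f1))  # picks the most "consensual" sample
```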

Read more

Machine Translation Weekly 62: The EDITOR

Papers about new models for sequence-to-sequence modeling have always been my favorite genre. This week I will talk about a model called EDITOR, introduced by authors from the University of Maryland in a pre-print of a paper that will appear in the TACL journal. The model is based on the Levenshtein Transformer, a partially non-autoregressive model for sequence-to-sequence learning. Autoregressive models generate the output left-to-right (or right-to-left), conditioning each step on the previously generated tokens. On the other […]
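Just to make the autoregressive part concrete, here is a tiny greedy decoding loop under my own assumptions (the model interface and the toy stand-in are hypothetical; this is not the EDITOR or Levenshtein Transformer code): each step conditions on the source and on everything generated so far.

```python
# Greedy autoregressive decoding, a sketch with a hypothetical model interface:
# next_token_distribution(source, prefix) returns a dict {token: probability}.
def greedy_autoregressive_decode(source, next_token_distribution, eos="</s>", max_len=50):
    prefix = []
    for _ in range(max_len):
        dist = next_token_distribution(source, prefix)
        token = max(dist, key=dist.get)  # greedy choice; beam search would keep k prefixes
        if token == eos:
            break
        prefix.append(token)
    return prefix

# Toy stand-in model: always emits "hello", "world", then the end-of-sequence token.
def toy_model(source, prefix):
    canned = ["hello", "world", "</s>"]
    return {canned[len(prefix)]: 1.0}

print(greedy_autoregressive_decode("bonjour le monde", toy_model))  # ['hello', 'world']
```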

Read more

Machine Translation Weekly 61: Decoding and diversity

This week I will comment on a short paper from Carnegie Mellon University and Amazon that presents a simple analysis of the diversity of machine translation outputs. The title of the paper is Decoding and Diversity in Machine Translation and it will be presented at the Resistance AI Workshop at NeurIPS 2020 (what a name for a workshop). The main thing that the paper shows is that translation quality measured in terms of BLEU score correlates strongly negatively with […]

Read more

Machine Translation Weekly 60: Notes about WMT 2020 Shared Tasks

This week, I will follow up on last week’s post and comment on the news from this year’s WMT, which was co-located with EMNLP. As every year, there were many shared tasks on various types of translation and on the evaluation of machine translation. The news translation task is the oldest task at WMT and a sort of flagship task, providing benchmarks for MT research in the long term. Test sets are created by manually translating recent news stories […]

Read more

Machine Translation Weekly 59: Notes from EMNLP 2020

Another large NLP conference that had to take place in a virtual environment, EMNLP 2020, is over, and here are my notes from the conference. The ACL in the summer had most Q&A sessions on Zoom, which meant most of the authors were waiting forever for someone to take the courage to enter the room. EMNLP sort of simulated the standard conference format, which hopefully reduced the communication barrier. There were public Q&A sessions with short presentations and poster sessions in […]

Read more

Machine Translation Weekly 58: Poisoning machine translation

Today, I am going to talk about a topic that is rather unknown to me: the safety and vulnerability of machine translation. I will comment on the paper Targeted Poisoning Attacks on Black-Box Neural Machine Translation by authors from the University of Melbourne and Facebook AI. The main issue making machine-translation users vulnerable is that they typically do not understand the target language and have no choice other than to trust that the system’s target-language output is adequate. Most […]

Read more

Machine Translation Weekly 57: Document-level MT with Context Masking

This week, I am going to discuss the paper “Long-Short Term Masking Transformer: A Simple but Effective Baseline for Document-level Neural Machine Translation” by authors from Alibaba Group. The pre-print of the paper appeared a month ago on arXiv and the paper will be presented at this year’s EMNLP. Including document-level context is one of the biggest challenges of current machine translation. There are several reasons for this. One is the lack of document-level training data, which is partially caused by […]

Read more