Machine Translation Weekly 85: The Incredibility of MT Evaluation

This week, I will comment on a paper that quantifies and precisely measures the dimensions of the elephant in the room of machine translation: the lack of empirical support for claims in research papers on machine translation. The title of the paper is Scientific Credibility of Machine Translation Research: A Meta-Evaluation of 769 Papers, it has authors from NICT in Japan, will appear at this year’s ACL, and was awarded as an outstanding paper at ACL 2021. The authors manually annotated an […]
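One of the practices the meta-evaluation looks for is statistical significance testing of metric differences. As a minimal sketch of what such a test looks like, here is paired bootstrap resampling (Koehn, 2004); the `corpus_score` callable is my assumption, standing in for any corpus-level metric such as sacreBLEU’s `corpus_bleu`:

```python
# A minimal sketch of paired bootstrap resampling, the kind of significance
# test often missing from MT papers. `corpus_score` is an assumed interface:
# it maps (hypotheses, references) to a corpus-level score.
import random

def paired_bootstrap(sys_a, sys_b, refs, corpus_score, n_samples=1000, seed=42):
    """Estimate how often system A beats system B on resampled test sets."""
    rng = random.Random(seed)
    n = len(refs)
    wins_a = 0
    for _ in range(n_samples):
        # Resample sentence indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        score_a = corpus_score([sys_a[i] for i in idx], [refs[i] for i in idx])
        score_b = corpus_score([sys_b[i] for i in idx], [refs[i] for i in idx])
        if score_a > score_b:
            wins_a += 1
    # Approximate p-value for the claim "A is not better than B".
    return 1.0 - wins_a / n_samples
```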

Read more

Machine Translation Weekly 84: Order Agnostic Cross-Entropy

I tend to be a little biased against autoregressive models. The way they operate: say exactly one subword, think for a while, and then say exactly one more subword, just does not sound natural to me. Moreover, with current models, a subword can be anything from a single character to a word as long as “Ausgußreiniger”. Non-autoregressive models generate everything in a single step. That does not seem really natural either, but at least they offer an interesting alternative. […]
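To make the idea of order-agnostic cross-entropy concrete, here is a minimal sketch in the spirit of the paper: instead of penalizing correct tokens for appearing at the “wrong” position, find the highest-probability assignment of gold tokens to output positions (via the Hungarian algorithm) and compute cross-entropy under that assignment. Shapes and names are my assumptions, not the paper’s code:

```python
import torch
from scipy.optimize import linear_sum_assignment

def order_agnostic_xent(log_probs, target):
    """log_probs: (T, V) positional log-probabilities from a
    non-autoregressive model; target: (T,) gold token ids, here assumed
    to have the same length as the output."""
    # cost[i, j] = -log P(target token j at output position i)
    cost = -log_probs[:, target]                       # (T, T)
    rows, cols = linear_sum_assignment(cost.detach().cpu().numpy())
    # Cross-entropy of the best-scoring alignment; gradients flow only
    # through the selected entries.
    return cost[torch.as_tensor(rows), torch.as_tensor(cols)].sum()
```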

Read more

Machine Translation Weekly 83: On Language Identity and Zero-Shot Transfer

This week I will comment on two papers on zero-shot cross-lingual model transfer which do not focus on the representation quality but on the transfer itself. The title of the first one is Language Embeddings for Typology and Cross-lingual Transfer Learning and has authors from UC Davis. The second is Syntax-augmented Multilingual BERT for Cross-lingual Transfer and has authors from UCLA and Facebook AI. Both papers will appear at this year’s ACL. Just a reminder, zero-shot model transfer means […]
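As a minimal sketch of what “language embeddings” typically mean in this setting (the dimensions and names below are illustrative assumptions): each language gets a learned vector that is added to every token embedding, so the encoder can condition on, and interpolate between, languages:

```python
import torch
import torch.nn as nn

class LangAwareEmbedding(nn.Module):
    """Token embeddings with an additive, learned language embedding."""
    def __init__(self, vocab_size, n_langs, dim):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, dim)
        self.lang = nn.Embedding(n_langs, dim)

    def forward(self, token_ids, lang_id):
        # token_ids: (batch, seq_len), lang_id: (batch,)
        # Broadcast the per-sentence language vector over all positions.
        return self.tok(token_ids) + self.lang(lang_id).unsqueeze(1)
```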

Read more

Machine Translation Weekly 82: Multimodal Translation and the Visual Context

This week I am going to discuss (and criticize) a paper on multimodal machine translation that attempts to once again evaluate whether and how visual information could be useful in machine translation. The title of the paper is Good for Misconceived Reasons: An Empirical Revisiting on the Need for Visual Context in Multimodal Machine Translation, it has authors from several institutions in China and Hong Kong and will appear at this year’s ACL. Multimodal machine translation (also, a topic […]
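For readers unfamiliar with how visual context usually enters the model, here is a minimal, entirely illustrative sketch of one common fusion scheme (the paper evaluates whether such fusion helps at all): a learned gate decides how much of a projected image feature vector to mix into each source token state:

```python
import torch
import torch.nn as nn

class GatedVisualFusion(nn.Module):
    """Gated additive fusion of one image vector into token states."""
    def __init__(self, text_dim, img_dim):
        super().__init__()
        self.proj = nn.Linear(img_dim, text_dim)
        self.gate = nn.Linear(2 * text_dim, text_dim)

    def forward(self, text_states, img_feat):
        # text_states: (batch, seq_len, text_dim), img_feat: (batch, img_dim)
        img = self.proj(img_feat).unsqueeze(1).expand_as(text_states)
        g = torch.sigmoid(self.gate(torch.cat([text_states, img], dim=-1)))
        return text_states + g * img
```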

Read more

Machine Translation Weekly 81: Unsupervised MT and Parallel Sentence Mining

This week I am going to briefly comment on a paper that uses unsupervised machine translation to improve unsupervised scoring for parallel data mining. The title of the paper is Unsupervised Multilingual Sentence Embeddings for Parallel Corpus Mining, it has authors from Charles University and the University of the Basque Country and will appear at this year’s ACL student research workshop. The idea of the paper is quite simple. They took XLM, a BERT-like model that was trained for 100 […]
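Once cross-lingual sentence embeddings like the paper’s XLM-based ones are available, mining usually relies on margin-based scoring (Artetxe & Schwenk, 2019). A minimal sketch, assuming L2-normalized embeddings as input:

```python
import numpy as np

def margin_scores(src_emb, tgt_emb, k=4):
    """src_emb: (n, d), tgt_emb: (m, d), both L2-normalized;
    returns an (n, m) matrix of ratio-margin scores."""
    sim = src_emb @ tgt_emb.T                             # cosine similarities
    # Average similarity to the k nearest neighbors in each direction.
    nn_src = np.sort(sim, axis=1)[:, -k:].mean(axis=1)    # (n,)
    nn_tgt = np.sort(sim, axis=0)[-k:, :].mean(axis=0)    # (m,)
    # A pair scores high if its similarity stands out over both neighborhoods.
    return 2 * sim / (nn_src[:, None] + nn_tgt[None, :])
```

Pairs whose margin score exceeds a threshold are kept as mined parallel sentences; the margin normalization suppresses “hub” sentences that are similar to everything.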

Read more

Machine Translation Weekly 80: Deontological ethics and MT

At this year’s NAACL, there will be a paper that tries to view NLP from the perspective of deontological ethics and promotes an unusual and very insightful view on NLP ethics. The title of the paper is Case Study: Deontological Ethics in NLP, it was written by authors from CMU and discusses several NLP applications from the perspective of deontological ethics. Usually, ethics in NLP is discussed from the consequentialist perspective. In this view, the morality of an action is […]

Read more

My most amazing Makefile for CL papers

Automation of stuff that does not need to be automated at all is one of my favorite procrastination activities. As an experienced (and most of the time unsuccessful) submitter to conferences organized by ACL (ACL, NAACL, EACL, EMNLP), I have spent a lot of procrastination time improving the Makefiles that compile the papers. Here are a few commented snippets from the Makefiles; hopefully, someone finds them useful. The normal LaTeX stuff: I compile the paper using latexmk. main.pdf: $(FILES) latexmk -pdflatex="$(LATEX) %O […]

Read more

Machine Translation Weekly 79: More context in MT

The lack of broader context is one of the main problems in machine translation and in NLP in general. People have tried various methods, with quite mixed results. A recent preprint from Unbabel introduces an unusual quantification of context-awareness and, based on that, proposes some training improvements. The title of the paper is Measuring and Increasing Context Usage in Context-Aware Machine Translation, and it will be presented at ACL 2021. The paper measures how well informed the model is about the […]
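In spirit, the measurement boils down to comparing how likely the model finds the reference with and without the preceding context. A minimal sketch; the `score(src, tgt, context)` function returning a sentence log-probability is an assumed interface, not the paper’s actual API:

```python
def context_usage(score, src, tgt, context):
    """Log-probability gain the context gives to the reference translation.
    A value near zero means the model effectively ignores the context."""
    with_ctx = score(src, tgt, context=context)
    without_ctx = score(src, tgt, context=None)
    return with_ctx - without_ctx
```

Averaged over a test set, this gives a single number quantifying how much the model actually uses the context it is given.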

Read more

Machine Translation Weekly 78: Multilingual Hate Speech Detection

This week I will comment on a preprint, Cross-lingual hate speech detection based on multilingual domain-specific word embeddings, by authors from the University of Chile. The preprint evaluates the possibility of cross-lingual transfer of models for hate speech detection, i.e., training a model in one language and testing it in a different language. Hate speech detection is a particularly tough task for model transfer because many of the words have a different meaning or at least different connotations when used […]
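The experimental setup is easy to state in code. A minimal sketch, where the `embed` function mapping texts to cross-lingually aligned vectors is an assumption (standing in for the paper’s domain-specific multilingual embeddings):

```python
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

def cross_lingual_eval(embed, train_texts, train_labels, test_texts, test_labels):
    """Train a hate-speech classifier in one language (e.g. English) and
    evaluate it, unchanged, on another (e.g. Spanish)."""
    clf = LogisticRegression(max_iter=1000)
    clf.fit(embed(train_texts), train_labels)
    preds = clf.predict(embed(test_texts))
    return f1_score(test_labels, preds, average="macro")
```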

Read more

Machine Translation Weekly 77: Reference-free Evaluation

This week, I will comment on a paper by authors from the University of Maryland and Google Research on reference-free evaluation of machine translation, which seems quite disturbing to me and suggests there is a lot about current MT models we still don’t quite understand. The title of the paper is “Assessing Reference-Free Peer Evaluation for Machine Translation” and it will be published at this year’s NAACL conference. The standard evaluation of machine translation uses reference translations: translations that […]
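To illustrate the idea of reference-free “peer” evaluation: a multilingual MT model force-decodes the system output given the source, and the resulting log-probability serves as the quality score, with no reference needed. A minimal sketch; the Hugging Face model choice and tokenizer usage below are my assumptions, not the paper’s setup:

```python
import torch
from transformers import MarianMTModel, MarianTokenizer

name = "Helsinki-NLP/opus-mt-en-de"  # assumed evaluator model
tok = MarianTokenizer.from_pretrained(name)
model = MarianMTModel.from_pretrained(name).eval()

def reference_free_score(source, hypothesis):
    """Score an MT hypothesis by force-decoding it under the peer model."""
    batch = tok([source], text_target=[hypothesis], return_tensors="pt")
    with torch.no_grad():
        out = model(**batch)  # batch includes 'labels', so out.loss is set
    # out.loss is the mean per-token cross-entropy; negate it so that
    # higher means better.
    return -out.loss.item()
```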

Read more