Neural Machine Translation

Highlights from Machine Translation and Multilinguality in February 2024

With a new month, here are a few papers that I noticed on arXiv in February.
Linear-time Minimum Bayes Risk Decoding with Reference Aggregation
A preprint from the University of Zurich proposes a linear-time version of Minimum Bayes Risk (MBR) decoding in machine translation. This decoding algorithm does not aim to generate the most probable sequence given the model but the most typical one. This is usually done by sampling dozens of candidate output sentences, from which we select […]


Highlights from Machine Translation and Multilinguality in December 2023 and January 2024

Many things happened in the field in December: EMNLP took place, Google released Gemini, and Mixtral appeared. January seemed less packed with events, but plenty of interesting new work popped up on arXiv.
Predicting Human Translation Difficulty with Neural Machine Translation
Folks from the University of Melbourne found that features from NMT, most notably the target sentence perplexity and something they call flow features, are good predictors of human translation time.
Turning English-centric LLMs Into Polyglots: How […]


Highlights from Machine Translation and Multilinguality in October 2023

Here is my monthly summary of the papers on multilinguality and machine translation that I found most noteworthy during October 2023. There were 2,881 preprints in the computation and language category on arXiv (a new record), so there is a good chance that I missed preprints I would have liked to read.
Navigating Cultural Chasms: Exploring and Unlocking the Cultural POV of Text-To-Image Models
A preprint from Israeli Technion, Google Research, and Cambridge University studies cultural awareness […]


Highlights from Machine Translation and Multilinguality in November 2023

Here are a couple of articles that caught my attention in November.
Narrowing the Gap between Zero- and Few-shot Machine Translation by Matching Styles
A team from Johns Hopkins University published a preprint that belongs to the currently trendy genre: stuff we can do with LLMs. This time, it is about how to use them efficiently for domain-specific machine translation. It is known that few-shot prompting works much better than zero-shot prompting, but you need to select proper parallel examples. […]


Highlights from Machine Translation and Multilinguality in summer 2023

Here are short summaries of the papers I liked the most during the (academic) summer. Also, this time, I am posting both on GitHub Pages and on Medium. A preprint from the University of Würzburg presents a recipe for recycling existing models to create a multilingual vision-language model. They start with the English-only language model BLIP-2, which allows images to be a part of its input (the output is always textual). They take the image encoder from this model and […]


Highlights from Machine Translation and Multilinguality in June 2023

Here are the preprints that I found the most interesting in June 2023.
Exploring the Relationship between Alignment and Cross-lingual Transfer in Multilingual Transformers
Folks from LORIA (a French research institute) and Posos (a French company) study the relationship between cross-lingual representation alignment and cross-lingual transfer. Here, alignment means what I would call language neutrality, i.e., that similar sentences should receive similar representations across languages. (Not alignment as the new word for finetuning language models to follow instructions, nor the […]


Speeding up arXiv browsing

Staying up to date with the newest NLP work is a tough job, and reading about new research takes a significant amount of my time. For several years, one of my work routines has been skimming over the arXiv digest. I open a few preprints, glance over them, and write some notes into Zotero. Once a month, I write a blog post about what I think was the most interesting, which should force me to understand the papers, at least […]


Highlights from Machine Translation and Multilinguality in May 2023

Here are a few papers I found most interesting in the flood of new preprints on arXiv. There was ACL's camera-ready deadline and the start of the EMNLP anonymity period, so there were many more papers than usual.
What is the best recipe for character-level encoder-only modeling?
A paper from DeepMind accepted to ACL 2023 systematically (and empirically) studies how to train a BERT-like model that works directly with character-level inputs using existing architectural building blocks. Transformers work well with […]


Highlights from Machine Translation and Multilinguality in April 2023

Here is my monthly summary of the new papers and preprints I liked the most during the previous month.
Several institutions in China did a thorough evaluation of how large language models work for machine translation
One might think this is yet another paper like this, but this one is much better than what Tencent did with ChatGPT and just a few test sentences. This paper uses the Flores 101 test set, a pretty standard large test set for 101 languages. Everything is […]


Few words on Natural Language Processing and User Autonomy

As natural language processing (NLP) finds its way out of university labs and becomes a crucial element of many user-facing technologies (machine translation, search, language-model-based assistants), people are becoming concerned about the ethics of this technology. When people talk about NLP ethics, the main topics are: biases that the models pick up from training data, replication of toxic behavior found on the Internet, underrepresentation of already underprivileged groups, and differences in technology availability between the global north and the global south. Now, […]
