Neural Machine Translation

Lessons learned from analyzing values in multilingual encoders and what it means for LLMs

This post is a hindsight on two studies on multilingual sentence embeddings we published a year ago and comments on what I think people analyzing LLMs today should take away from them. In late 2022, we (which mainly was the work of Kathy Hämmerl from Munich and Björn Diesenroth and Patrick Schramowski from Darmstadt) finished a paper called Speaking Multiple Languages Affects the Moral Bias of Language Models (later published in Findings of ACL 2023) where we tried to compare […]

Read more

Highlights from Machine Translation and Multilinguality in May 2024

Here are short summaries of three pre-prints that I enjoyed reading in May. Zero-Shot Tokenizer Transfer Folks from the University of Cambridge and the Univerisity of Edinburgh propose a nice trick for changing the vocabulary of an already trained language model. They train a hyper-network (a neural network that predicts parameters of a different neural network) that predicts what embeddings a token would have if it were trained with the rest of the model. For each training batch, they build […]

Read more

Highlights from Machine Translation and Multilinguality in April 2024

Meta4XNLI: A Crosslingual Parallel Corpus for Metaphor Detection and Interpretation Folks from the University of the Basque Country prepared an English-Spanish dataset for natural langauge inference (i.e., deciding if sentences follow from each other, are in contradiction, or have nothing to do with each other) with metaphorical expressions. Unlike the standard version of this task (XNLI), which does not use figurative language, there is a large gap between in-language training and language transfer. (Transfer means that we finetune a multilingual […]

Read more

Highlights from Machine Translation and Multilinguality in March 2024

Did Translation Models Get More Robust Without Anyone Even Noticing? Folks from Lisbon study how robust the newest MT systems are against source-side noise. Machine translation using large models, including translation-specific NLLB or via LLMs (such as Tower or GPT-3.5), is much more robust both towards synthetic noise (the nice feature of synthetic noise is that you can check the translation quality for different noise levels) and also real-world noisy data from social networks. Tracing the Roots of Facts in […]

Read more

Highlights from Machine Translation and Multilinguality in February 2024

With a new month, here are a few papers that I noticed on arXiv in February. Linear-time Minimum Bayes Risk Decoding with Reference Aggregation A preprint from the University of Zurich proposes a linear time version of Minimum Bayes Risk (MBR) decoding in machine translation. This decoding algorithm does not aim to generate the most probable sequence given the model but the most typical one. This is typically done by sampling dozens of candidate output sentences, from which we select […]

Read more

Highlights from Machine Translation and Multilinguality in December 2023 and January 2024

Many things happened in the field in December: EMNLP, Google released Gemini, and Mixtral appeared. January was seemingly not that packed with new events, but plenty of new interesting work popped up on arXiv. Predicting Human Translation Difficulty with Neural Machine Translation Folks from the University of Melbourne found out that features from NMT, most notably the target sentence perplexity and something they call flow features, are a good predictor of human translation time. Turning English-centric LLMs Into Polyglots: How […]

Read more

Highlights from Machine Translation and Multilinguality in October 2023

Here is my monthly summary of what papers on multilinguality and machine translation I found the most noteworthy during October 2023. There were 2,881 preprints in the computation and language category on arXiv (a new record number), so there is a big chance that there were preprints I would like to read that I missed. Navigating Cultural Chasms: Exploring and Unlocking the Cultural POV of Text-To-Image Models A preprint from Israeli Technion, Google Research, and Cambridge University studies cultural awareness […]

Read more

Highlights from Machine Translation and Multilinguality in November 2023

Here are a couple of articles that caught my attention in November. Narrowing the Gap between Zero- and Few-shot Machine Translation by Matching Styles A team from Johns Hopkins University published a pre-print that belongs to the currently trendy genre: stuff we can do with LLMs. This time, it is about how to use it efficiently for domain-specific machine translation. It is known that few-shot prompting works much better than zero-shot prompting, but you need to select proper parallel examples. […]

Read more

Highlights from Machine Translation and Multilinguality in summer 2023

Here are short summaries of the papers I liked the most during the (academic) summer. Also, this time, I am posting both on GitHub pages and on Medium. The preprint from the University of Würzburg presents a recipe for recycling existing models to create a multilingual vision-language model. They start with the English-only language model BLIP-2, which allows images to be a part of its input (the output is always textual). They take the image encoder from this model and […]

Read more

Highlights from Machine Translation and Multilinguality in June 2023

Here are the preprints that I found the most interesting in June 2023. Exploring the Relationship between Alignment and Cross-lingual Transfer in Multilingual Transformers Folks from LORIA (a French research institute) and Posos (a French company) study the relationship between cross-lingual representation alignment and cross-lingual transfer. Here, alignment means what I would call language neutrality, i.e., that similar sentences should receive similar representation across languages. (Not alignment as the new word for finetuning language models to follow instructions, nor the […]

Read more
1 2 3 13