Neural Machine Translation

Highlights from Machine Translation and Multilinguality in March 2025

EuroBERT: Scaling Multilingual Encoders for European Languages

A large group of authors, mostly from CentraleSupélec in Paris and Instituto Superior Técnico in Lisbon, released EuroBERT, a multilingual BERT model for European and major global languages. There is also a 2.1B version, unusually large for an encoder model.

High-Dimensional Interlingual Representations of Large Language Models

A preprint from the Hong Kong University of Science and Technology evaluates the sentence-level similarity of LLM hidden states across languages. It shows that the idea that […]
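To make "sentence-level similarity of hidden states" concrete, here is a minimal sketch of one common way to compare a sentence with its translation: mean-pool the token-level hidden states and take the cosine similarity. This is an illustration only, not the measure used in the preprint; the function names and the toy vectors are assumptions, and real hidden states would come from an LLM layer.

```python
import math


def mean_pool(token_vectors):
    """Average token-level vectors into a single sentence vector."""
    dim = len(token_vectors[0])
    count = len(token_vectors)
    return [sum(vec[d] for vec in token_vectors) / count for d in range(dim)]


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy stand-ins for hidden states of a sentence and its translation
# (hypothetical values; the two sentences may differ in token count).
english_states = [[1.0, 0.0], [0.0, 1.0]]
german_states = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]

similarity = cosine(mean_pool(english_states), mean_pool(german_states))
print(f"cross-lingual similarity: {similarity:.3f}")
```

Mean pooling is only one choice here; last-token or attention-weighted pooling would plug into the same comparison.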

Highlights from Machine Translation and Multilinguality in February 2025

WMT24++: Expanding the Language Coverage of WMT24 to 55 Languages & Dialects

Folks from Google and Unbabel extended the WMT24 test sets from 8 to 55 languages by adding more human references. They evaluated LLMs and commercial MT services on them. The winner is OpenAI’s o1, followed by Claude and Gemini. The best open-source model is Unbabel’s Tower, which outperforms all standard commercial translation services (Google Translate, DeepL, and Microsoft Translator).

SMOL: Professionally translated parallel data for 115 under-represented languages […]

February 7, 2025 Neural Machine Translation (NMT), NMT

Highlights from Machine Translation and Multilinguality in December 2024 and January 2025

MuLan: Adapting Multilingual Diffusion Models for Hundreds of Languages with Negligible Cost

Researchers from Tsinghua, Shanghai, Beijing, Hong Kong, and Johns Hopkins have developed a method for adapting diffusion models to hundreds of languages at a minimal cost. They achieve this by swapping the text encoder with a multilingual one and training it to produce representations consistent with the CLIP encoder, leveraging parallel language data and English image-text data. The results look impressive and multilingual, and the generation quality, as […]

December 5, 2024 Neural Machine Translation (NMT), NMT

Highlights from Machine Translation and Multilinguality in November 2024

Mitigating Metric Bias in Minimum Bayes Risk Decoding

Minimum Bayes Risk Decoding tries to get the most typical output from a language or machine translation model rather than the most probable one. The main idea is that the probability scores do not consider how semantically similar sentences are. Therefore, the most probable sequence might not be the most typical from a meaning perspective. The weak point is that we have to decide what metric to use to estimate the similarity, […]
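The selection step of Minimum Bayes Risk decoding can be sketched in a few lines: sample several candidates, score each one by its average similarity to the others, and return the one with the highest average. The sketch below is a toy under stated assumptions: `unigram_f1` is a hypothetical stand-in for a real similarity metric, the candidates are hard-coded instead of sampled from a model, and this is not the method proposed in the paper.

```python
from collections import Counter


def unigram_f1(hypothesis, reference):
    """Toy similarity: F1 over whitespace-token overlap (stand-in for a real metric)."""
    hyp = Counter(hypothesis.split())
    ref = Counter(reference.split())
    overlap = sum((hyp & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def mbr_decode(candidates, utility=unigram_f1):
    """Return the candidate with the highest average utility against the other candidates."""
    if not candidates:
        raise ValueError("need at least one candidate")
    if len(candidates) == 1:
        return candidates[0]

    def expected_utility(index):
        # The other candidates act as pseudo-references for this one.
        others = [c for j, c in enumerate(candidates) if j != index]
        return sum(utility(candidates[index], other) for other in others) / len(others)

    best_index = max(range(len(candidates)), key=expected_utility)
    return candidates[best_index]


# Hypothetical samples; in practice these would be drawn from the translation model.
samples = [
    "the cat sat on the mat",
    "a cat sat on the mat",
    "the cat is sitting on the mat",
    "dogs bark loudly",
]
best = mbr_decode(samples)
print(best)
```

The choice of `utility` decides which candidate counts as "most typical", which is why the metric used inside MBR matters so much for the result.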

November 21, 2024 Neural Machine Translation (NMT), NMT

Notes from EMNLP 2024

Last week, I was at EMNLP in Miami, and here are a few notes about what I saw at the conference.

Keynotes

The conference had three keynotes: two good and one amazing. In the first keynote, Percy Liang talked about the LLM research done at Stanford. One topic was LLM-based agents: Percy Liang predicts that LLMs are awaiting their AlphaGo moment, after which we will move away from coded agents; soon, the big topic will be agents trained with […]

November 5, 2024 Neural Machine Translation (NMT), NMT

Highlights from Machine Translation and Multilinguality in October 2024

Here are summaries of a few preprints that I noticed on arXiv during October.

LangSAMP: Language-Script Aware Multilingual Pretraining

Folks from LMU Munich try a relatively simple trick to improve multilingual encoder models, particularly for non-Latin-script and low-resource languages. They use additional information about the language identity and the script, but only during training, so at inference time, we can still use the model without caring about what language we feed in. They add static language and script embeddings before the […]

October 7, 2024 Neural Machine Translation (NMT), NMT

Highlights from Machine Translation and Multilinguality in Summer 2024

Here are summaries of a few papers that I liked during the (long academic) summer.

BertaQA: How Much Do Language Models Know About Local Culture?

People from the University of the Basque Country prepared a QA dataset of local knowledge about the Basque Country, hopefully including facts that might not exist on the English-speaking Internet, and contrasted it with global (which probably means Western) facts. The questions are in the multiple-choice style. Then, they asked professional translators to […]

July 23, 2024 Neural Machine Translation (NMT), NMT

Lessons learned from analyzing values in multilingual encoders and what it means for LLMs

This post looks back at two studies on multilingual sentence embeddings that we published a year ago and comments on what I think people analyzing LLMs today should take away from them. In late 2022, we (the work was mainly done by Kathy Hämmerl from Munich and Björn Deiseroth and Patrick Schramowski from Darmstadt) finished a paper called Speaking Multiple Languages Affects the Moral Bias of Language Models (later published in Findings of ACL 2023) where we tried to compare […]

June 5, 2024 Neural Machine Translation (NMT), NMT

Highlights from Machine Translation and Multilinguality in May 2024

Here are short summaries of three pre-prints that I enjoyed reading in May.

Zero-Shot Tokenizer Transfer

Folks from the University of Cambridge and the University of Edinburgh propose a nice trick for changing the vocabulary of an already trained language model. They train a hyper-network (a neural network that predicts parameters of a different neural network) that predicts what embeddings a token would have if it were trained with the rest of the model. For each training batch, they build […]

May 5, 2024 Neural Machine Translation (NMT), NMT

Highlights from Machine Translation and Multilinguality in April 2024

Meta4XNLI: A Crosslingual Parallel Corpus for Metaphor Detection and Interpretation

Folks from the University of the Basque Country prepared an English-Spanish dataset for natural language inference (i.e., deciding if sentences follow from each other, are in contradiction, or have nothing to do with each other) with metaphorical expressions. Unlike the standard version of this task (XNLI), which does not use figurative language, there is a large gap between in-language training and language transfer. (Transfer means that we finetune a multilingual […]
