A multilingual version of MS MARCO passage ranking dataset

This repository presents a method for translating the MS MARCO passage ranking dataset using neural machine translation. The code available here is the same code used in our paper mMARCO: A Multilingual Version of MS MARCO Passage Ranking Dataset.
Translated Datasets
As described in our paper, we have made 8 translated versions of the MS MARCO passage ranking dataset available. The translated passage collections and query sets (training and validation) are available at:
Released Model Checkpoints
Our available fine-tuned models are:
* [email protected] on English MS MARCO
Dataset
We translate the MS MARCO passage ranking dataset, a large-scale IR dataset comprising