
Deep Learning Daily

Deep Learning, NLP, NMT, AI, ML

March 13, 2026 huggingface

Visual Document Retrieval Goes Multilingual

Marco Cimolai, Logan Markewich

TL;DR: We present vdr-2b-multi-v1, the best multilingual embedding model for visual document retrieval. We also release its English-only twin vdr-2b-v1 and open-source the new vdr-multilingual-train dataset. With 500k high-quality samples, it’s the largest open-source multilingual

