Vision Language Model Alignment in TRL ⚡️

Vision Language Models (VLMs) are getting stronger, but aligning them to human preferences still matters. In TRL, we already showed how to post-train VLMs with Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO). This time, we’re going further.

TL;DR: here’s what’s new in TRL:

  • Mixed Preference Optimization (MPO)
  • Group Relative Policy Optimization (GRPO)
  • Group Sequence Policy Optimization (GSPO) (a variant of GRPO)

These go beyond pairwise DPO, extracting richer signals from preference data and scaling better with modern VLMs.
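To give an intuition for the "group" in GRPO/GSPO: several completions are sampled for the same prompt, and each completion's advantage is computed relative to the group's own reward statistics rather than a learned value baseline. The sketch below is an illustrative simplification, not TRL's implementation (function name and epsilon value are our own):

```python
from statistics import mean, stdev

def group_relative_advantages(rewards, eps=1e-4):
    """Normalize each reward against its group's mean and std (GRPO-style).

    `rewards` holds the scores of completions sampled for one prompt.
    """
    mu = mean(rewards)
    sigma = stdev(rewards) if len(rewards) > 1 else 0.0
    return [(r - mu) / (sigma + eps) for r in rewards]

# Four completions sampled for the same prompt, scored by a reward function:
advs = group_relative_advantages([0.1, 0.4, 0.4, 0.9])
# Completions above the group mean get positive advantages; below, negative.
```

Because the baseline comes from sibling completions, the signal per prompt is richer than a single pairwise preference: every completion in the group contributes to the gradient.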

We’ve also extended existing methods to support VLMs: