Finally, a Replacement for BERT

This blog post introduces ModernBERT, a family of state-of-the-art encoder-only models that improve over older-generation encoders across the board, with an 8,192-token sequence length, better downstream performance, and much faster processing.

ModernBERT is available as a slot-in replacement for any BERT-like model, in both a base (149M parameters) and a large (395M parameters) size.

Here is how to use these models with transformers.

ModernBERT will be included in v4.48.0 of transformers. Until then, it requires installing transformers from main:

pip install git+https://github.com/huggingface/transformers.git

Since ModernBERT is a Masked Language Model (MLM), you can use it with the fill-mask pipeline or load it via AutoModelForMaskedLM.
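As a minimal sketch of the fill-mask usage described above (this assumes the base checkpoint is published under the id answerdotai/ModernBERT-base; substitute whichever ModernBERT checkpoint you intend to use):

```python
import torch
from transformers import pipeline

# Load a fill-mask pipeline with ModernBERT-base. The checkpoint id is an
# assumption for illustration; swap in your preferred ModernBERT checkpoint.
pipe = pipeline(
    "fill-mask",
    model="answerdotai/ModernBERT-base",
    torch_dtype=torch.bfloat16,
)

# Ask the MLM to fill in the masked token.
results = pipe("The capital of France is [MASK].")
for r in results:
    print(r["token_str"], round(r["score"], 3))
```

Each result is a dict containing, among other fields, the predicted token (`token_str`) and its probability (`score`), sorted from most to least likely.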

