Finally, a Replacement for BERT
This blog post introduces ModernBERT, a family of state-of-the-art encoder-only models that improve over older-generation encoders across the board, with an 8192-token sequence length, better downstream performance, and much faster processing.
ModernBERT is available as a slot-in replacement for any BERT-like model, in two sizes: base (149M params) and large (395M params).
ModernBERT will be included in v4.48.0 of transformers. Until then, it requires installing transformers from main:
```sh
pip install git+https://github.com/huggingface/transformers.git
```
Since ModernBERT is a Masked Language Model (MLM), you can use the `fill-mask` pipeline or load it via `AutoModelForMaskedLM`.
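As a minimal sketch of the MLM usage described above, the snippet below wraps the `fill-mask` pipeline in a small helper that returns the most likely fillers for a `[MASK]` token (the model id `answerdotai/ModernBERT-base` is the released base checkpoint; the helper name and its defaults are our own):

```python
from transformers import pipeline


def top_predictions(text, model_id="answerdotai/ModernBERT-base", k=3):
    """Return the k most likely (token, score) fillers for [MASK] in `text`."""
    fill = pipeline("fill-mask", model=model_id)
    return [(p["token_str"], p["score"]) for p in fill(text, top_k=k)]


if __name__ == "__main__":
    # Each prediction is a candidate token with its probability.
    for token, score in top_predictions("The capital of France is [MASK]."):
        print(f"{token!r}: {score:.3f}")
```

Note that the first call downloads the checkpoint from the Hugging Face Hub, so it requires network access and a few hundred megabytes of disk.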