Introducing HELMET: Holistically Evaluating Long-context Language Models
Contact: hyen@cs.princeton.edu
Paper: https://arxiv.org/abs/2410.02694
Website: https://princeton-nlp.github.io/HELMET
Code & Data: https://github.com/princeton-nlp/HELMET
Since we first released HELMET last October, long-context language models have developed faster than ever before, and we are thrilled to see the community adopt HELMET, for example in Microsoft’s Phi-4 and AI21’s Jamba 1.6.
Since the initial release, we have added more models to our evaluation suite and conducted additional analyses. We are excited to share our new results and to present HELMET at ICLR 2025!
In this blog, we will describe