The Hallucinations Leaderboard, an Open Effort to Measure Hallucinations in Large Language Models
In the rapidly evolving field of Natural Language Processing (NLP), Large Language Models (LLMs) have become central to AI's ability to understand and generate human language. However, a significant challenge persists: their tendency to hallucinate, i.e., to produce content that does not align with real-world facts or with the user's input. With new open-source models being released constantly, identifying the most reliable ones, particularly in terms of their propensity to generate hallucinated content, becomes crucial.
The Hallucinations Leaderboard aims to address this problem: it is a comprehensive platform that evaluates a wide array of LLMs against a diverse set of hallucination-focused benchmarks.