TextQuests: How Good are LLMs at Text-Based Video Games?

Long Phan, Clémentine Fourrier

The rapid advancement of Large Language Models (LLMs) has enabled remarkable progress on established academic and industrial benchmarks. Knowledge benchmarks, such as MMLU and GPQA, are now largely saturated, and frontier models are making significant progress
