Introducing the Enterprise Scenarios Leaderboard: a Leaderboard for Real World Use Cases

Today, the Patronus team is excited to announce the new Enterprise Scenarios Leaderboard, built using the Hugging Face Leaderboard Template in collaboration with the Hugging Face team.

The leaderboard aims to evaluate the performance of language models on real-world enterprise use cases. We currently support six diverse tasks: FinanceBench, Legal Confidentiality, Creative Writing, Customer Support Dialogue, Toxicity, and Enterprise PII.

We measure the performance of models on metrics like accuracy, engagingness, toxicity, relevance, and Enterprise PII.



Why do

 

 

 
