Introducing the Enterprise Scenarios Leaderboard: a Leaderboard for Real World Use Cases
Today, the Patronus team is excited to announce the new Enterprise Scenarios Leaderboard, built with the Hugging Face Leaderboard Template in collaboration with the Hugging Face team.
The leaderboard evaluates the performance of language models on real-world enterprise use cases. We currently support six diverse tasks: FinanceBench, Legal Confidentiality, Creative Writing, Customer Support Dialogue, Toxicity, and Enterprise PII.
We measure model performance on metrics such as accuracy, engagingness, toxicity, relevance, and Enterprise PII.