The NLP Cypher | 10.03.21
RAFT is a few-shot classification benchmark that tests language models:
- across multiple domains (literature reviews, medical data, tweets, customer interactions, etc.)
- on economically valuable classification tasks (someone inherently cares about the task)
- with evaluation that mirrors deployment (50 labeled examples per task, information retrieval allowed, hidden test set)
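To make the "50 labeled examples per task" setup concrete, here is a minimal sketch of how a few-shot classification prompt could be assembled from a small labeled set. This is not RAFT's official code: the function name `build_few_shot_prompt`, the `Text:`/`Label:` prompt layout, and the sample data are all assumptions made for illustration.

```python
from typing import List, Tuple

# Per-task label budget taken from the benchmark description above.
MAX_TRAIN_EXAMPLES = 50


def build_few_shot_prompt(
    instructions: str,
    labeled: List[Tuple[str, str]],
    query: str,
    max_examples: int = MAX_TRAIN_EXAMPLES,
) -> str:
    """Assemble a plain-text few-shot prompt (hypothetical layout).

    `labeled` holds (text, label) pairs; `query` is the unlabeled text
    the model should classify.
    """
    if len(labeled) > max_examples:
        raise ValueError(
            f"got {len(labeled)} labeled examples, budget is {max_examples}"
        )
    parts = [instructions.strip(), ""]
    for text, label in labeled:
        parts.append(f"Text: {text}\nLabel: {label}\n")
    # Leave the final label blank so the model completes it.
    parts.append(f"Text: {query}\nLabel:")
    return "\n".join(parts)


if __name__ == "__main__":
    examples = [
        ("The app crashes every time I log in.", "complaint"),
        ("Thanks, the refund arrived today!", "not complaint"),
    ]
    print(build_few_shot_prompt(
        "Decide whether each customer message is a complaint.",
        examples,
        "My order still has not shipped.",
    ))
```

The resulting string would be sent to whichever language model is being evaluated; predictions on the hidden test set are scored by the benchmark, not locally.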