Accelerating Hugging Face Transformers with AWS Inferentia2
In the last five years, Transformer models [1] have become the de facto standard for many machine learning (ML) tasks, such as natural language processing (NLP), computer vision (CV), speech, and more. Today, many data scientists and