Why we’re switching to Hugging Face Inference Endpoints, and maybe you should too

Hugging Face recently launched Inference Endpoints; which as they put it: solves transformers in production. Inference Endpoints is a managed service that allows you to:

Deploy (almost) any model on Hugging Face Hub
To any cloud (AWS, and Azure, GCP on the way)
On a range of instance types (including GPU)
We’re switching some of our Machine Learning (ML) models that

To finish reading, please visit source site