Large Language Models: A New Moore’s Law?

A few days ago, Microsoft and NVIDIA introduced Megatron-Turing NLG 530B, a Transformer-based model hailed as “the world’s largest and most powerful generative language model.” This is an impressive show of Machine Learning engineering, no doubt about it. Yet, should we be excited about this mega-model trend? I, for one, am not. Here’s why.

Read more

Course Launch Community Event

We are excited to share that, after a lot of work from the Hugging Face team, part 2 of the Hugging Face Course will be released on November 15th! Part 1 focused on teaching you how to use a pretrained model, fine-tune it on a text classification task, then upload the result to the Model Hub. Part 2 […]

Read more

Introducing the πŸ€— Data Measurements Tool: an Interactive Tool for Looking at Datasets

tl;dr: We made a tool you can use online to build, measure, and compare datasets. Click to access the πŸ€— Data Measurements Tool here. As developers of a fast-growing unified repository for Machine Learning datasets (Lhoest et al. 2021), the πŸ€— Hugging Face team has been working on supporting good practices for dataset documentation (McMillan-Major et al., 2021). While static (if evolving) documentation represents a necessary first step in this direction, getting a good sense of what is actually in […]

Read more

Training CodeParrot 🦜 from Scratch

In this blog post we’ll take a look at what it takes to build the technology behind GitHub Copilot, an application that provides suggestions to programmers as they code. In this step-by-step guide, we’ll learn how to train a large GPT-2 model called CodeParrot 🦜, entirely from scratch. CodeParrot can auto-complete your Python code – give […]

Read more