Red-Teaming Large Language Models

Warning: This article is about red-teaming and as such contains examples of model generation that may be offensive or upsetting. Large language models (LLMs) trained on an enormous amount of text data are very good at generating realistic text. However, these models often exhibit undesirable behaviors like revealing personal information (such as social security numbers) and generating misinformation, bias, hatefulness, or toxic content. For example, earlier versions of GPT3 were known to exhibit sexist behaviors (see below) and biases against […]

Read more

How Hugging Face Accelerated Development of Witty Works Writing Assistant

If you’re interested in building ML solutions faster, visit the Expert Acceleration Program landing page and contact us here! Business Context As IT continues to evolve and reshape our world, creating a more diverse and inclusive environment within the industry is imperative. Witty Works was built in 2018 to address this challenge. Starting as a consulting company advising organizations on becoming more diverse, Witty Works first helped them write job ads using inclusive language.    

Read more

Ethical guidelines for developing the Diffusers library

We are on a journey to make our libraries more responsible, one commit at a time! As part of the Diffusers library documentation, we are proud to announce the publication of an ethical framework. Given diffusion models’ real case applications in the world and potential negative impacts on society, this initiative aims to guide the technical decisions of the    

Read more

Ultra fast ControlNet with 🧨 Diffusers

Ever since Stable Diffusion took the world by storm, people have been looking for ways to have more control over the results of the generation process. ControlNet provides a minimal interface allowing users to customize the generation process up to a great extent. With ControlNet, users can easily condition the generation with different spatial contexts such as a depth map, a segmentation map, a scribble, keypoints, and so on! We can turn a cartoon drawing into a realistic photo with […]

Read more

Kakao Brain’s Open Source ViT, ALIGN, and the New COYO Text-Image Dataset

Kakao Brain and Hugging Face are excited to release a new open-source image-text dataset COYO of 700 million pairs and two new visual language models trained on it, ViT and ALIGN. This is the first time ever the ALIGN model is made public for free and open-source use and the first release of ViT and ALIGN models that come with the train dataset. Kakao Brain’s ViT and ALIGN models follow the same architecture and hyperparameters as provided in the original […]

Read more

Fine-tuning 20B LLMs with RLHF on a 24GB consumer GPU

We are excited to officially release the integration of trl with peft to make Large Language Model (LLM) fine-tuning with Reinforcement Learning more accessible to anyone! In this post, we explain why this is a competitive alternative to existing fine-tuning approaches. Note peft is a general tool that can be applied to many ML use-cases but it’s particularly interesting for RLHF as this method is especially memory-hungry! If you want to directly deep dive into the code, check out the […]

Read more

Multivariate Probabilistic Time Series Forecasting with Informer

A few months ago we introduced the Time Series Transformer, which is the vanilla Transformer (Vaswani et al., 2017) applied to forecasting, and showed an example for the univariate probabilistic forecasting task (i.e. predicting each time series’ 1-d distribution individually). In this post we introduce the Informer model (Zhou, Haoyi, et al., 2021), AAAI21 best paper which is now available in 🤗 Transformers. We will show how to use the Informer model for the multivariate probabilistic forecasting task, i.e., predicting […]

Read more
1 17 18 19 20 21 70