Fixing Gradient Accumulation

Our friends at Unsloth shared an issue yesterday regarding gradient accumulation that is affecting the transformers Trainer. The initial report comes from @bnjmn_marie (kudos to him!). Gradient accumulation is supposed to be mathematically equivalent to full-batch training; however, losses did not match between training runs where the setting was toggled on and off. Where does the discrepancy stem from? Inside the modeling code of each model, transformers offers a “default” loss function that’s the most typically […]
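The mismatch described above can be reproduced with toy numbers. This is a minimal sketch (not the actual transformers code): when micro-batches contain different numbers of tokens, averaging per-micro-batch mean losses is not the same as taking one mean over the full batch; summing token losses and dividing once by the total token count restores the equivalence.

```python
# Toy illustration of why gradient accumulation with a per-micro-batch
# *mean* loss diverges from full-batch training when micro-batches hold
# different numbers of tokens. Numbers are made up for the example.

def mean_loss(token_losses):
    return sum(token_losses) / len(token_losses)

# Two micro-batches with unequal token counts.
micro_a = [2.0, 4.0]            # 2 tokens
micro_b = [1.0, 1.0, 1.0, 1.0]  # 4 tokens

# Naive accumulation: average the per-micro-batch means.
accumulated = (mean_loss(micro_a) + mean_loss(micro_b)) / 2

# Full batch: one mean over all tokens together.
full_batch = mean_loss(micro_a + micro_b)

# The fix: sum token losses per micro-batch, then divide once by the
# total token count across the whole accumulated batch.
fixed = (sum(micro_a) + sum(micro_b)) / (len(micro_a) + len(micro_b))

print(accumulated, full_batch, fixed)  # 2.0 vs ~1.667 vs ~1.667
```

With equal-sized micro-batches the two quantities coincide, which is why the bug only surfaces with variable-length (e.g. padding-free) batches.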

Read more

Llama 3.2 in Keras

This is going to be the shortest blog post ever. Question: Llama 3.2 landed two weeks ago on Hugging Face / Transformers. When will it be available in Keras? Answer: It has been working from day 1 😀. There is nothing to wait for. Yes, Keras Llama3 can be loaded from any standard (i.e. safetensors) Hugging […]

Read more

Deploying Speech-to-Speech on Hugging Face

Speech-to-Speech (S2S) is an exciting new project from Hugging Face that combines several advanced models to create a seamless, almost magical experience: you speak, and the system responds with a synthesized voice. The project implements a cascaded pipeline leveraging models available through the Transformers library on the Hugging Face Hub. The pipeline consists of the following components: Voice Activity Detection (VAD), Speech to Text (STT), a Language Model (LM), and Text to Speech (TTS). What’s more, S2S has multi-language support! It currently […]
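The four-stage cascade can be sketched as plain function composition. In this minimal sketch every stage is a stub standing in for the real model, and the function names are illustrative, not the project’s actual API:

```python
# Toy cascaded speech-to-speech pipeline: VAD -> STT -> LM -> TTS.
# Each stage is a stub; audio is modeled as a list of float samples.

def detect_speech(audio: list) -> list:
    # VAD stub: keep only samples above a toy energy threshold.
    return [s for s in audio if abs(s) > 0.1]

def speech_to_text(speech: list) -> str:
    # STT stub: pretend we transcribed the detected speech.
    return "hello" if speech else ""

def language_model(text: str) -> str:
    # LM stub: generate a reply to the transcript.
    return f"you said: {text}"

def text_to_speech(text: str) -> list:
    # TTS stub: synthesize one "sample" per character.
    return [0.5] * len(text)

def speech_to_speech(audio: list) -> list:
    # The cascade wires the four stages together in order.
    return text_to_speech(language_model(speech_to_text(detect_speech(audio))))

reply = speech_to_speech([0.0, 0.4, -0.3, 0.02])
```

Because each stage only consumes the previous stage’s output, any component can be swapped independently, which is what makes a cascaded design flexible across languages and model choices.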

Read more

🧨 Diffusers welcomes Stable Diffusion 3.5 Large

Stable Diffusion 3.5 is the improved variant of its predecessor, Stable Diffusion 3. As of today, the models are available on the Hugging Face Hub and can be used with 🧨 Diffusers. The release comes with two checkpoints: a large (8B) model, and a large (8B) timestep-distilled model enabling few-step inference. In this post, we will focus on how to use Stable Diffusion 3.5 (SD3.5) with Diffusers, covering both inference and training. […]

Read more

CinePile 2.0 – making stronger datasets with adversarial refinement

In this blog post we share the journey of releasing CinePile 2.0, a significantly improved version of our long video QA dataset. The improvements in the new dataset rely on a new approach that we coined adversarial dataset refinement. We’re excited to share both CinePile 2.0 and our implementation of the adversarial refinement method, which we believe can strengthen many existing datasets and become a direct part of future dataset creation pipelines. If you are mainly interested in the adversarial refinement method, you […]

Read more

Introducing SynthID Text

Do you find it difficult to tell if text was written by a human or generated by AI? Being able to identify AI-generated content is essential to promoting trust in information, and helping to address problems such as misattribution and misinformation. Today, Google DeepMind and Hugging Face are excited to launch SynthID Text in Transformers v4.46.0, releasing later today. This technology allows you to apply watermarks to AI-generated text using a logits processor for generation tasks, and detect those watermarks […]
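To make the “logits processor” part concrete, here is a toy sketch of the general hook that watermarking plugs into: a function that adjusts next-token scores before sampling, seeded on context so a detector can recompute the adjustment later. This biases a pseudo-random “green list” of tokens in the style of earlier watermarking work; it is NOT SynthID’s actual algorithm, and all names here are hypothetical.

```python
# Toy logits processor: deterministically bias half the vocabulary
# ("green list") based on the previous token, so a detector that knows
# the seed can recompute which tokens were favored.
import random

VOCAB_SIZE = 8
BIAS = 2.0

def green_list(prev_token: int, seed: int = 42) -> set:
    # Seed the vocabulary split on the previous token: detection can
    # replay this split without access to the model.
    rng = random.Random(seed * 1000003 + prev_token)
    return set(rng.sample(range(VOCAB_SIZE), VOCAB_SIZE // 2))

def process_logits(prev_token: int, logits: list) -> list:
    # The hook itself: take raw scores, return adjusted scores.
    greens = green_list(prev_token)
    return [x + BIAS if i in greens else x for i, x in enumerate(logits)]

logits = [0.0] * VOCAB_SIZE
biased = process_logits(prev_token=3, logits=logits)
```

Sampling from the biased scores skews generated text toward green-list tokens; a detector then checks whether a suspect text over-uses those tokens relative to chance.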

Read more

A Deepdive into Aya Expanse: Advancing the Frontier of Multilinguality

This is a guest blog post by the Cohere For AI team. Cohere For AI is Cohere’s research lab that seeks to solve complex machine learning problems. With the release of the Aya Expanse family, featuring 8B and 32B parameter models, we are addressing one of the most urgent challenges in AI: the lack of highly performant multilingual models that can rival the capabilities of monolingual ones. While AI has made tremendous progress, there remains a stark gap in the […]

Read more

Expert Support case study: Bolstering a RAG app with LLM-as-a-Judge

This is a guest blog post authored by Digital Green. Digital Green is participating in a CGIAR-led collaboration to bring agricultural support to smallholder farmers. There are an estimated 500 million smallholder farmers globally, and they play a critical role in global food security. Timely access to accurate information is essential for these farmers to make informed decisions and improve their yields. An “agricultural extension service” offers technical advice on agriculture to farmers, and also supplies them with the necessary inputs […]

Read more

Universal Assisted Generation: Faster Decoding with Any Assistant Model

TL;DR: Many LLMs such as gemma-2-9b and Mixtral-8x22B-Instruct-v0.1 lack a much smaller version to use for assisted generation. In this blog post, we present Universal Assisted Generation: a method developed by Intel Labs and Hugging Face which extends assisted generation to work with a small language model from any model family 🤯. As a result, it is now possible to accelerate inference from any decoder or Mixture of Experts model by 1.5x-2.0x with almost zero overhead 🔥🔥🔥. Let’s dive in! […]
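The draft-and-verify loop behind assisted generation can be sketched with stub models. This is a simplified greedy-verification sketch, not the actual Transformers implementation: a cheap “draft” model proposes a few tokens, the expensive “target” model checks them, and the longest agreeing prefix is kept plus one guaranteed target token (the universal variant additionally re-tokenizes between mismatched vocabularies, which is omitted here).

```python
# Toy assisted generation with greedy verification. Both "models" are
# deterministic stubs over integer token sequences.

def target_next(prefix: tuple) -> int:
    # Stub for the big target model: next token = sum of prefix mod 10.
    return sum(prefix) % 10

def draft_next(prefix: tuple) -> int:
    # Stub for the small draft model: agrees with the target except
    # when the prefix length is a multiple of 3.
    base = sum(prefix) % 10
    return base if len(prefix) % 3 else (base + 1) % 10

def assisted_step(prefix: tuple, k: int = 4) -> tuple:
    # 1) Draft proposes k tokens autoregressively (cheap).
    draft = list(prefix)
    for _ in range(k):
        draft.append(draft_next(tuple(draft)))
    proposals = draft[len(prefix):]
    # 2) Target verifies: accept the longest prefix where it agrees.
    accepted = []
    for tok in proposals:
        if target_next(tuple(prefix) + tuple(accepted)) == tok:
            accepted.append(tok)
        else:
            break
    # 3) Always gain at least one token from the target itself.
    accepted.append(target_next(tuple(prefix) + tuple(accepted)))
    return prefix + tuple(accepted)

seq = assisted_step((1, 2))
```

Under greedy decoding the output is identical to running the target model alone; the speedup comes from verifying several drafted tokens in a single target pass instead of one pass per token.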

Read more