Make your ZeroGPU Spaces go brrr with ahead-of-time compilation

ZeroGPU lets anyone spin up powerful Nvidia H200 hardware in Hugging Face Spaces without keeping a GPU locked for idle traffic. It’s efficient, flexible, and ideal for demos but it doesn’t always make full use of everything the GPU and CUDA stack can offer. Generating images or videos can take a significant amount of time. Being able to squeeze out more performance, taking advantage of the H200 hardware, does matter in this case. This is where PyTorch ahead-of-time (AoT) compilation […]

Read more

SAIR: Accelerating Pharma R&D with AI-Powered Structural Intelligence

This summer, SandboxAQ released the Structurally Augmented IC50 Repository (SAIR), the largest dataset of co-folded 3D protein-ligand structures paired with experimentally measured IC₅₀ labels, directly linking molecular structure to drug potency and overcoming a longstanding scarcity in training data. This dataset is now available on Hugging Face, and for the first time, researchers have open access to more than 5 million AI‑generated, high‑accuracy protein-ligand 3D structures, each paired with validated empirical binding potency data. SAIR is an open-sourced dataset and […]

Read more

Welcome EmbeddingGemma, Google’s new efficient embedding model

Today, Google releases EmbeddingGemma, a state-of-the-art multilingual embedding model perfect for on-device use cases. Designed for speed and efficiency, the model features a compact size of 308M parameters and a 2K context window, unlocking new possibilities for mobile RAG pipelines, agents, and more. EmbeddingGemma is trained to support over 100 languages and is the highest-ranking text-only multilingual embedding model under 500M on the Massive Text Embedding Benchmark (MTEB) at the time of writing. Table of Contents

Read more

mmBERT: ModernBERT goes Multilingual

This blog post introduces mmBERT, a state-of-the-art massively multilingual encoder model trained on 3T+ tokens of text in over 1800 languages. It shows significant performance and speed improvements over previous multilingual models, being the first to improve upon XLM-R, while also developing new strategies for effectively learning low-resource languages. mmBERT builds upon ModernBERT for a blazingly fast architecture, and adds novel components to enable efficient multilingual learning. If you are interested in trying out the models yourself, some example boilerplate […]

Read more

Jupyter Agents: training LLMs to reason with notebooks

The past year has been all about giving LLMs more tools and autonomy to solve more complex and open ended tasks. The goal of the Jupyter Agent is to give the model the ultimate tool: code execution. A natural way to display multi-step code execution together with reasoning is within a Jupyter Notebook, which consists of code and markdown cells. So we built Jupyter Agent to act as an agent that can execute code directly inside a Jupyter notebook and […]

Read more

Fine-tune Any LLM from the Hugging Face Hub with Together AI

The pace of AI development today is breathtaking. Every single day, hundreds of new models appear on the Hugging Face Hub, some are specialized variants of popular base models like Llama or Qwen, others feature novel architectures or have been trained from scratch for specific domains. Whether it’s a medical AI trained on clinical data, a coding assistant optimized for a particular programming language, or a multilingual model fine-tuned for specific cultural contexts, the Hugging Face Hub has become the […]

Read more

Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers

OpenAI recently released their GPT-OSS series of models. The models feature some novel techniques like MXFP4 quantization, efficient kernels, a brand new chat format, and more. To enable the release of gpt-oss through transformers, we have upgraded the library considerably. The updates make it very efficient to load, run, and fine-tune the models. In this blog post, we talk about all the upgrades in-depth, and how they become part of the transformers toolkit so other models (current and future) can […]

Read more

Visible Watermarking with Gradio

Last year, we shared a blogpost on watermarking, explaining what it means to watermark generative AI content, and why it’s important. The need for watermarking has become even more critical as people all over the world have begun to generate and share AI-generated images, video, audio, and text. Images and video have become so realistic    

Read more

LeRobotDataset:v3.0: Bringing large-scale datasets to lerobot

TL;DR Today we release LeRobotDataset:v3! In our previous LeRobotDataset:v2 release, we stored one episode per file, hitting file-system limitations when scaling datasets to millions of episodes. LeRobotDataset:v3 packs multiple episodes in a single file, using relational metadata to retrieve information at the individual episode level from multi-episode files. The new format also natively supports accessing datasets in streaming mode, allowing to process large datasets on the fly.We provide a one-liner util to convert all datasets in the LeRobotDataset format to […]

Read more
1 60 61 62 63 64 70