AI Policy @🤗: Response to the White House AI Action Plan RFI

On March 14, we submitted Hugging Face’s response to the White House Office of Science and Technology Policy’s request for information on the White House AI Action Plan. We took this opportunity to (re-)assert the fundamental role that open AI systems and open science play in making the technology more performant and efficient, broadly and reliably adopted, and compliant with the highest standards of security. This blog post provides a summary of our response; the full text is available […]

Read more

Open R1: How to use OlympicCoder locally for coding

Everyone’s been using Claude and OpenAI as coding assistants for the last few years, but those assistants hold less appeal once you look at the developments coming out of open-source projects like Open R1. If we look at the evaluation on LiveCodeBench below, we can see that the 7B-parameter variant outperforms Claude 3.7 Sonnet and GPT-4o, the daily drivers of many engineers in applications like Cursor and VS Code. Evals are great and all, but I want to […]
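As a minimal sketch of one way to serve a model like OlympicCoder locally, here is a llama.cpp `llama-server` invocation. The GGUF repo and file names below are assumptions for illustration, not the post’s actual instructions; check the Hub for real GGUF conversions of OlympicCoder before running this.

```shell
# Hedged sketch: serve a quantized OlympicCoder build locally with llama.cpp.
# --hf-repo / --hf-file pull the GGUF directly from the Hub on first launch.
# Repo and file names are hypothetical placeholders.
llama-server \
  --hf-repo someuser/OlympicCoder-7B-GGUF \
  --hf-file OlympicCoder-7B-Q4_K_M.gguf \
  --port 8080
# The server exposes an OpenAI-compatible API, so an editor or client can be
# pointed at http://localhost:8080/v1 as its coding-assistant backend.
```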

Read more

Analytics is important

Analytics and metrics are the cornerstone of understanding what’s happening with your deployment. Are your Inference Endpoints overloaded? How many requests are they handling? Having well-visualized, relevant metrics displayed in real-time is crucial for monitoring and debugging. We realized that our analytics dashboard needed a refresh. Since we debug a lot of endpoints ourselves, we’ve felt the same pain as our users. That’s why we sat down to plan and make several improvements to provide a better experience for you. […]

Read more

Training and Finetuning Reranker Models with Sentence Transformers v4

Sentence Transformers is a Python library for using and training embedding and reranker models for a wide range of applications, such as retrieval-augmented generation, semantic search, semantic textual similarity, paraphrase mining, and more. Its v4.0 update introduces a new training approach for rerankers (also known as cross-encoder models), similar to what the v3.0 update introduced for embedding models. In […]

Read more

Open R1: Update #4

Welcome, DeepSeek-V3 0324! This week, a new model from DeepSeek quietly landed on the Hub. It’s an updated version of DeepSeek-V3, the base model underlying the R1 reasoning model. There isn’t much information about this new model yet, but we do know a few things! What we know so far […]

Read more

🚀 Accelerating LLM Inference with TGI on Intel Gaudi

We’re excited to announce the native integration of Intel Gaudi hardware support directly into Text Generation Inference (TGI), our production-ready serving solution for Large Language Models (LLMs). This integration brings the power of Intel’s specialized AI accelerators to our high-performance inference stack, enabling more deployment options for the open-source AI community 🎉 ✨ What’s New? We’ve fully integrated Gaudi support into TGI’s main codebase in PR #3091. Previously, we maintained a separate fork for Gaudi devices […]
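As a rough sketch of what launching TGI on a Gaudi machine looks like, here is a Docker invocation. The image tag, Habana runtime flags, and model id below are assumptions for illustration; consult the TGI documentation for the exact Gaudi image name and required runtime options.

```shell
# Hedged sketch: run TGI on Intel Gaudi via Docker (flags are assumptions).
docker run -p 8080:80 \
  --runtime=habana \
  -e HABANA_VISIBLE_DEVICES=all \
  -v "$PWD/data:/data" \
  ghcr.io/huggingface/text-generation-inference:latest-gaudi \
  --model-id meta-llama/Llama-3.1-8B-Instruct

# Once the server is up, generation requests go to the standard TGI endpoint:
curl localhost:8080/generate \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{"inputs": "What is Deep Learning?", "parameters": {"max_new_tokens": 32}}'
```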

Read more

How Hugging Face Scaled Secrets Management for AI Infrastructure

Hugging Face has become synonymous with advancing AI at scale. With over 4 million builders deploying models on the Hub, the platform’s rapid growth necessitated a rethinking of how sensitive configuration data (secrets) is managed. Last year, the engineering teams set out to improve the handling of their secrets and credentials. After evaluating tools like HashiCorp Vault, they ultimately chose […]

Read more

The NLP Course is becoming the LLM Course!

Education has always been at the heart of Hugging Face’s mission to democratize AI, and we’re doubling down on that by giving hf.co/learn a big upgrade! Our NLP course has been a go-to resource for the open-source AI community for the past 3 years, and it’s now time for a refresh. We’re updating and expanding it to keep up with all the exciting stuff happening in AI (which is not easy when there are breakthroughs every week!). We felt the […]

Read more