🚀 Accelerating LLM Inference with TGI on Intel Gaudi

We’re excited to announce the native integration of Intel Gaudi hardware support directly into Text Generation Inference (TGI), our production-ready serving solution for Large Language Models (LLMs). This integration brings the power of Intel’s specialized AI accelerators to our high-performance inference stack, enabling more deployment options for the open-source AI community 🎉

✨ What’s New? We’ve fully integrated Gaudi support into TGI’s main codebase in PR #3091. Previously, we maintained a separate fork for Gaudi devices […]

Read more

How Hugging Face Scaled Secrets Management for AI Infrastructure

Hugging Face has become synonymous with advancing AI at scale. With over 4 million builders deploying models on the Hub, the rapid growth of the platform necessitated a rethinking of how sensitive configuration data (secrets) is managed. Last year, the engineering teams set out to improve the handling of their secrets and credentials. After evaluating tools like HashiCorp Vault, they ultimately chose […]

Read more

The NLP Course is becoming the LLM Course!

Education has always been at the heart of Hugging Face’s mission to democratize AI and we’re doubling down on that by giving hf.co/learn a big upgrade! Our NLP course has been a go-to resource for the open-source AI community for the past 3 years, and it’s now time for a refresh. We’re updating and expanding it to keep up with all the exciting stuff happening in AI (which is not easy when there are breakthroughs every week!) We felt the […]

Read more

Journey to 1 Million Gradio Users!

5 years ago, we launched Gradio as a simple Python library to let researchers at Stanford easily demo computer vision models with a web interface. Today, Gradio is used by more than 1 million developers each month to build and share AI web apps. This includes some of the most popular open-source projects of all time, like Automatic1111 […]

Read more

Welcome Llama 4 Maverick & Scout on Hugging Face

We are incredibly excited to welcome the next generation of large language models from Meta to the Hugging Face Hub: Llama 4 Maverick (~400B) and Llama 4 Scout (~109B)! 🤗 Both are Mixture of Experts (MoE) models with 17B active parameters. Released today, these powerful, natively multimodal models represent a significant leap forward. We’ve worked closely with Meta to ensure seamless integration into the Hugging Face ecosystem, including both transformers and TGI from day one. This is just the start […]

Read more

Arabic Leaderboards: Introducing Arabic Instruction Following, Updating AraGen, and More

At Inception, we have been working to enhance AI model evaluations within the Arabic language context. Previously, we introduced AraGen, one of the first generative Arabic leaderboards, serving as a benchmark for evaluating Arabic LLMs on generative tasks. As part of our ongoing efforts, we are excited to share the following updates: Arabic-Leaderboards Space, launched in collaboration with Mohammed bin Zayed University of Artificial Intelligence (MBZUAI) to consolidate Arabic AI evaluations in one place. This platform currently supports AraGen-03-25 and […]

Read more