Introducing RTEB: A New Standard for Retrieval Evaluation

TL;DR – We’re excited to introduce the beta version of the Retrieval Embedding Benchmark (RTEB), a new benchmark designed to reliably evaluate the retrieval accuracy of embedding models for real-world applications. Existing benchmarks struggle to measure true generalization; RTEB addresses this with a hybrid strategy of open and private datasets. Its goal is simple: to create a fair, transparent, and application-focused standard for measuring how models perform on data they haven’t seen before. The performance of many AI applications, […]

Read more

BigCodeArena: Judging code generations end to end with code executions

Evaluating the quality of AI-generated code is notoriously difficult. While humans can easily spot whether a piece of code “looks right,” determining if it actually works correctly, handles edge cases properly, and produces the intended result requires running and testing it. This is why today, we’re thrilled to announce BigCodeArena — the first human-in-the-loop platform for evaluating code generation models […]

Read more

Nemotron-Personas-India: Synthesized Data for Sovereign AI

A compound AI approach to Indian personas grounded in real-world distributions

Open Data for India’s AI Future

India represents one of the world’s largest AI opportunities — with over 700 million internet users, a multitude of languages, and a rapidly growing developer ecosystem. Yet, most open datasets reflect Western norms and English-only contexts, creating a data gap that limits AI adoption in India’s multilingual, multi-script environment. Today, we’re releasing Nemotron-Personas-India, the first […]

Read more

Get your VLM running in 3 simple steps on Intel CPUs

With the growing capability of large language models (LLMs), a new class of models has emerged: Vision Language Models (VLMs). These models can analyze images and videos to describe scenes, create captions, and answer questions about visual content. While running AI models on your own device can be difficult, as these models are often computationally demanding, it also offers significant benefits, including improved privacy, since your data stays on your machine, and enhanced speed and reliability, because you’re not dependent […]

Read more

Google Cloud C4 Brings a 70% TCO improvement on GPT OSS with Intel and Hugging Face

Intel and Hugging Face collaborated to demonstrate the real-world value of upgrading to Google’s latest C4 Virtual Machine (VM) running on Intel® Xeon® 6 processors (codenamed Granite Rapids, or GNR). We specifically wanted to benchmark improvements in the text generation performance of the OpenAI GPT OSS Large Language Model (LLM). The results are in, and they are impressive, demonstrating a 1.7x improvement in Total Cost of Ownership (TCO) over the previous-generation Google C3 VM instances. The Google Cloud C4 VM instance further resulted in: […]

Read more

AI for Food Allergies

Let’s get straight to the point: worldwide, an estimated 220 million people suffer from at least one food allergy, and in the United States alone, this accounts for roughly 10% of the population. This means that if you don’t have an allergy, you’ll likely know someone who does — and it’s not a pleasant situation to be in. This condition affects not only patients’ physical health but also takes a significant toll on their mental well-being and overall quality of […]

Read more

Supercharge your OCR Pipelines with Open Models

Update: we have added Chandra and OlmOCR-2 to this post, as well as the models’ OlmOCR scores 🫡

TL;DR: The rise of powerful vision-language models has transformed document AI. Each model comes with unique strengths, making it tricky to choose the right one. Open-weight models offer better cost efficiency and privacy. To help you get started with them, we’ve put together this guide. In this guide, you’ll learn: the landscape of current models and their capabilities, when to fine-tune models […]

Read more

Sentence Transformers is joining Hugging Face!

Today, we are announcing that Sentence Transformers is transitioning from Iryna Gurevych’s Ubiquitous Knowledge Processing (UKP) Lab at TU Darmstadt to Hugging Face. Hugging Face’s Tom Aarsen has been maintaining the library since late 2023 and will continue to lead the project. At its new home, Sentence Transformers will benefit from Hugging Face’s robust infrastructure, including continuous […]

Read more

Building the Open Agent Ecosystem Together: Introducing OpenEnv

With tools like TRL, TorchForge, and verl, the open-source community has shown how to scale AI across complex compute infrastructure. But compute is only one side of the coin. The other side is the developer community: the people and tools that make agentic systems possible. That’s why Meta and Hugging Face are partnering to launch the OpenEnv Hub: a shared and open community hub for agentic environments. Agentic environments define everything an agent needs to perform a task: the tools, […]

Read more