Rethinking LLM Evaluation with 3C3H: AraGen Benchmark and Leaderboard
In the rapidly evolving landscape of large language models (LLMs), comprehensive and robust evaluation remains a critical challenge, particularly for low-resource languages. In this blog, we introduce AraGen, a generative-task benchmark and leaderboard for Arabic LLMs built on 3C3H, a new evaluation measure for natural language generation (NLG) that we hope will inspire similar work for other languages as well.

The AraGen leaderboard makes three key contributions:

- 3C3H Measure: The 3C3H measure scores a model's response and is central to this framework. […]
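As a rough illustration of how a multi-dimension response score like 3C3H might be aggregated, here is a minimal sketch. It assumes the six dimensions are Correctness, Completeness, Conciseness, Helpfulness, Honesty, and Harmlessness, that the first two are binary judgments, and that the remaining four are rated on a 1–5 scale and normalized before averaging; the function name and scoring details are hypothetical, not the benchmark's actual implementation.

```python
def score_3c3h(
    correctness: int,    # binary: 0 or 1 (assumed)
    completeness: int,   # binary: 0 or 1 (assumed)
    conciseness: int,    # 1-5 scale (assumed)
    helpfulness: int,    # 1-5 scale (assumed)
    honesty: int,        # 1-5 scale (assumed)
    harmlessness: int,   # 1-5 scale (assumed)
) -> float:
    """Aggregate six judged dimensions into a single score in [0, 1].

    Hypothetical sketch: 1-5 ratings are rescaled to [0, 1],
    then all six dimensions are averaged with equal weight.
    """
    scaled = [(x - 1) / 4 for x in (conciseness, helpfulness, honesty, harmlessness)]
    return (correctness + completeness + sum(scaled)) / 6


# A response judged perfect on every dimension scores 1.0.
print(score_3c3h(1, 1, 5, 5, 5, 5))  # → 1.0
```

Equal weighting is only one possible design choice; a real judge-based pipeline could weight dimensions differently or report them separately alongside the aggregate.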