March 13, 2026 huggingface

The Future of the Global Open-Source AI Ecosystem: From DeepSeek to AI+

This is the third and final blog in a three-part series on China’s open source community’s historical advancements since January 2025’s “DeepSeek Moment.” The first blog on strategic changes and open artifact growth is available here,

March 13, 2026 huggingface

H Company’s new Holo2 model takes the lead in UI Localization

Two months after releasing our first batch of Holo2 models, H Company is back with our largest UI localization model yet: Holo2-235B-A22B Preview. This model sets a new state-of-the-art (SOTA) record of 78.5% on ScreenSpot-Pro and 79.0% on OSWorld-G. Available on Hugging Face, Holo2-235B-A22B Preview is a research release focused on UI element localization. Agentic Localization: High-resolution 4K interfaces are challenging for localization models. Small UI elements can be difficult to pinpoint on a large display. With agentic localization, […]

March 13, 2026 huggingface

Community Evals: Because we’re done trusting black-box leaderboards over the community

TL;DR: Benchmark datasets on Hugging Face can now host leaderboards. Models store their own eval scores. Everything links together. The community can submit results via PR. Verified badges prove that the results can be reproduced. Evaluation is broken: Let’s be real about where we are with evals in 2026. MMLU is saturated above 91%. GSM8K hit 94%+. HumanEval is conquered. Yet some models that ace benchmarks still can’t reliably browse the web, write

March 13, 2026 huggingface

Introducing SyGra Studio

SyGra 2.0.0 introduces Studio, an interactive environment that turns synthetic data generation into a transparent, visual craft. Instead of juggling YAML files and terminals, you compose flows directly on the canvas, preview datasets before committing, tune prompts with inline variable hints, and watch executions stream live—all from a single pane. Under the hood it’s the same platform, so everything you do visually generates the corresponding SyGra compatible graph config and task executor scripts. What Studio lets you do

March 13, 2026 huggingface

Transformers.js v4 Preview: Now Available on NPM!

We’re excited to announce that Transformers.js v4 (preview) is now available on NPM! After nearly a year of development (we started in March 2025 🤯), we’re finally ready for you to test it out. Previously, users had to

March 13, 2026 huggingface

Custom Kernels for All from Codex and Claude

tl;dr: We built an agent skill that teaches coding agents how to write production CUDA kernels. Then we pointed Claude and Codex at two real targets: a diffusers pipeline and a transformers model. The agents produced working kernels for both, with correct PyTorch bindings and benchmarks, end to end. Writing CUDA kernels is hard. Writing CUDA kernels that correctly integrate with transformers and diffusers is harder. There are architecture-specific memory access patterns, vectorization strategies, warp shuffle reductions, and a dozen […]

March 13, 2026 huggingface

One-Shot Any Web App with Gradio’s gr.HTML

Gradio 6 quietly shipped a very powerful feature: gr.HTML now supports custom templates, scoped CSS, and JavaScript interactivity. Which means you can build pretty much any web component — and Claude (or any other frontier LLM) can generate the whole thing in one shot: frontend, backend, and state management, all in a single Python file. We tested this by building different types of apps. Each one is a single Python file, no build step, deployable to Hugging Face Spaces in […]

March 13, 2026 huggingface

IBM and UC Berkeley Diagnose Why Enterprise Agents Fail Using IT-Bench and MAST

Authors: Ayhan Sebin, Saurabh Jha, Rohan Arora, Daby Sow, Mert Cemri, Melissa Pan, Ion Stoica. Resources: ITBench HF Space, ITBench HF Dataset, MAST HF Dataset, ITBench GitHub, MAST GitHub. IBM Research and UC Berkeley collaborated to study how agentic LLM systems break in real-world IT automation, for tasks involving incident triage, logs/metrics queries, and Kubernetes actions in long-horizon tool loops. Benchmarks typically reduce performance to a single number, telling you whether an agent failed but never why. To solve this black-box […]

March 13, 2026 huggingface

Train AI models with Unsloth and Hugging Face Jobs for FREE

This blog post covers how to use Unsloth and Hugging Face Jobs for fast LLM fine-tuning (specifically LiquidAI/LFM2.5-1.2B-Instruct) through coding agents like Claude Code and Codex. Unsloth provides ~2x faster training and ~60% less VRAM usage compared to standard methods, so training small models can cost just a few dollars. Why a small model? Small language models like LFM2.5-1.2B-Instruct are ideal candidates for fine-tuning. They are cheap to train, fast to iterate on, and increasingly competitive with much larger […]

March 13, 2026 huggingface

GGML and llama.cpp join HF to ensure the long-term progress of Local AI

We are super happy to announce that GGML, creators of llama.cpp, are joining HF in order to keep future AI open. 🔥 Georgi Gerganov and team are joining HF with the goal of scaling and supporting the community behind ggml and llama.cpp as Local AI continues to make exponential progress in the coming years. We’ve been working with Georgi and team for quite some time (we even have awesome core contributors to llama.cpp like Son and Alek in the team […]
