Holo3: Breaking the Computer Use Frontier

We are proud to unveil Holo3, the latest evolution of our vision for the Autonomous Enterprise. Scoring 78.85% on OSWorld-Verified, the leading desktop computer-use benchmark, Holo3 establishes a new industry state of the art. Holo3 is more than a benchmark leader; it is engineered for production. Built with our agentic flywheel, it has been trained to execute real-world workflows within synthetic enterprise environments. This not only ensures that Holo3 excels in […]

Read more

Welcome Gemma 4: Frontier multimodal intelligence on device

The Gemma 4 family of multimodal models by Google DeepMind is out on Hugging Face, with support for your favorite agents, inference engines, and fine-tuning libraries 🤗 These models are the real deal: truly open under Apache 2.0 licenses, high quality with Pareto-frontier arena scores, multimodal including audio, and available in sizes you can use everywhere, including on-device. Gemma 4 builds on advances from previous families and makes them click together. In our tests with pre-release checkpoints we have been impressed […]

Read more

A New Framework for Evaluating Voice Agents (EVA)

Conversational voice agents present a distinct evaluation challenge: they must simultaneously satisfy two objectives — accuracy (completing the user’s task correctly and faithfully) and conversational experience (doing so naturally, concisely, and in a way appropriate for spoken interaction). These objectives are deeply intertwined: mishearing a confirmation code renders perfect LLM reasoning meaningless, a wall of options overwhelms a caller who can’t skim spoken output, and delayed responses can pass every accuracy check while remaining unusable in practice. Existing frameworks treat […]
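The two intertwined objectives can be made concrete with a combined scorer. The sketch below is purely illustrative: the field names, weights, and thresholds are assumptions for demonstration, not EVA's actual metrics.

```python
# Hypothetical combined scorer for one voice-agent turn.
# Fields, weights, and thresholds are illustrative, not EVA's actual metrics.
from dataclasses import dataclass

@dataclass
class TurnResult:
    task_correct: bool    # did the agent complete the task faithfully?
    response_words: int   # spoken length of the reply
    latency_s: float      # time until the agent starts speaking

def combined_score(turn: TurnResult,
                   max_words: int = 60,
                   max_latency_s: float = 2.0) -> float:
    """Blend accuracy with conversational-experience penalties."""
    accuracy = 1.0 if turn.task_correct else 0.0
    # Penalize walls of options a caller cannot skim.
    brevity = min(1.0, max_words / max(turn.response_words, 1))
    # Penalize slow turns, which fail in practice even when correct.
    promptness = min(1.0, max_latency_s / max(turn.latency_s, 1e-9))
    return 0.6 * accuracy + 0.2 * brevity + 0.2 * promptness

print(combined_score(TurnResult(True, 40, 1.0)))   # concise, fast, correct
print(combined_score(TurnResult(False, 200, 5.0))) # verbose, slow, wrong
```

The point of blending rather than gating: a turn that passes every accuracy check can still score poorly if it is too long or too slow for spoken interaction.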

Read more

Liberate your OpenClaw 🦀

Anthropic is limiting access to Claude models in open agent platforms for Pro/Max subscribers. Don’t worry though: there are great open models on Hugging Face to keep your agents running, often at a fraction of the cost. If you’ve been cut off and your OpenClaw, Pi, or Open Code agents need resuscitation, you can move them to open models in two ways: use an open model served through Hugging Face Inference Providers, or run a fully local open […]
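The first path is usually a one-line change: most agent frameworks speak the OpenAI-compatible chat-completions protocol, so you point them at the Hugging Face router instead. A minimal, stdlib-only sketch of the request such a call sends — `HF_TOKEN` is your Hugging Face access token, and the model id is just one example of an open model:

```python
# Sketch: retarget an OpenAI-compatible agent at Hugging Face
# Inference Providers via the router's chat-completions endpoint.
# The model id is one example open model; any hosted model id works.
import json
import os

BASE_URL = "https://router.huggingface.co/v1/chat/completions"

def build_request(model: str, prompt: str) -> tuple[dict, bytes]:
    """Return the headers and JSON body an OpenAI-compatible call sends."""
    headers = {
        "Authorization": f"Bearer {os.environ.get('HF_TOKEN', '')}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return headers, body

headers, body = build_request("Qwen/Qwen2.5-72B-Instruct", "Hello")
print(json.loads(body)["model"])
```

In practice you would hand `BASE_URL` and your token to your agent framework's OpenAI-compatible client setting rather than building requests by hand.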

Read more

Holotron-12B – High Throughput Computer Use Agent

We’re thrilled to release Holotron-12B, a multimodal computer-use model from H Company. Post-trained from the open NVIDIA Nemotron-Nano-2 VL model on H Company’s proprietary data mixture, Holotron-12B is the result of a close collaboration between our research labs to engineer a new type of model optimized for throughput and production-scale performance. H Company is part of the NVIDIA Inception Program. The model is now available on Hugging Face. Most multimodal models today optimize primarily for static vision or […]

Read more

State of Open Source on Hugging Face: Spring 2026

This post examines how the open source AI landscape has shifted across competition, geography, technical trends, and emerging communities over the past year. We focus primarily on community activity on Hugging Face, across a broad set of metrics, to give a holistic view of the ecosystem. This post builds on an earlier analysis conducted in mid-2025, available here, which examined what the Hugging Face community is building. We recommend reading additional perspectives on the open source ecosystem in and outside of Hugging Face […]

Read more

What’s New in Mellea 0.4.0 + Granite Libraries Release

We have released Mellea 0.4.0 alongside three Granite Libraries: granitelib-rag-r1.0, granitelib-core-r1.0, and granitelib-guardian-r1.0. Together, these releases make it easier to build structured, verifiable, and safety-aware AI workflows on top of IBM Granite models. Mellea is an open-source Python library for writing generative programs, replacing unpredictable prompt behavior with structured, maintainable AI workflows. Unlike general-purpose orchestration frameworks, Mellea makes LLM-based programs maintainable and predictable through constrained decoding, structured repair loops, and composable pipelines (New to Mellea? Start with our introductory […]
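To make "structured repair loop" concrete, here is the generic pattern in plain Python: generate, validate against a requirement, and re-prompt with the failure message until the output passes or a retry budget runs out. This is an illustrative sketch of the pattern, not Mellea's actual API; the stub generator stands in for an LLM call.

```python
# Generic validate-and-repair loop (the pattern, not Mellea's API):
# keep regenerating, feeding the validation failure back into the
# prompt, until the output satisfies the requirement.
from typing import Callable, Optional

def repair_loop(generate: Callable[[str], str],
                validate: Callable[[str], Optional[str]],
                prompt: str,
                max_tries: int = 3) -> str:
    """validate returns None on success, or an error message to feed back."""
    for _ in range(max_tries):
        output = generate(prompt)
        error = validate(output)
        if error is None:
            return output
        # Append the failure so the next generation can repair it.
        prompt = f"{prompt}\nPrevious output failed: {error}. Fix it."
    raise RuntimeError("no valid output within retry budget")

# Stub generator: only produces uppercase once it sees repair feedback.
gen = lambda p: "OK" if "failed" in p else "not ok"
check = lambda o: None if o.isupper() else "must be uppercase"
result = repair_loop(gen, check, "say ok")
print(result)
```

Constrained decoding attacks the same problem from the other side, by making invalid outputs unrepresentable rather than repairing them after the fact.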

Read more

Build a Domain-Specific Embedding Model in Under a Day

If you are building a RAG (Retrieval-Augmented Generation) system, you have likely hit this wall: everything works… until it doesn’t. General-purpose embedding models are trained to understand the internet, not your contracts, manufacturing logs, proprietary chemical formulations, or internal taxonomy. They capture broad semantic similarity, but they miss the fine-grained distinctions that matter in your domain. Fine-tuning an embedding model can improve the performance of your retrieval pipeline when off-the-shelf models fail to capture domain-specific nuances. Despite […]
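The workhorse objective for this kind of fine-tuning is contrastive training on (query, positive passage) pairs with in-batch negatives: each query is scored against every passage in the batch, and cross-entropy pushes it toward its own positive. A stdlib-only sketch with toy 2-d vectors standing in for real embeddings:

```python
# In-batch-negatives objective, the common recipe for fine-tuning
# embedding models on (query, positive) pairs. Toy 2-d vectors stand
# in for real model embeddings; everything here is stdlib Python.
import math

queries  = [[1.0, 0.0], [0.0, 1.0]]   # embedded queries
passages = [[0.9, 0.1], [0.1, 0.9]]   # passages[i] is the positive for queries[i]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def in_batch_loss(queries, passages):
    """Mean cross-entropy of each query over all batch passages."""
    total = 0.0
    for i, q in enumerate(queries):
        scores = [dot(q, p) for p in passages]           # similarity to every passage
        log_z = math.log(sum(math.exp(s) for s in scores))
        total += log_z - scores[i]                        # -log softmax of the positive
    return total / len(queries)

print(in_batch_loss(queries, passages))
```

The loss drops as each query's similarity to its own passage grows relative to the rest of the batch, which is exactly the fine-grained domain distinction general-purpose models miss.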

Read more

Beyond Semantic Similarity: Introducing NVIDIA NeMo Retriever’s Generalizable Agentic Retrieval Pipeline

We are thrilled to announce that the NVIDIA NeMo Retriever team has developed a new agentic retrieval pipeline that has officially secured the #1 spot on the ViDoRe v3 pipeline leaderboard. The exact same pipeline architecture also achieved the #2 spot on the highly demanding, reasoning-intensive BRIGHT leaderboard. In the rapidly evolving landscape of AI retrieval, many solutions are highly specialized, engineered to perform exceptionally well on specific, narrow tasks. However, real-world enterprise applications rarely have the luxury of perfectly […]

Read more