Holo3.1: Fast & Local Computer Use Agents
Users want to run the same computer-use capabilities across desktop and mobile environments, with seamless integration with different agent frameworks. They want deployment flexibility, from cloud inference to fully local execution on end-user devices.
This is why we are releasing the Holo3.1 family. Holo3.1 improves robustness across the three dimensions that matter most in production: environments (web, desktop, mobile), agent frameworks, and deployment targets. For the first time, we release quantized checkpoints optimized for local inference, including FP8, Q4 GGUF, and NVFP4.
Holo3.1 is a major step toward our vision of universal computer-use agents: systems that can operate across environments, integrate into any agent stack, and run wherever the workflow lives.
Based on the Qwen family, Holo3.1 was designed to improve robustness across the environments where computer-use agents are actually deployed, while retaining state-of-the-art performance.
As teams moved Holo3 from evaluation to production, we repeatedly observed the same challenge: strong performance