Meet HoloTab by HCompany. Your AI browser companion.

We built one of the most powerful computer-use AIs in the world. And made it directly accessible from your browser. On March 31st, we released Holo3, our most advanced computer-use model to date. Building something powerful is one thing; making it accessible and easy to use is another. We’re doing both. HoloTab is a Chrome extension that navigates the web just like a person would. It automates tasks across any website with zero setup or technical skills required. You describe […]

Read more

Inside VAKRA: Reasoning, Tool Use, and Failure Modes of Agents

We recently introduced VAKRA, a tool-grounded, executable benchmark for evaluating how well AI agents reason and act in enterprise-like environments. Unlike traditional benchmarks that test isolated skills, VAKRA measures compositional reasoning across APIs and documents, using full execution traces to assess whether agents can reliably complete multi-step workflows. VAKRA provides an executable environment where agents interact with more than 8,000 locally hosted APIs backed by real databases spanning […]

Read more

Training and Finetuning Multimodal Embedding & Reranker Models with Sentence Transformers

Sentence Transformers is a Python library for using and training embedding and reranker models for applications like retrieval augmented generation, semantic search, and more. In my previous blogpost, I introduced the new multimodal capabilities, showing how to use embedding and reranker models that handle text, images, audio, and video. In this blogpost, I’ll show you how to train or finetune these multimodal models on your own data. As a practical example, I’ll walk through finetuning Qwen/Qwen3-VL-Embedding-2B for Visual Document Retrieval […]

Read more

The PR you would have opened yourself

Making transformers models available in mlx-lm using a Skill and test harness. TL;DR: We provide a Skill and a test harness to help port language models from transformers to mlx-lm, so they become (almost) instantly available the moment they are added to transformers. The Skill is designed to support contributors and reviewers as an aide, not an automation. We explain why we did it and how, and comment on how to contribute meaningfully to open source in the age of agents. […]

Read more

Ecom-RLVE: Adaptive Verifiable Environments for E-Commerce Conversational Agents

TL;DR — We extend the RLVE framework from single-turn reasoning puzzles to multi-turn, tool-augmented e-commerce conversations. EcomRLVE-GYM provides 8 verifiable environments — product discovery, substitution, cart building, returns, order tracking, policy QA, bundle planning, and multi-intent journeys — each with procedural problem generation, a 12-axis difficulty curriculum, and algorithmically verifiable rewards. We train a Qwen 3 8B model with DAPO over 300 steps and present early results demonstrating that environment scaling and adaptive difficulty transfer to agentic, real-world task completion. […]

Read more

Building a Fast Multilingual OCR Model with Synthetic Data

Training a high-quality OCR model requires a large quantity of annotated image-text pairs: images with precise bounding boxes, transcriptions, and ideally reading order information at the word, line, and paragraph level. Every approach to curating this data comes with tradeoffs. Existing benchmark datasets like ICDAR and Total-Text have clean labels but limited scale, typically tens of thousands of images skewed toward English and Chinese. Manual annotation produces the highest quality labels but is expensive and slow, making it impractical at […]

Read more

Training mRNA Language Models Across 25 Species for $165

By OpenMed, Open-Source Agentic AI for Healthcare & Life Sciences. TL;DR: We built an end-to-end protein AI pipeline covering structure prediction, sequence design, and codon optimization. After comparing multiple transformer architectures for codon-level language modeling, CodonRoBERTa-large-v2 emerged as the clear winner with a perplexity of 4.10 and a Spearman CAI correlation of 0.40, significantly outperforming ModernBERT. We then scaled to 25 species, trained 4 production models in 55 GPU-hours, and built a species-conditioned system that no other open-source project offers. […]

Read more

gradio.Server: Any Custom Frontend with Gradio’s Backend

A few weeks ago, we wrote about one-shotting full web apps with gr.HTML: building rich, interactive frontends entirely inside Gradio using custom HTML, CSS, and JavaScript. That unlocked a lot. But what if that’s not enough? What if you want to build entirely with your own frontend framework, such as React, Svelte, or even plain HTML/JS, while still benefiting from Gradio’s queuing system, API infrastructure, MCP support, and ZeroGPU on Spaces? That’s exactly the problem gradio.Server solves. And it changes what’s […]

Read more

Safetensors is Joining the PyTorch Foundation

Today, we’re announcing that Safetensors has joined the PyTorch Foundation as a foundation-hosted project under the Linux Foundation, alongside DeepSpeed, Helion, Ray, vLLM, and PyTorch itself. How we got here: Safetensors started as a Hugging Face project born out of a concrete need: a way to store and share model weights that couldn’t execute arbitrary code. The pickle-based formats that dominated the ecosystem at the time meant that there was a very real risk you’d be running malicious code. While […]

Read more

ALTK‑Evolve: On‑the‑Job Learning for AI Agents

Most AI agents re‑read transcripts instead of learning principles, so they repeat mistakes and don’t transfer lessons to new situations. ALTK‑Evolve turns raw agent trajectories into reusable guidelines. In benchmarks, the approach boosted reliability, especially on hard, multi‑step tasks (Δ 14.2% on AppWorld), without bloating context. The “eternal intern” problem: Imagine a brilliant line cook who has memorized every cookbook but forgets your kitchen every morning. They don’t remember your oven runs hot, or that regulars like extra salt; they’ll […]

Read more