AI and the Future of Cybersecurity: Why Openness Matters

Following the announcement of Mythos and Project Glasswing, institutions around the world are grappling with what may be the dawn of a new era of cybersecurity. In this post, we break down the current situation, discuss the role of openness, and situate the future of cybersecurity within the larger AI ecosystem. What is Mythos? Mythos is a “frontier AI model”, a large language model (LLM) that can be used to process software code (among many other things). This follows a general trend […]

Read more

QIMMA قِمّة ⛰: A Quality-First Arabic LLM Leaderboard

QIMMA validates benchmarks before evaluating models, ensuring reported scores reflect genuine Arabic language capability in LLMs. If you’ve been tracking Arabic LLM evaluation, you’ve probably noticed a growing tension: the number of benchmarks and leaderboards is expanding rapidly, but are we actually measuring what we think we’re measuring? We built QIMMA قمّة (Arabic for “summit”) to answer that question systematically. Instead of aggregating existing Arabic benchmarks as-is and running models on them, we applied a rigorous quality validation pipeline before […]

Read more

How to Use Transformers.js in a Chrome Extension

We recently released a Transformers.js demo browser extension powered by Gemma 4 E2B to help users navigate the web. While building it, we learned several practical lessons about Manifest V3 runtimes, model loading, and messaging that are worth sharing. Who this is for: this guide is for developers who want to run local AI features in a Chrome extension with Transformers.js under Manifest V3 constraints. By the end, you will have the same architecture used in this project: a […]

Read more

DeepSeek-V4: a million-token context that agents can actually use

DeepSeek released V4 today. Two MoE checkpoints are on the Hub: DeepSeek-V4-Pro at 1.6T total parameters with 49B active, and DeepSeek-V4-Flash at 284B total with 13B active. Both have a 1M-token context window. The benchmark numbers are competitive but not SOTA. It doesn’t matter: the real innovation is that DeepSeek-V4 is designed for efficient long-context support, which makes it one of the best candidates for agentic tasks, especially long-running agentic workloads. Running a frontier open […]

Read more

Gemma 4 VLA Demo on Jetson Orin Nano Super

Talk to Gemma 4, and she’ll decide on her own if she needs to look through the webcam to answer you. All running locally on a Jetson Orin Nano Super. You speak → Parakeet STT → Gemma 4 → [Webcam if needed] → Kokoro TTS → Speaker. Press SPACE to record, SPACE again to stop. This is a simple VLA: the model decides on its own whether to act based on the context of what you asked, no keyword triggers, […]
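To make the “model decides on its own” part concrete, here is a minimal Python sketch of that decision loop under stated assumptions: every function below is a hypothetical placeholder for Parakeet STT, Gemma 4, the webcam, and Kokoro TTS, and the `<look>` marker is an invented convention for illustration, not the demo’s actual protocol.

```python
# A sketch of the decision loop described above, not the demo's actual code.
# transcribe, generate, capture_frame, and speak are hypothetical placeholders
# standing in for Parakeet STT, Gemma 4, the webcam, and Kokoro TTS.

LOOK_MARKER = "<look>"  # assumed convention: the model emits this to request vision


def transcribe(audio: bytes) -> str:
    # Placeholder for Parakeet STT.
    return "what am I holding?"


def generate(prompt: str, image: bytes | None = None) -> str:
    # Placeholder for Gemma 4: the first pass is text-only; the model itself
    # decides whether it needs to see before answering.
    return LOOK_MARKER if image is None else "Looks like a coffee mug."


def capture_frame() -> bytes:
    # Placeholder for grabbing a webcam frame.
    return b"<jpeg bytes>"


def speak(text: str) -> None:
    # Placeholder for Kokoro TTS.
    print(text)


def answer(audio: bytes) -> None:
    text = transcribe(audio)
    reply = generate(text)
    if LOOK_MARKER in reply:  # the model, not a keyword match, chose to look
        reply = generate(text, image=capture_frame())
    speak(reply)


answer(b"<wav bytes>")
```

The key design point is that the vision branch is triggered by the model’s own output, never by pattern-matching on the user’s words.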

Read more

Training and Finetuning Embedding Models with Sentence Transformers

Sentence Transformers is a Python library for using and training embedding models for a wide range of applications, such as retrieval augmented generation, semantic search, semantic textual similarity, paraphrase mining, and more. In this blog post, I’ll show you how to use it to finetune Sentence Transformer models to improve their performance on specific tasks. You can also use this method to train new Sentence Transformer models from scratch. Finetuning Sentence Transformers involves several components, including datasets, loss functions, training arguments, […]
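As a taste of what that looks like in practice, here is a minimal sketch of a finetuning run with the Sentence Transformers v3 Trainer; the base checkpoint, dataset slice, and output path are illustrative choices, not the ones from the post.

```python
from datasets import load_dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    losses,
)

# Start from any encoder checkpoint; mpnet-base is just an example.
model = SentenceTransformer("microsoft/mpnet-base")

# (anchor, positive) pairs; a small slice keeps this sketch quick to run.
train_dataset = load_dataset(
    "sentence-transformers/all-nli", "pair", split="train[:10000]"
)

# MultipleNegativesRankingLoss pulls anchors toward their positives and
# away from the other positives in the same batch.
loss = losses.MultipleNegativesRankingLoss(model)

trainer = SentenceTransformerTrainer(
    model=model,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()
model.save("models/mpnet-base-all-nli")
```

Pairing this loss with an (anchor, positive) dataset is a common default because every other example in the batch doubles as a free negative.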

Read more

Training and Finetuning Reranker Models with Sentence Transformers

Sentence Transformers is a Python library for using and training embedding and reranker models for a wide range of applications, such as retrieval augmented generation, semantic search, semantic textual similarity, paraphrase mining, and more. In this blog post, I’ll show you how to use it to finetune a reranker model (also known as a cross-encoder) that beats all existing options on exactly your data. You can also use this method to train extremely strong new reranker models from scratch. Finetuning reranker models involves several […]
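To make that concrete, here is a hedged sketch using the CrossEncoderTrainer API added in Sentence Transformers v4; the base checkpoint and the tiny toy dataset are placeholders for illustration, not the post’s recipe.

```python
from datasets import Dataset
from sentence_transformers.cross_encoder import CrossEncoder, CrossEncoderTrainer
from sentence_transformers.cross_encoder.losses import BinaryCrossEntropyLoss

# A cross-encoder scores a (query, passage) pair jointly in one forward pass.
model = CrossEncoder("microsoft/MiniLM-L12-H384-uncased", num_labels=1)

# Toy (query, passage, label) data; real training needs far more pairs.
train_dataset = Dataset.from_dict({
    "query": ["what is a reranker?", "what is a reranker?"],
    "passage": [
        "A reranker re-scores candidate documents for a query.",
        "The Eiffel Tower is in Paris.",
    ],
    "label": [1.0, 0.0],
})

# Binary cross-entropy over relevant (1) vs. irrelevant (0) pairs.
loss = BinaryCrossEntropyLoss(model)

trainer = CrossEncoderTrainer(model=model, train_dataset=train_dataset, loss=loss)
trainer.train()

# At inference time, predict() returns a relevance score per pair.
scores = model.predict([("what is a reranker?", "A reranker re-scores candidates.")])
```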

Read more

Training and Finetuning Sparse Embedding Models with Sentence Transformers

Sentence Transformers is a Python library for using and training dense embedding, reranker (cross-encoder), and sparse embedding models for a wide range of applications, such as retrieval augmented generation, semantic search, semantic textual similarity, paraphrase mining, and more. In this blog post, I’ll show you how to use it to finetune a sparse encoder/embedding model and explain why you might want to do so. This results in sparse-encoder/example-inference-free-splade-distilbert-base-uncased-nq, a cheap model that works especially well in hybrid search or retrieve […]
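Here is a hedged sketch with the SparseEncoder APIs added in Sentence Transformers v5; the base checkpoint, toy data, and regularizer weights are illustrative assumptions, not the recipe behind the linked inference-free model.

```python
from datasets import Dataset
from sentence_transformers.sparse_encoder import SparseEncoder, SparseEncoderTrainer
from sentence_transformers.sparse_encoder.losses import (
    SpladeLoss,
    SparseMultipleNegativesRankingLoss,
)

# Initializing from a plain MLM checkpoint builds a SPLADE-style model
# (MLM head plus max pooling over the vocabulary).
model = SparseEncoder("distilbert/distilbert-base-uncased")

# Toy (query, answer) pairs; in-batch negatives come from the ranking loss.
train_dataset = Dataset.from_dict({
    "query": ["what is a sparse embedding?", "where is the eiffel tower?"],
    "answer": [
        "A sparse embedding is a high-dimensional vector with few non-zero entries.",
        "The Eiffel Tower is in Paris, France.",
    ],
})

# SpladeLoss wraps a ranking loss and adds sparsity regularization so the
# learned vectors stay cheap to index; the weights here are guesses.
loss = SpladeLoss(
    model=model,
    loss=SparseMultipleNegativesRankingLoss(model),
    query_regularizer_weight=5e-5,
    document_regularizer_weight=3e-5,
)

trainer = SparseEncoderTrainer(model=model, train_dataset=train_dataset, loss=loss)
trainer.train()
```

The regularizer weights trade retrieval quality against vector sparsity, which is what makes the resulting model inexpensive in hybrid search.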

Read more

Meet HoloTab by HCompany. Your AI browser companion.

We built one of the most powerful computer-use AIs in the world. And made it directly accessible from your browser. On March 31st, we released Holo3, our most advanced computer-use model to date. Building something powerful is one thing; making it accessible and easy to use is another. We’re doing both. HoloTab is a Chrome extension that navigates the web just like a person would. It automates tasks across any website with zero setup or technical skills required. You describe […]

Read more

Inside VAKRA: Reasoning, Tool Use, and Failure Modes of Agents

VAKRA Dataset | Leaderboard | Release Blog | GitHub | Submit to Leaderboard

We recently introduced VAKRA, a tool-grounded, executable benchmark for evaluating how well AI agents reason and act in enterprise-like environments. Unlike traditional benchmarks that test isolated skills, VAKRA measures compositional reasoning across APIs and documents, using full execution traces to assess whether agents can reliably complete multi-step workflows. VAKRA provides an executable environment where agents interact with more than 8,000 locally hosted APIs backed by real databases spanning […]

Read more