Transformers backend integration in SGLang

The Hugging Face transformers library is the standard for working with state-of-the-art models — from experimenting with cutting-edge research to fine-tuning on custom data. Its simplicity, flexibility, and expansive model zoo make it a powerful tool for rapid development. But once you’re ready to move from notebooks to production, inference performance becomes mission-critical. That’s where SGLang comes in. Designed for high-throughput, low-latency inference, SGLang now offers seamless integration with transformers as a backend. This means you can pair the flexibility of […]

Read more
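In practice, this kind of backend integration is a launch-time switch rather than a code change. A minimal, hypothetical launch sketch — the `--model-impl transformers` flag name, port, and model ID are assumptions for illustration, not taken from the post (check `python -m sglang.launch_server --help` for the exact options):

```shell
# Serve a model through SGLang while delegating the modeling code to
# the transformers implementation (flag name and model ID are illustrative).
python -m sglang.launch_server \
  --model-path meta-llama/Llama-3.2-1B-Instruct \
  --model-impl transformers \
  --port 30000
```

Once the server is up, it exposes SGLang’s usual serving endpoints, so existing clients would not need to change.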

Gemma 3n fully available in the open-source ecosystem!

Gemma 3n was announced as a preview during Google I/O. The on-device community got really excited, because this is a model designed from the ground up to run locally on your hardware. On top of that, it’s natively multimodal, supporting image, text, audio, and video inputs 🤯 Today, Gemma 3n is finally available across the most widely used open-source libraries. These include transformers & timm, MLX, llama.cpp (text inputs), transformers.js, ollama, Google AI Edge, and others. This post quickly goes […]

Read more

Welcome the NVIDIA Llama Nemotron Nano VLM to Hugging Face Hub

NVIDIA Llama Nemotron Nano VL is a state-of-the-art 8B Vision Language Model (VLM) designed for intelligent document processing, offering high accuracy and multimodal understanding. Available on Hugging Face, it excels at extracting and understanding information from complex documents like invoices, receipts, contracts, and more. With its powerful OCR capabilities and strong results on the OCRBench v2 benchmark, this model delivers industry-leading accuracy for text and table extraction, as well as chart, diagram, and table parsing. Whether you’re automating financial document […]

Read more

Announcing NeurIPS 2025 E2LM Competition: Early Training Evaluation of Language Models

Join us in building benchmarks that capture early-stage reasoning & scientific knowledge in LLMs! The development of Large Language Models (LLMs) typically begins with a series of ablation experiments, wherein various model architectures, data mixtures, and training hyperparameters are systematically evaluated. This phase is commonly referred to as the early stages of training. During this period, researchers primarily monitor two key metrics: the training loss curve and evaluation scores. However, existing evaluation benchmarks often fail to provide meaningful or discriminative […]

Read more

Efficient MultiModal Data Pipeline

You’ve got everything ready – data, model, a beefy GPU setup. You hit “run” and… wait. And wait some more. Your GPUs are barely breaking a sweat while your wallet’s getting lighter by the hour. Sound familiar? We’ve been there. After some detective work on our nanoVLM project, we discovered the real culprit wasn’t our model or hardware – it was our data pipeline, which was incredibly wasteful. Here’s what we found: Idle GPUs: Our model was literally waiting around for data […]

Read more
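The usual cure for idle GPUs is overlapping data loading with compute, so the accelerator pulls ready batches instead of blocking on I/O. A minimal stdlib sketch of that producer–consumer pattern (the sleep, the `prefetch` depth, and the toy “forward pass” are illustrative stand-ins, not nanoVLM’s actual pipeline):

```python
import queue
import threading
import time

def loader(batches, out_q):
    """Producer: simulates slow loading/decoding on a background thread."""
    for batch in batches:
        time.sleep(0.01)   # stand-in for disk I/O + image decoding
        out_q.put(batch)
    out_q.put(None)        # sentinel: no more data

def train(batches, prefetch=4):
    """Consumer: pulls prefetched batches instead of waiting on I/O."""
    q = queue.Queue(maxsize=prefetch)
    threading.Thread(target=loader, args=(batches, q), daemon=True).start()
    processed = []
    while (batch := q.get()) is not None:
        processed.append(batch * 2)   # stand-in for the forward/backward pass
    return processed

print(train([1, 2, 3]))  # → [2, 4, 6]
```

The bounded queue is the key design choice: it keeps a few batches ready ahead of time without letting the loader run arbitrarily far ahead and exhaust memory.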

SmolLM3: smol, multilingual, long-context reasoner

Small language models are becoming increasingly important as users seek capable models that can be deployed efficiently. The community has produced a fascinating range of capable small models, each pushing the boundaries of what’s possible at this scale. With SmolLM3, we’re excited to contribute a new, competitive, fully open 3B model: SmolLM3 sits in the efficiency sweet spot. Our 3B model outperforms Llama-3.2-3B and Qwen2.5-3B while staying competitive with larger 4B alternatives (Qwen3 & Gemma3). Beyond the performance numbers, we’re […]

Read more

Upskill your LLMs with Gradio MCP Servers

Have you ever wanted your favorite Large Language Model (LLM) to do more than just answer questions? What if it could edit images for you, browse the web, or organize your email inbox? Well, now it can! In this blog post, I’ll show you what the MCP protocol is and how it works similarly to […]

Read more