Welcome the NVIDIA Llama Nemotron Nano VLM to Hugging Face Hub
NVIDIA Llama Nemotron Nano VL is a state-of-the-art 8B Vision Language Model (VLM) designed for intelligent document processing, offering high accuracy and multimodal understanding. Available on Hugging Face, it excels at extracting and understanding information from complex documents such as invoices, receipts, and contracts. With its strong OCR capabilities and its performance on the OCRBench v2 benchmark, the model delivers industry-leading accuracy for text and table extraction, as well as chart and diagram parsing. Whether you’re automating financial document processing or improving business intelligence workflows, Llama Nemotron Nano VL is optimized for fast, scalable deployments.
Check