PaddleOCR 3.5: Running OCR and Document Parsing Tasks with a Transformers Backend

PaddleOCR 3.5 brings OCR and document parsing tasks closer to the Hugging Face ecosystem. With this release, supported PaddleOCR models can run with Hugging Face Transformers as an inference backend by setting:

engine="transformers"

PaddleOCR continues to provide OCR model series such as PP-OCRv5 and document parsing model series such as PaddleOCR-VL 1.5, while Transformers becomes one of the supported backends for running them.

Try the live demo on Hugging Face Spaces:
https://huggingface.co/spaces/PaddlePaddle/paddleocr-3.5-transformers-demo



What changed?

PaddleOCR 3.5 introduces a more flexible inference-engine interface. Developers can select the backend through the engine parameter and pass backend-specific options through engine_config.

In practice, this means:

  • The pipelines behind these tasks are managed by PaddleOCR, so developers do not need to manually call each internal component.
  • Transformers is now one of the inference backends available for supported PaddleOCR models.
  • Developers can configure backend-related options such as dtype, device placement, and attention implementation through engine_config.
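
The points above can be sketched in Python. This is a minimal illustration, not the official usage: the `engine` and `engine_config` parameter names come from the text, but the keys inside `engine_config` (`dtype`, `device`, `attn_implementation`) and the helper names `build_ocr_kwargs` and `create_ocr` are assumptions made for this example. Check the PaddleOCR 3.5 documentation for the exact option names before relying on them.

```python
from typing import Any, Dict, Optional


def build_ocr_kwargs(
    engine: str = "transformers",
    dtype: Optional[str] = None,
    device: Optional[str] = None,
    attn_implementation: Optional[str] = None,
) -> Dict[str, Any]:
    """Collect constructor arguments for a PaddleOCR pipeline.

    `engine` and `engine_config` are the parameter names given in the text.
    The keys placed inside `engine_config` are assumed names, used here
    only to illustrate dtype, device placement, and attention options.
    """
    candidate_options = {
        "dtype": dtype,
        "device": device,
        "attn_implementation": attn_implementation,
    }
    # Keep only the options the caller actually set.
    engine_config = {k: v for k, v in candidate_options.items() if v is not None}

    kwargs: Dict[str, Any] = {"engine": engine}
    if engine_config:
        kwargs["engine_config"] = engine_config
    return kwargs


def create_ocr(**options: Any):
    """Build a PaddleOCR pipeline with the selected backend.

    The import is deferred so that build_ocr_kwargs can be used (and tested)
    on machines where PaddleOCR is not installed.
    """
    from paddleocr import PaddleOCR  # assumes a PaddleOCR 3.5 installation

    return PaddleOCR(**build_ocr_kwargs(**options))


if __name__ == "__main__":
    # Inspect the arguments without loading any model.
    print(build_ocr_kwargs(dtype="bfloat16", device="cuda:0"))
```

Separating the argument construction from the pipeline creation keeps the backend choice in one place: switching engines or tuning a backend option changes only the arguments, while the PaddleOCR-managed pipeline code stays the same.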

A simple way to understand the stack:

Layer | What it means | Examples
Application layer | Applications that use OCR and document

 

 

 
