Introducing multi-backends (TRT-LLM, vLLM) support for Text Generation Inference

By Morgan Funtowicz and Hugo Larcher

Since its initial release in 2022, Text Generation Inference (TGI) has provided Hugging Face and the AI community with a performance-focused solution for easily deploying large language models (LLMs). TGI initially offered an almost no-code solution to load models from the Hugging Face Hub.
