Goodbye cold boot – how we made LoRA Inference 300% faster


tl;dr: We swap Stable Diffusion LoRA adapters per user request while keeping the base model warm, enabling fast LoRA inference across multiple users. You can experience this by browsing our LoRA catalogue and playing with the inference widget.

Inference Widget Example

In this post, we will go into detail about how we achieved that.
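The core idea from the tl;dr can be sketched in a few lines: load the heavy base model once and keep it warm, then merge and unmerge a small LoRA delta per request. This is a minimal toy sketch of the pattern only, not the actual implementation; the names (`BaseModel`, `apply_lora`, `handle_request`) and the dict-of-floats "weights" are hypothetical stand-ins for illustration.

```python
# Toy sketch of per-request LoRA hot-swapping on a warm base model.
# All names and the scalar "weights" are hypothetical, for illustration only.

class BaseModel:
    """Stands in for the large base model, loaded once and kept warm."""

    def __init__(self):
        self.weights = {"unet.layer0": 1.0}  # pretend base weights
        self.active_lora = None

    def apply_lora(self, lora_id, delta):
        # Merge the small LoRA delta into the warm weights.
        self.remove_lora()
        for name, d in delta.items():
            self.weights[name] += d
        self.active_lora = (lora_id, delta)

    def remove_lora(self):
        # Undo the previous adapter so the base model is clean again.
        if self.active_lora is not None:
            _, delta = self.active_lora
            for name, d in delta.items():
                self.weights[name] -= d
            self.active_lora = None


def handle_request(model, lora_id, lora_store):
    # Only the tiny adapter is swapped; the expensive base load never repeats.
    model.apply_lora(lora_id, lora_store[lora_id])
    return dict(model.weights)


base = BaseModel()
store = {"pixel-art": {"unet.layer0": 0.25}, "watercolor": {"unet.layer0": -0.1}}
print(handle_request(base, "pixel-art", store))
print(handle_request(base, "watercolor", store))
```

Because the adapter deltas are tiny compared to the base weights, swapping them is cheap, which is what removes the per-user cold boot.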
