Remote VAEs for decoding with Inference Endpoints 🤗
(This post was authored by hlky and Sayak)
When working with latent-space diffusion models for high-resolution image and video synthesis, the VAE decoder can consume a substantial amount of memory. This makes it hard for users to run