nanoVLM: The simplest repository to train your VLM in pure PyTorch
nanoVLM is the simplest way to get started with training your very own Vision Language Model (VLM) using pure PyTorch. It is a lightweight toolkit that lets you launch VLM training on a free-tier Colab notebook. We were inspired by Andrej Karpathy’s nanoGPT and provide a similar project for the vision domain. At its heart, nanoVLM is a toolkit that helps you build and train a model that can understand both images and text, and then generate text […]