Deep Learning over the Internet: Training Language Models Collaboratively
With the additional help of Quentin Lhoest and Sylvain Lesage.
Modern language models often require a significant amount of compute for pretraining, making it impossible to obtain them without access to tens or even hundreds of GPUs or