The Technology Behind BLOOM Training
In recent years, training ever larger language models has become the norm. While the fact that those models are not released for further study is frequently discussed, the hidden knowledge about how to train such models rarely gets any attention. This article aims to change that by shedding some light on the technology and engineering behind training such models both in terms