Open R1: Update #2
We are now two weeks into the Open R1 project, which aims to reconstruct the missing pieces of DeepSeek R1—specifically, the training pipeline and the synthetic data.
In this post, we are happy to share OpenR1-Math-220k: our first large-scale dataset for mathematical reasoning!
We also take a look at some exciting developments from the community on curating small, high-quality datasets for fine-tuning, along with insights into how to control the chain-of-thought length of reasoning models at both train time and inference time.
Let’s dive in!