Post content
Data curation for general reasoning capabilities is still relatively underexplored.
Researchers systematically compare metrics for selecting high-quality, diverse reasoning traces, measuring data efficiency in the distillation setting.
They find that diversity in reasoning strategies matters more than topic diversity, and that challenging questions are more sample-efficient for distilling reasoning capabilities.
They find that the Less-Is-More approach is not sufficient for solving general reasoning tasks, whereas scaling up data quantity consistently brings gains.
They find that NaturalThoughts outperforms state-of-the-art reasoning datasets such as OpenThoughts3, LIMO, and S1k on general STEM domains.
They also find that distillation based on reasoning difficulty can improve the Pareto frontier of the student model's inference efficiency.
Training with a mix of full reasoning traces and condensed answers enables efficient hybrid reasoning in the student model, which adaptively switches between long chain-of-thought thinking and answering directly.
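
To make the last two findings more concrete, here is a minimal sketch of how a mixed training set might be assembled. It is an illustration only, not the researchers' pipeline: the `Example` fields, the `difficulty` score in [0, 1], the `threshold` value, and the `build_hybrid_targets` helper are all hypothetical names introduced here. The assumption is that harder questions keep the teacher's full reasoning trace as the training target, while easier ones use only the condensed answer.

```python
from dataclasses import dataclass


@dataclass
class Example:
    question: str
    full_trace: str    # teacher's long chain-of-thought (assumed field)
    short_answer: str  # condensed answer without reasoning (assumed field)
    difficulty: float  # hypothetical difficulty score in [0, 1]


def build_hybrid_targets(examples, threshold=0.5):
    """Choose a training target per example based on difficulty.

    Hard questions (difficulty >= threshold) keep the full trace;
    easy questions use the condensed answer. The threshold rule is
    an assumption made for this sketch.
    """
    targets = []
    for ex in examples:
        if ex.difficulty >= threshold:
            mode, target = "think", ex.full_trace
        else:
            mode, target = "direct", ex.short_answer
        targets.append({"prompt": ex.question, "mode": mode, "target": target})
    return targets


examples = [
    Example(
        question="What is 2 + 2?",
        full_trace="Adding 2 and 2 gives 4. Answer: 4",
        short_answer="4",
        difficulty=0.1,
    ),
    Example(
        question="Prove that sqrt(2) is irrational.",
        full_trace=(
            "Assume sqrt(2) = p/q in lowest terms; then p and q "
            "are both even, a contradiction."
        ),
        short_answer="Proof by contradiction.",
        difficulty=0.9,
    ),
]

hybrid = build_hybrid_targets(examples)
for row in hybrid:
    print(row["mode"], "|", row["prompt"])
```

A student fine-tuned on targets like these would see both response styles, which is one plausible way to encourage switching between long reasoning and direct answers. How difficulty is actually measured and how the two formats are balanced are not specified in the post.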