Post #936

@LinghaoCh

Parallel Experiments

Views: 795

Posted: Apr 17, 2025, 05:22 AM

Post content

Truly a thought-provoking piece, from the author of τ-bench. https://ysymyth.github.io/The-Second-Half/#ai

So what’s suddenly different now? In three words: RL finally works. More precisely: RL finally generalizes. After several major detours and a culmination of milestones, we’ve landed on a working recipe to solve a wide range of RL tasks using language and reasoning.

The second half of AI, starting now, will shift focus from solving problems to defining problems. In this new era, evaluation becomes more important than training. Instead of just asking, “Can we train a model to solve X?”, we’re asking, “What should we be training AI to do, and how do we measure real progress?” To thrive in this second half, we’ll need a timely shift in mindset and skill set, ones perhaps closer to those of a product manager.

It turned out the most important part of RL might not even be the RL algorithm or environment, but the priors, which can be obtained in a way totally unrelated to RL (LLMs).