31 Jul 2025
1h 18m

The RLVR Revolution — with Nathan Lambert (AI2, Interconnects.ai)

Latent Space: The AI Engineer Podcast

In this episode of the Latent Space Podcast, Alessio and Swyx host Nathan Lambert of AI2 to discuss recent advances and open challenges in AI, focusing on reasoning models, reinforcement learning, and tool use. They cover the Tulu project, RLVR (Reinforcement Learning with Verifiable Rewards), and the importance of data and training methodology. The conversation compares hybrid reasoning models with reasoning-only models and examines the role of search in AI and the concept of overoptimization. They also touch on the potential of open models, the difficulty of building effective evaluations, and future directions for AI research, including character training and model routing.

Outlines

Part 1: Introduction and RLVR Genesis

Part 2: Model Evolution and Search

Part 3: Reasoning Models and Overoptimization

Part 4: Model Training and Future Outlook
