YouTube · 25 Sep 2025
1h 46m

Why AI evals are the hottest new skill for product builders | Hamel Husain & Shreya Shankar

Lenny's Podcast

In this episode of Lenny's Podcast, Lenny Rachitsky interviews Hamel Husain and Shreya Shankar about evals, a systematic way to measure and improve AI applications. They discuss the importance of data analysis in identifying errors, categorizing them using AI, and creating LLM-as-judge prompts to automate the evaluation process. The conversation covers misconceptions about evals, the role of human judgment, and practical tips for implementing evals effectively, emphasizing that evals should be used to drive actionable improvements to AI products. They also touch on the debate around evals versus A/B testing, the significance of error analysis, and the need for a structured approach to application-specific evals.

Outlines

Part 1: Introduction to Evals

Part 2: Error Analysis and Data Synthesis

Part 3: Evals Debate and Misconceptions

Part 4: Evals Course and Final Thoughts
