10 Mar 2026
1h 23m

NVIDIA's AI Engineers: Agent Inference at Planetary Scale and "Speed of Light" — Nader Khalil (Brev), Kyle Kranen (Dynamo)

Latent Space: The AI Engineer Podcast

This episode of the Latent Space podcast features Nader Khalil and Kyle Kranen from NVIDIA, discussing developer experience, GPU technology, and the company's internal culture. They recount Brev's acquisition by NVIDIA, emphasizing the shared goal of simplifying developer access to GPUs, and touch on NVIDIA's developer experience initiatives. The conversation explores Dynamo, a data center scale inference engine, and how it optimizes inference at scale through techniques like disaggregating the prefill and decode phases. They also discuss "SOL" (Speed of Light) as a concept for creating urgency and understanding theoretical limits, and explore the balance between stability and innovation. The episode further examines the evolution of AI, the importance of hardware-model co-design, and the potential of agents in coding and business applications.

Outline

Part 1: Security, Acquisition, and Culture

Part 2: Developer Strategy and GPU Access

Part 3: Innovation and Market Creation

Part 4: Scaling Inference and Dynamo

Part 5: Hardware-Model Co-Design

Part 6: Agents in Production

Part 7: Coding Agents and CLIs

Part 8: Future Trends and Community
