YouTube, 03 Feb 2025
5h 6m

DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters | Lex Fridman Podcast #459

Lex Fridman

In this interview podcast, Lex Fridman discusses China's DeepSeek AI models with Dylan Patel and Nathan Lambert. The conversation begins with an explanation of DeepSeek V3 and R1, focusing on their open-weight nature and the differences between instruction models and reasoning models. It then turns to the technical aspects of pre-training and post-training, including mixture-of-experts models and the cost-effectiveness of DeepSeek's approach. The guests give specific details on DeepSeek's low training and inference costs, and analyze the hardware used and the geopolitical implications of open-weight models. The podcast concludes with a discussion of the future of AI, including the potential for a technological Cold War and the role of human oversight in the development of increasingly autonomous AI systems. A key takeaway is the significant reduction in training and inference costs that DeepSeek achieved through architectural innovations such as mixture-of-experts models and multi-head latent attention.

Outlines

Part 1: Introduction, DeepSeek Models

Part 2: Training, Hardware, Geopolitics

Part 3: Inference, TSMC, and GPU Tech

Part 4: Open Source, Ethics, and Future

Part 5: Infrastructure, Competition, and AI Agents

Part 6: Open Source Initiatives, Future Tech, Conclusion
