Deep-dive into DeepSeek (Practical AI #302)

This Practical AI podcast episode discusses DeepSeek R1, a large language model (LLM) from a Chinese startup, trained at a significantly lower cost than comparable models. The hosts debate the implications of its performance and low cost, examining the narratives around its unexpected emergence and the open-source release of the model on Hugging Face. They address security and privacy concerns, distinguishing the model itself (which can be run securely offline) from the product (which collects user data). The discussion also covers DeepSeek's technical architecture, including mixture-of-experts layers and distilled versions of the model, and notes that the smaller versions make it accessible to a wider range of users. The episode closes with a prediction that the AI landscape will shift toward model optionality and that enterprises will focus more on data curation.

Outlines

Changelog Master Feed

00:43 Introduction and Podcast Update

03:45 DeepSeek R1: Initial Reactions and Cost-Effectiveness

17:55 DeepSeek R1: Security and Privacy Concerns

33:04 DeepSeek R1: Technical Details, Model Versions, and Future Implications