26 Jul 2023
54m

FlashAttention 2: making Transformers 800% faster w/o approximation - with Tri Dao of Together AI

Latent Space: The AI Engineer Podcast

This episode covers topics at the intersection of machine learning and systems: attention mechanisms, memory hierarchies, FlashAttention, the relationship between academia and industry, evaluation in AI, the hardware lottery, and open-source AI. The discussion stresses understanding both algorithms and systems, designing for memory efficiency and hardware compatibility, fostering collaboration between academia and industry, and releasing open-source datasets and models to drive progress in the field.
