Transformers: One Model to Rule them All (The Technium Podcast S02 E08)
Transformers are a building block of machine learning systems that have recently had great success subsuming other techniques.
We discuss, at a high level, their attention mechanism and multimodal properties, and the kinds of applications they can be put to now and in the future.
Links/Resources:
- Introductions to transformers
- Attention Is All You Need https://arxiv.org/pdf/1706.03762.pdf
- Attention: https://distill.pub/2016/augmented-rnns/
- Transformers replacing CNNs https://becominghuman.ai/transformers-in-vision-e2e87b739feb
- AI models consolidating https://twitter.com/karpathy/status/1468370605229547522?ref_src=twsrc^tfw
- GPT implementation https://github.com/karpathy/minGPT/blob/master/mingpt/model.py
- Google introduces new arch to reduce cost of transformers https://analyticsindiamag.com/google-introduces-new-architecture-to-reduce-cost-of-transformers/
- LaMDA by Google https://gpt3demo.com/apps/lamda-google
Chapters:
0:00 Intros
1:57 What are Transformers?
4:59 How does it work at a high level?
9:27 Self Attention Mechanism
14:03 Input structure agnostic
16:59 Stack it high, pump it with data
25:28 More MultiModal Learning
27:22 The Narrow Waist
34:34 Transformers for Compilation
40:24 Specialized Hardware
43:52 Multimodal Applications
47:19 Generating Media as a Self-sustaining Entity
52:42 The Jobs this Destroys
58:02 Two machines need to talk to each other
1:04:42 A Young Lady's Primer
1:11:19 Try them out!
===== About “The Technium” =====
The Technium is a weekly podcast discussing the edge of technology and what we can build with it. Each week, Sri and Wil introduce a big idea in the future of computing and extrapolate the effect it will have on the world.
Follow us for new videos every week on web3, cryptocurrency, programming languages, machine learning, artificial intelligence, and more!
===== Socials =====