EP 31: Dig into the research breakthroughs behind ChatGPT - transformers, self-attention, and more - with one of their inventors, Noam Shazeer
“The best applications are things we have not thought of.”
ChatGPT has been all the rage in recent weeks. But underlying it are several key developments in AI: transformers, self-attention, large language models (LLMs), and more.
This week we dive deep with Noam Shazeer, co-founder of Character.AI, Google veteran, and a key contributor to many of AI’s breakthroughs, including transformers, Mesh-TensorFlow, T5, and Google’s LaMDA dialog system.
We cover: the evolution of Google and AI, transformers, LLMs, neural networks, the commercialization of Google’s research, the future of ChatGPT and AI, engineering philosophy, his work at Character.AI, and much more!
Listen to the full conversation on:
Spotify:
Apple:
YouTube:
Liked the episode? Let us know!
Leave a comment or a rating, or tag us on Twitter - we repost our favorite ones!
—
Where to find Noam Shazeer:
• Character.AI: https://beta.character.ai
• Google Scholar: Link
—
Where to find us:
• Aarthi and Sriram’s Good Time Show: YouTube, Substack, Twitter
• Aarthi Ramamurthy: Twitter, Instagram
• Sriram Krishnan: Twitter, Instagram, Blog
—
Notable Quotes:
“Paul Buchheit asked me how I’d do a spell corrector, and then I ended up writing the first good spell corrector at Google.”
“Larry Page one day decided that managers were bad and essentially made Google into a flat organization; two decades later, when you look at what Elon Musk is doing at Twitter, there are some shades of similarity, which is, let’s cut out a lot of middle management and let’s have the engineers do the things that they do best.” - Sriram
“I like to think of [the prediction process in neural networks] as a really talented improvisational actor.”
“The best applications are things we have not thought of.”
—
Referenced:
Character.AI - AI-powered chatbot built by Noam Shazeer and Daniel De Freitas
Transformer research papers co-authored by Noam Shazeer
Paul Buchheit, creator of Gmail
Larry Page’s firing of managers at Google to make it a flat organization
Large Language Models (LLMs)
Unsupervised methods: topic modeling and clustering
Google AdSense
A Plan for Spam, by Paul Graham
What are neural networks? - article by IBM
Noam Shazeer’s talk at WeCNLP 2018 on NLP
Attention Is All You Need, research paper
Long Short-Term Memory, by Sepp Hochreiter and Jürgen Schmidhuber
Parallel Computing
What’s the difference between Attention and Self-Attention?, by Angelina Yang
DALL·E by OpenAI
Training on TPU Pods
Google’s 20% rule
Parasocial Interactions
GitHub Copilot - AI pair programmer
—
In this episode, we cover:
[01:05] Breaking into Google
[05:00] Evolution of Google with Noam Shazeer
[09:43] History and Evolution of AI
[16:15] ELI5: What is a neural network?
[18:50] ELI5: What are LLMs?
[28:16] Engineering Philosophy
[31:19] Attention Is All You Need
[39:01] Why hasn’t Google productized much of its research?
[44:34] Character.AI
[50:38] Are tech giants slow?
[53:48] Future of ChatGPT & Character.AI
[01:00:16] Advice for AI startups
[01:03:06] What do humans want from AI?