Noam Shazeer

👤 Person
692 total appearances

Podcast Appearances

Dwarkesh Podcast
Jeff Dean & Noam Shazeer – 25 years at Google: from PageRank to AGI

I'm like, this is great.

When are we going to launch it?

And he's like, oh, well, we can't launch this.

It's not really very practical because it takes 12 hours to translate a sentence.

I'm like, well, that seems like a long time.

How could we fix that?

So it turned out they had not really designed it for high throughput, obviously.

And so it was doing like 100,000 disk seeks in a large language model that they'd sort of computed statistics over.

I wouldn't say train, really.

And, you know, for each word that it wanted to translate.

So, like, obviously doing 100,000 disk seeks is not super speedy.
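A rough back-of-envelope check shows how 100,000 disk seeks per word could add up to hours per sentence. The seek time and sentence length below are assumptions chosen for illustration, not figures from the conversation; only the seek count comes from the transcript.

```python
# Back-of-envelope latency estimate for seek-bound translation.
SEEKS_PER_WORD = 100_000   # figure quoted in the transcript
SEEK_MS = 10               # assumption: ~10 ms per random seek on a spinning disk
WORDS_PER_SENTENCE = 40    # assumption: a fairly long sentence


def sentence_latency_hours(seeks_per_word, seek_ms, words):
    """Hours needed if every seek is serial and nothing is cached."""
    seconds_per_word = seeks_per_word * seek_ms / 1000
    return seconds_per_word * words / 3600


hours = sentence_latency_hours(SEEKS_PER_WORD, SEEK_MS, WORDS_PER_SENTENCE)
print(f"{hours:.1f} hours per sentence")  # prints "11.1 hours per sentence"
```

Under these assumed numbers the estimate lands in the same range as the 12 hours mentioned earlier, which is why moving the data into memory matters so much.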

But I said, okay, well, let's dive into this.

And so I spent about two or three months with them designing an in-memory compressed representation of n-gram data.

And we were using – an n-gram is basically statistics for how often every n-word sequence occurs in a large corpus.

So you basically have – in this case, we had like two trillion words.

And most n-gram models of the day were like using 2-grams or maybe 3-grams.

But we decided we would use 5-grams.

So how often every five-word sequence occurs in basically as much of the web as we could process in that day.

And then you have a data structure that says, okay, "I really like this restaurant" occurs 17 times on the web or something.
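To make the idea of n-gram statistics concrete, here is a minimal sketch of counting every five-word sequence in a toy corpus. The function name `count_ngrams` and the sample text are hypothetical, and a plain `Counter` stands in for whatever was actually built; this only illustrates what the counts mean.

```python
from collections import Counter


def count_ngrams(tokens, n=5):
    """Count how often each run of n consecutive tokens occurs."""
    return Counter(
        tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )


# Toy corpus; the real one was described as roughly two trillion words.
corpus = (
    "i really like this restaurant and "
    "i really like this restaurant too"
).split()

counts = count_ngrams(corpus, n=5)
print(counts[("i", "really", "like", "this", "restaurant")])  # prints 2
```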

And so I built a data structure that would let you store all those in memory on 200 machines and then have sort of a batched API where you could say, here are the 100,000 things I need to look up in this round for this word, and it would give you them all back in parallel.
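The batched, sharded lookup described here might be sketched as follows. This is an illustration under stated assumptions, not the system that was built: the class `ShardedNgramStore`, the CRC32-based sharding, and the thread pool standing in for separate machines are all hypothetical, and the compression of the in-memory representation is left out entirely.

```python
import zlib
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

NUM_SHARDS = 4  # stand-in for the 200 machines mentioned in the transcript


def shard_of(ngram, num_shards=NUM_SHARDS):
    """Stable hash so a given n-gram always maps to the same shard."""
    return zlib.crc32(" ".join(ngram).encode("utf-8")) % num_shards


class ShardedNgramStore:
    """In-memory n-gram counts split across shards (hypothetical design)."""

    def __init__(self, counts, num_shards=NUM_SHARDS):
        self.num_shards = num_shards
        self.shards = [dict() for _ in range(num_shards)]
        for ngram, count in counts.items():
            self.shards[shard_of(ngram, num_shards)][ngram] = count

    def _lookup_shard(self, shard_id, ngrams):
        # In a real deployment this would be one RPC to one machine.
        table = self.shards[shard_id]
        return [table.get(g, 0) for g in ngrams]

    def batch_lookup(self, ngrams):
        """Look up many n-grams in one round; results keep input order."""
        by_shard = defaultdict(list)
        for pos, g in enumerate(ngrams):
            by_shard[shard_of(g, self.num_shards)].append((pos, g))

        results = [0] * len(ngrams)
        with ThreadPoolExecutor(max_workers=self.num_shards) as pool:
            futures = {
                sid: pool.submit(self._lookup_shard, sid, [g for _, g in items])
                for sid, items in by_shard.items()
            }
            for sid, future in futures.items():
                for (pos, _), count in zip(by_shard[sid], future.result()):
                    results[pos] = count
        return results


store = ShardedNgramStore({("i", "really", "like", "this", "restaurant"): 17})
batch = [
    ("i", "really", "like", "this", "restaurant"),
    ("never", "seen", "this", "one", "before"),
]
print(store.batch_lookup(batch))  # prints [17, 0]
```

Grouping the batch by shard means each shard is contacted once per round, however many of the 100,000 lookups it owns, which is the point of a batched API over issuing lookups one at a time.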