Noam Shazeer

692 total appearances

Podcast Appearances

Dwarkesh Podcast
Jeff Dean & Noam Shazeer – 25 years at Google: from PageRank to AGI

Could it attend to all your personal information for you?

I would love a model that has access to all my emails and all my documents and all my photos.

And when I ask it to do something, it can sort of make use of that with my permission to sort of help solve what it is I'm wanting it to do.

But that's going to be a big computational challenge because the naive attention algorithm is quadratic.

And you can kind of barely make it work on a fair bit of hardware for millions of tokens, but there's no hope of making that just naively go to trillions of tokens.
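To make the quadratic cost concrete, here is a rough back-of-the-envelope sketch. It assumes one attention score per pair of tokens, stored as a 16-bit value, for a single head and a single layer; the function name and the byte width are illustrative choices, not figures from the conversation.

```python
def attention_score_bytes(num_tokens: int, bytes_per_score: int = 2) -> int:
    """Bytes needed to hold one num_tokens x num_tokens score matrix.

    Assumption: one score per (query, key) pair, 2 bytes each (16-bit).
    """
    return num_tokens * num_tokens * bytes_per_score


# Thousand, million, and trillion tokens of context.
for n in (1_000, 1_000_000, 1_000_000_000_000):
    print(f"{n:>17,} tokens -> {attention_score_bytes(n):.1e} bytes")
```

Under these assumptions a million tokens already implies about 2e12 bytes (roughly two terabytes) per score matrix, and a trillion tokens implies about 2e24 bytes. Practical implementations avoid storing the full matrix, but the number of query-key comparisons still grows with the square of the context length.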

So we need a whole bunch of interesting algorithmic approximations to what you would really want: a way for the model to attend, kind of conceptually, to lots and lots more tokens, trillions of tokens, and attend to your tokens.
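The speakers do not name a specific approximation here. As one hypothetical illustration only, the sketch below restricts attention to the k highest-scoring keys for a query instead of all of them. The function names are invented for this example, and it still scores every key once, so a real system would also need some index to find candidates without a full scan.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def full_attention(query, keys, values):
    """Exact attention for one query: weights over every key."""
    weights = softmax([dot(query, k) for k in keys])
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


def top_k_attention(query, keys, values, k):
    """Hypothetical approximation: attend only to the k best-scoring keys."""
    scores = [dot(query, key) for key in keys]
    top = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]
    weights = softmax([scores[i] for i in top])
    dim = len(values[0])
    return [sum(w * values[i][d] for w, i in zip(weights, top)) for d in range(dim)]
```

With k equal to the number of keys this reduces to exact attention; smaller k trades accuracy for less work in the softmax and weighted sum.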

You know, maybe we can put all of the Google code base in context for every Google developer, all the world's source code in context for any open source developer.

That would be amazing.

It would be incredible.

Yeah, you take a word and you blow it up to 10 kilobytes or something.

Oh, to be clear, we have actually already done further training on a Gemini model on our internal code base for our internal developers.

But that's different than attending to all of it, because it sort of stirs together the code base into a bunch of parameters.

And I think having it in context makes things clearer.

But even the sort of further trained model internally is incredibly useful.

Like Sundar, I think, has said that 25% of the characters that we're checking into our code base these days are generated by our AI-based coding models with kind of human kind of...

Yeah.

I mean, I think one of the...

in addition to kind of researchy context, like anytime you're seeing these models used, I think they're able to make software developers more productive because they can kind of take sort of a high-level spec or sentence description of what you want done and give a pretty approximate, you know, pretty reasonable first cut at that.

And so from a research perspective, maybe you can say, I'd really like you to explore this kind of idea similar to the one in this paper, but maybe let's try making it convolutional or something.