They don't go off and have lots of different branches for mathy things that don't merge back together with the CAD-image part of the model.
And I think we should probably have a more organic structure in these things.
I also would like it if the pieces of the model could be developed a little bit independently.
Yeah.
Like right now, I think we have this issue where we're going to train a model.
So we do a bunch of preparation work, deciding on the most awesome algorithms and the most awesome data mix we can come up with.
But there's always trade-offs there.
Like, we'd love to include more multilingual data, but that might come at the expense of coding data. And so the model is less good at coding but better at multilingual tasks, or vice versa.
And I think it would be really great if we could have a small set of people who care about a particular subset of languages go off and create really good training data, train a modular piece of a model, and then hook that piece up to a larger model to improve its capability in, say, Southeast Asian languages, or in reasoning about Haskell code or something. Then you also get a nice software engineering benefit: you've decomposed the problem, compared to what we do today, which is a whole bunch of people working, followed by this kind of monolithic process of kicking off pre-training on the model. If we could do that, you could have 100 teams around Google, or people all around the world, working to improve the languages or the particular problems they care about, all collectively working to improve the model.
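A minimal sketch of that kind of plug-in modularity, in Python. The BaseModel and DomainModule classes, the residual update, and the domain routing are all hypothetical illustrations of the idea being described, not any actual Google system:

```python
# Hypothetical sketch: a frozen shared trunk plus independently developed
# domain modules that teams can train and register separately.
import numpy as np

class DomainModule:
    """A small piece of the model owned by one team, trained independently."""
    def __init__(self, name: str, dim: int, seed: int):
        rng = np.random.default_rng(seed)
        self.name = name
        # Stand-in for the module's learned parameters (small init).
        self.weights = rng.standard_normal((dim, dim)) * 0.01

    def forward(self, h: np.ndarray) -> np.ndarray:
        # Residual update, so a newly plugged-in module starts near identity.
        return h + np.tanh(h @ self.weights)

class BaseModel:
    """A frozen shared trunk that modules hook into."""
    def __init__(self, dim: int):
        self.dim = dim
        self.modules: dict[str, DomainModule] = {}

    def register(self, module: DomainModule) -> None:
        # Registering a module by name lets a team ship updates independently.
        self.modules[module.name] = module

    def forward(self, h: np.ndarray, domains: list) -> np.ndarray:
        # Only the modules relevant to this input run; the rest are untouched.
        for name in domains:
            if name in self.modules:
                h = self.modules[name].forward(h)
        return h

base = BaseModel(dim=16)
# Separate teams develop their pieces independently, then register them.
base.register(DomainModule("southeast_asian_langs", dim=16, seed=0))
base.register(DomainModule("haskell", dim=16, seed=1))

x = np.zeros(16)
print(base.forward(x, domains=["haskell"]))
```

The residual form is the design point worth noting: a freshly registered module starts out as a small perturbation of the frozen trunk, so one team's new piece can't silently break another team's domain.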
And that's kind of a form of continual learning.
Yeah, I think there may be ways to get a lot of the benefits of that with a kind of versioned modularity.
Like I have a frozen version of my model.
And then I include a different variant of some particular module and I want to compare its performance or train it a bit more.
And then I compare it to the baseline of this thing with version N-prime of this particular module that does Haskell interpretation swapped in.
And it's also more parallelizable, I think.
Yeah.
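Continuing the hypothetical BaseModel/DomainModule sketch above, the versioned workflow being described might look like the following: the trunk and every other module stay frozen, only the Haskell module is swapped between version N and a candidate N-prime, and both are scored on the same eval set. The version tags and the toy metric are assumptions for illustration:

```python
# Continues the sketch above; everything here is illustrative, not a real API.
import numpy as np

def evaluate(model: BaseModel, eval_set: list, domain: str) -> float:
    # Toy stand-in metric (mean output norm); a real eval would score task accuracy.
    return float(np.mean([np.linalg.norm(model.forward(x, [domain]))
                          for x in eval_set]))

# Keep every trained variant of a module under a version tag.
haskell_versions = {
    "v_N": DomainModule("haskell", dim=16, seed=1),        # frozen baseline version
    "v_N_prime": DomainModule("haskell", dim=16, seed=2),  # candidate, trained a bit more
}

eval_set = [np.random.default_rng(i).standard_normal(16) for i in range(8)]

scores = {}
for tag, module in haskell_versions.items():
    # Swap in just this one module; the trunk and all other modules stay frozen.
    base.register(module)
    scores[tag] = evaluate(base, eval_set, "haskell")

print(scores)  # adopt v_N_prime only if it beats the v_N baseline
```

Because each comparison touches only one module against the same frozen baseline, many such experiments can run side by side, which is the parallelizability point above.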