Noam Shazeer

Dwarkesh Podcast

Jeff Dean & Noam Shazeer – 25 years at Google: from PageRank to AGI

You distort the image, or you hide parts of it, and try to make it guess that it's a bird from this upper corner of the image or the lower left corner of the image.

And that makes the task harder.
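To make the "hide parts of it" idea concrete, here is a minimal sketch of a masking-style augmentation on a toy image. The function name `mask_random_patch`, the list-of-lists image, and the patch size are all assumptions for illustration; the speakers do not describe a specific implementation.

```python
import random

def mask_random_patch(image, patch_h, patch_w, rng, fill=0):
    """Return a copy of `image` with one random patch_h x patch_w block set to `fill`.

    `image` is a hypothetical 2D list of pixel values (rows of equal length).
    """
    height, width = len(image), len(image[0])
    if patch_h > height or patch_w > width:
        raise ValueError("patch must fit inside the image")
    # Pick the top-left corner so the patch stays fully inside the image.
    top = rng.randrange(height - patch_h + 1)
    left = rng.randrange(width - patch_w + 1)
    masked = [row[:] for row in image]  # copy so the original is untouched
    for r in range(top, top + patch_h):
        for c in range(left, left + patch_w):
            masked[r][c] = fill
    return masked

rng = random.Random(0)
image = [[1] * 8 for _ in range(8)]           # toy 8x8 "image" of ones
masked = mask_random_patch(image, 4, 4, rng)  # hide one 4x4 region
```

Because a different region is hidden on each call, the model sees a slightly different, harder version of the same example every time it is presented.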

And I feel like there's an analog for more textual or coding-related data, where you want to force the model to work harder, and you'll get more interesting observations from it.

I mean, dropout was invented on images, but we're mostly not using it for text.

One way you could get a lot more learning in a larger-scale model without overfitting is to just make, like, 100 epochs over the world's text data and use dropout.
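As a rough illustration of the dropout being discussed, the sketch below shows the common "inverted dropout" formulation on a plain list of activations. The function name, the `rate` parameter, and the toy data are assumptions for illustration, not something specified in the conversation.

```python
import random

def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero each activation with probability `rate`
    and scale the survivors by 1 / (1 - rate) so the expected value is unchanged.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(activations)  # no-op at evaluation time
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)
activations = [1.0] * 1000
epoch_1 = dropout(activations, rate=0.5, rng=rng)  # one random mask
epoch_2 = dropout(activations, rate=0.5, rng=rng)  # a fresh mask on the next pass
evaluated = dropout(activations, rate=0.5, rng=rng, training=False)
```

Each pass over the same data draws a fresh random mask, so repeated epochs do not present the network with exactly the same computation, which is why dropout pairs naturally with many-epoch training as a guard against overfitting.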

But that's pretty computationally expensive.

But it does mean we won't run out.

Like, even though people are saying, "Oh no, we're almost out of, like, textual data."