Noam Shazeer

Dwarkesh Podcast

Jeff Dean & Noam Shazeer – 25 years at Google: from PageRank to AGI

And that should vary by like factors of 10,000 for really easy things and really hard things, maybe even a million.

And it might be iterative.

Like you might make a pass through the model, get some stuff, and then decide you now need to call on some other parts of the model as another aspect of it.

The other thing I would say is like this sounds super complicated to deploy because it's like this weird, you know, constantly evolving thing with maybe not super optimized ways of communicating between pieces.

But you can always distill from that, right?

Like, so if you say this is the kind of task I really care about, let me distill from this giant, kind of organic-y thing into something that I know can be served really efficiently.

And you could do that distillation process, you know, whenever you want, once a day, once an hour.

And that seems like it'd be kind of good.
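
For readers who have not seen distillation written down, here is a minimal sketch of the usual objective: a small "student" is trained to match the softened output distribution of a large "teacher". This is an illustration under assumptions, not the system described in the conversation. The function names, the temperature value, and the choice of KL divergence as the loss are all picked for the example.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened distributions.

    Hypothetical helper for illustration: zero when the student matches
    the teacher exactly, positive otherwise.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(
        pi * (math.log(pi) - math.log(qi))
        for pi, qi in zip(p, q)
        if pi > 0
    )


if __name__ == "__main__":
    teacher = [1.0, 2.0, 3.0]
    print(distillation_loss(teacher, [1.0, 2.0, 3.0]))  # student agrees
    print(distillation_loss(teacher, [3.0, 2.0, 1.0]))  # student disagrees
```

Re-running the distillation "once a day, once an hour" would then amount to minimizing this loss again against fresh teacher outputs on the task of interest.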

Yeah.

A related thing is I feel like we need interesting learning techniques during pre-training.

Like I'm not sure we're extracting the maximal value from every token we look at with the current training objective, right?

Like maybe we should think a lot harder about some tokens.

You know, when you get to "the answer is," maybe the model should, at training time, do a lot more work than when it gets to "the."

Yeah, every which way.

Hide some stuff this way, hide some stuff that way.

Make it infer from partial information.

These kinds of things.

I think people have been doing this in vision models for a while.
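
As a rough illustration of "hide some stuff, make it infer from partial information," the sketch below builds one masked training example from a token sequence: a random subset of positions is hidden, and the hidden tokens become the prediction targets. The `MASK` placeholder, the function name, and the 30% default are assumptions made for the example; the speakers do not specify any particular masking scheme.

```python
import random

MASK = "<mask>"  # hypothetical placeholder token, chosen for this sketch


def mask_tokens(tokens, mask_fraction=0.3, seed=None):
    """Hide a random subset of tokens; return (inputs, targets).

    inputs  : the sequence with hidden positions replaced by MASK
    targets : {position: original token} for every hidden position
    """
    if not tokens:
        return [], {}
    rng = random.Random(seed)
    # Hide at least one token, and never more than the sequence holds.
    n_hide = min(len(tokens), max(1, round(len(tokens) * mask_fraction)))
    hidden = set(rng.sample(range(len(tokens)), n_hide))
    inputs = [MASK if i in hidden else tok for i, tok in enumerate(tokens)]
    targets = {i: tokens[i] for i in sorted(hidden)}
    return inputs, targets


if __name__ == "__main__":
    sentence = "the model should infer the hidden words from context".split()
    # Different seeds hide different positions: "this way" and "that way".
    for seed in (0, 1):
        inputs, targets = mask_tokens(sentence, seed=seed)
        print(inputs, targets)
```

Calling it with different seeds on the same sequence yields different views of the same data, which is one simple way to get more than one training signal out of each token.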
