Noam Shazeer
That seems like a nice trade-off to have, because sometimes you want to think really hard because it's a super important problem.
Sometimes you probably don't want to spend enormous amounts of compute to figure out, you know, what's the answer to 1 plus 1?
Maybe the system should decide to use a calculator tool or something instead of, you know, a very large language model.
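As a rough illustration of that routing idea, here is a minimal Python sketch that sends trivially computable queries down a cheap calculator path and everything else to a model. The names `route_query` and `call_llm` are hypothetical stand-ins, not any real API, and a production router would be learned rather than regex-based.

```python
# Minimal sketch: route trivial arithmetic to a calculator tool instead of an LLM.
# route_query and call_llm are hypothetical stand-ins, not a real API.
import ast
import operator
import re

# Operators the "calculator tool" is allowed to evaluate.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

def _eval_arithmetic(node):
    """Safely evaluate a parsed expression made of numbers and + - * / only."""
    if isinstance(node, ast.Expression):
        return _eval_arithmetic(node.body)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval_arithmetic(node.left),
                                   _eval_arithmetic(node.right))
    raise ValueError("not simple arithmetic")

def call_llm(prompt: str) -> str:
    """Placeholder for an expensive large-model call."""
    return f"<LLM answer for: {prompt!r}>"

def route_query(query: str) -> str:
    """Use the calculator for plain arithmetic; otherwise fall back to the LLM."""
    if re.fullmatch(r"[\d\s\.\+\-\*/\(\)]+", query):
        try:
            return str(_eval_arithmetic(ast.parse(query, mode="eval")))
        except (ValueError, SyntaxError, ZeroDivisionError):
            pass  # malformed "arithmetic" falls through to the model
    return call_llm(query)

print(route_query("1 + 1"))                 # -> 2, no model call needed
print(route_query("Why is the sky blue?"))  # -> routed to the LLM
```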
I mean, I think we do see some examples in our own experimental work where, if you apply more inference-time compute, the answers are better: if you apply 10X, you can get better answers than with X amount of compute at inference time.
And that seems useful and important.
But I think what we would like is, when you apply 10X, to get an even bigger improvement in the quality of the answers than we're getting today.
And so that's about designing new algorithms, trying new approaches, figuring out how best to spend that 10X instead of X to improve things.
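One common way to spend that 10X at inference time is best-of-N sampling: draw several candidate answers and keep the one a scorer likes best. This is a hedged sketch of that general pattern, not the speaker's specific method; `sample_candidate` and `score_answer` are hypothetical stand-ins for a model's stochastic sampler and a learned verifier.

```python
# Sketch of best-of-N sampling: n is the inference-compute knob.
# sample_candidate and score_answer are hypothetical stand-ins.
import random

def sample_candidate(prompt: str, rng: random.Random) -> str:
    """Stand-in for one stochastic decode from the model."""
    return f"candidate-{rng.randint(0, 9999)} for {prompt!r}"

def score_answer(answer: str) -> float:
    """Stand-in for a verifier/reward model; here a toy deterministic score."""
    return float(sum(answer.encode()) % 100)

def best_of_n(prompt: str, n: int, seed: int = 0) -> str:
    """Spend n model calls and keep the highest-scoring answer."""
    rng = random.Random(seed)
    candidates = [sample_candidate(prompt, rng) for _ in range(n)]
    return max(candidates, key=score_answer)

cheap = best_of_n("prove the claim", n=1)    # X compute
better = best_of_n("prove the claim", n=10)  # 10X compute, usually better
```

The open research question in the passage above is exactly how to make the quality gain from n=1 to n=10 steeper than what naive best-of-N gives you.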
I mean, on the question of search, I really like Rich Sutton's paper about the bitter lesson.
And "The Bitter Lesson" is effectively this nice one-page paper.
But the essence of it is that you can try lots of approaches, and the two techniques that are incredibly effective are learning and search.
You can scale both of them, algorithmically or computationally, and you'll often get better results than with any other kind of approach, across a pretty broad variety of problems.
And so I think search has got to be part of the solution to spending more inference time, because you want to maybe explore a few different ways of solving the problem.
And like, oh, that one didn't work, but this one worked better.
So now I'm going to explore that a bit more.
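That explore-and-prune loop is essentially best-first search. Below is a minimal sketch under the assumption that a model can propose next steps (`expand`) and score how promising a partial solution looks (`estimate_value`); both are hypothetical stand-ins, and the expansion budget is the inference-compute knob.

```python
# Sketch of best-first search over partial solutions: expand the most
# promising state, abandon dead ends. expand, estimate_value, and
# is_solution are hypothetical stand-ins for model calls and a verifier.
import heapq

def expand(state: str) -> list[str]:
    """Stand-in: propose a few next steps from a partial solution."""
    return [state + c for c in "ab"] if len(state) < 4 else []

def estimate_value(state: str) -> float:
    """Stand-in: score how promising a partial solution looks (higher is better)."""
    return state.count("a") / max(len(state), 1)

def is_solution(state: str) -> bool:
    return state == "aaaa"  # toy goal for the sketch

def best_first_search(start: str, budget: int) -> str | None:
    """Explore up to `budget` expansions; more budget = more inference compute."""
    frontier = [(-estimate_value(start), start)]  # max-heap via negated scores
    for _ in range(budget):
        if not frontier:
            return None
        _, state = heapq.heappop(frontier)  # most promising partial solution
        if is_solution(state):
            return state
        for child in expand(state):         # try a few different ways forward
            heapq.heappush(frontier, (-estimate_value(child), child))
    return None  # budget exhausted without a verified solution

print(best_first_search("", budget=50))  # -> 'aaaa' with enough budget
```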
I mean, I think one general trend that's clear is that inference-time compute (you have a model that's pretty much already trained, and you want to do inference on it) is going to be a growing and important class of computation, and maybe you want to specialize hardware more around that.
You know, actually, the first TPU was specialized for inference and wasn't really designed for training.