Noam Shazeer
How does that flow work when you need a bit more information, and then you want to put it back in the background for it to continue, you know, finding the hotels in Berlin or whatever? I think it's going to be pretty interesting, and inference will be useful. I mean, there's also a compute-efficiency thing in inference that you don't have in training.
Yeah, a good example of an algorithmic improvement is the use of drafter models.
So you have a really small language model that you run one token at a time when you're decoding, and it predicts, say, four tokens.
Then you give those to the big model and say, okay, here are the four tokens the little model came up with; check which ones you agree with.
If you agree with the first three, then you just advance, and you've basically done a four-token step with one parallel computation instead of four serial one-token steps in the big model. Those are the kinds of things people are looking at to improve inference efficiency.
So you don't have this single-token decode bottleneck.
So the little model proposes, "Hello, how are you?" and the big model goes, "That sounds great to me. I'm going to advance past that."
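A minimal sketch of that accept-or-advance loop, assuming hypothetical draft_model and big_model callables (the names and signatures are illustrative, not any real API):

```python
# Sketch of decoding with a drafter model (speculative decoding).
# `draft_model` emits one token per call; `big_model` scores a whole
# block of positions in a single parallel forward pass. Both are
# hypothetical stand-ins, not a real library API.

def speculative_decode_step(prefix, draft_model, big_model, k=4):
    """Try to advance the sequence by up to k tokens with one big-model pass."""
    # 1. The small drafter proposes k tokens autoregressively (cheap, serial).
    draft = []
    for _ in range(k):
        draft.append(draft_model(prefix + draft))

    # 2. The big model checks all k positions at once, returning the token
    #    it would itself have produced at each position.
    big_tokens = big_model(prefix, draft)

    # 3. Accept the longest prefix where the two models agree; at the first
    #    disagreement, substitute the big model's token and stop.
    accepted = []
    for proposed, actual in zip(draft, big_tokens):
        if proposed == actual:
            accepted.append(proposed)
        else:
            accepted.append(actual)
            break
    return prefix + accepted
```

The design point is exactly the one in the transcript: the big model's single parallel pass replaces several of its serial decode steps whenever the cheap drafter guesses right.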
I mean, we're already doing it.
So we're pro multi-data-center training.
I think in the Gemini 1.5 tech report, we said we used multiple metro areas, trained with some of the compute in each place, and had a pretty long-latency but high-bandwidth connection between those data centers.
And that works fine.
It's great.
Actually, training is kind of interesting because each step in the training process for a large model usually takes at least a few seconds.
So a latency of, you know, 50 milliseconds between sites doesn't matter that much.
Just the bandwidth.
Yeah, just bandwidth.
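A quick back-of-envelope version of that latency-versus-bandwidth point; the step time, model size, and link speed below are illustrative assumptions, not figures from the conversation:

```python
# Why cross-metro latency is negligible for training: a 50 ms hop is a tiny
# fraction of a multi-second training step, while the gradient traffic each
# step scales with model size, so bandwidth is the binding constraint.
# All numbers are illustrative assumptions.

step_time_s = 2.0    # assumed per-step time for a large model
latency_s = 0.050    # 50 ms between metro areas
print(f"latency overhead: {latency_s / step_time_s:.1%}")   # ~2.5% of a step

params = 100e9       # assumed model size
grad_bytes = params * 2        # fp16 gradients: ~200 GB per step
link_bps = 800e9               # assumed inter-DC link: 800 Gb/s
print(f"gradient exchange: {grad_bytes * 8 / link_bps:.1f} s per step")  # ~2 s
```

Under these assumptions the latency adds a few percent per step, while moving the gradients takes on the order of the step itself, which is why only the bandwidth of the inter-data-center link really matters.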