Jeff Dean
Podcast Appearances
So there is a huge amount of headroom there to say, OK, what if we make this thing more expensive but smarter? Because we're like 100x cheaper than reading a paperback.
We're like 10,000 times cheaper than talking to a customer support agent.
We're like a million times or more cheaper than hiring a software engineer or talking to your doctor or lawyer.
Add computation and make it smarter.
I think a lot of the takeoff that we're going to see in the very near future is of this form.
We've been exploiting and improving pre-training a lot in the past and post-training.
Those things will continue to improve, but taking advantage of thinking harder at inference time is going to just be an explosion.
Well, we're working out the algorithms as we speak.
So I believe...
You know, we'll see better and better solutions to this as the many more than 10,000 researchers at Google are hacking at it.
In general, transformers can use the sequence length as a batch during training, but they can't really at inference, because you're generating one token at a time.
So there may be different hardware and inference algorithms that we design for the purposes of being efficient at inference.
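To make that contrast concrete, here is a minimal sketch (my own illustration, not any particular production system) of why training gets to batch over the sequence while autoregressive inference does not; the dimensions are illustrative and the causal mask is omitted for brevity.

```python
# Sketch: training batches over the whole sequence; inference decodes
# one token at a time against a growing KV cache.
import jax
import jax.numpy as jnp

D = 16     # model width (illustrative)
SEQ = 8    # sequence length (illustrative)

def attention(q, k, v):
    # q: [tq, D], k/v: [tk, D] -> [tq, D]
    scores = q @ k.T / jnp.sqrt(D)
    return jax.nn.softmax(scores, axis=-1) @ v

key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (SEQ, D))    # stand-in for projected q = k = v

# Training: the whole sequence acts like a batch -- one big matmul,
# good accelerator utilization (causal mask omitted in this sketch).
out_train = attention(x, x, x)

# Inference: generate one token at a time; each step attends only to
# the cached keys/values of what has been produced so far.
kv_cache = x[:1]                         # start with one "prompt" token
outputs = []
for t in range(1, SEQ):
    q_t = x[t:t+1]                       # the single new token's query
    outputs.append(attention(q_t, kv_cache, kv_cache))
    kv_cache = jnp.concatenate([kv_cache, x[t:t+1]], axis=0)
# Each step is a tiny matmul against the cache -- far lower arithmetic
# intensity than the batched training-time computation above.
```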
Yeah.
Basically, the big model is being used as a verifier as opposed to a generator, and verification you can do in parallel.
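A minimal sketch of that generator/verifier split, in the spirit of speculative decoding (my illustration, not a description of any specific system): the hypothetical `small_logits` and `big_logits` stand in for a cheap draft model and an expensive verifier, and the point is that the big model scores the whole draft in one batched call.

```python
# Sketch: small model drafts k tokens serially; big model verifies
# all k draft positions in a single batched forward pass.
import jax
import jax.numpy as jnp

VOCAB = 100
key = jax.random.PRNGKey(0)

def small_logits(prefix):
    # Hypothetical cheap draft model (toy random scorer here).
    return jax.random.normal(jax.random.PRNGKey(len(prefix)), (VOCAB,))

def big_logits(prefix_batch):
    # Hypothetical expensive verifier: scores all draft positions in one
    # batched call, which restores inference-time parallelism.
    return jax.random.normal(key, (len(prefix_batch), VOCAB))

def speculate(prefix, k=4):
    # 1) Draft k tokens greedily with the small model, one at a time.
    draft = []
    for _ in range(k):
        tok = int(jnp.argmax(small_logits(prefix + draft)))
        draft.append(tok)
    # 2) Verify the whole draft with one batched big-model call:
    #    position i checks the big model's prediction after prefix + draft[:i].
    contexts = [prefix + draft[:i] for i in range(k)]
    big_preds = jnp.argmax(big_logits(contexts), axis=-1)
    # 3) Accept the longest prefix of the draft the big model agrees with.
    accepted = []
    for i, tok in enumerate(draft):
        if int(big_preds[i]) != tok:
            break
        accepted.append(tok)
    return accepted

print(speculate([1, 2, 3]))   # toy models rarely agree; structure is the point
```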
As long as you can sync all of the parameters of the model across the different data centers and then accumulate all the gradients in the time it takes to do one step, you're pretty good.
In practice it works.
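A minimal sketch of that kind of synchronous step in JAX, assuming a toy scalar-parameter model: each replica computes a local gradient, an all-reduce (jax.lax.pmean) averages them, and every replica applies the same update once per step.

```python
# Sketch: synchronous data parallelism with gradient all-reduce.
from functools import partial
import jax
import jax.numpy as jnp

n_dev = jax.local_device_count()

def loss(w, x, y):
    # Toy model: predict y = w * x, mean squared error.
    return jnp.mean((w * x - y) ** 2)

@partial(jax.pmap, axis_name="batch")   # one replica per device
def sync_step(w, x, y):
    g = jax.grad(loss)(w, x, y)
    # Synchronous all-reduce: every replica sees the same averaged gradient,
    # so every replica applies an identical, reproducible update each step.
    g = jax.lax.pmean(g, axis_name="batch")
    return w - 0.1 * g

w = jnp.zeros((n_dev,))                                          # parameter replicated per device
x = jnp.arange(n_dev * 4, dtype=jnp.float32).reshape(n_dev, 4)   # per-replica data shard
y = 3.0 * x
w = sync_step(w, x, y)                                           # one synchronized training step
```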
It was so pleasant to go from async to sync because your experiments are now replicable.
Rather than your result depending on whether there was a web crawler running on the same machine as one of your computers. So I am so much happier running on TPU pods.
I love asynchrony. It just lets you scale to iPhones and Xboxes or whatever.
Yeah. What if we could give you asynchronous but replicable results?
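A minimal sketch of why the asynchronous version is hard to replicate: with stale gradients, the final parameters depend on whether a worker happens to read before or after a peer's write, which is exactly the kind of timing a stray process on the same machine can perturb. The two-worker setup below is my own illustration.

```python
# Sketch: the same asynchronous updates, applied with different read/write
# interleavings, produce different final parameters.
def grad(w):
    # Gradient of (w - 1)^2 / 2: pulls w toward 1.
    return w - 1.0

def async_run(b_reads_before_a_writes):
    # Two workers sharing one parameter. In a real async system the
    # interleaving is decided by machine timing; here it is an argument.
    w, lr = 0.0, 0.5
    g_a = grad(w)                  # worker A reads w = 0.0
    if b_reads_before_a_writes:
        g_b = grad(w)              # B also reads the stale w = 0.0
        w -= lr * g_a              # A's update lands afterwards
    else:
        w -= lr * g_a              # A's update lands first
        g_b = grad(w)              # B reads the fresher w = 0.5
    w -= lr * g_b
    return w

print(async_run(True), async_run(False))   # 1.0 vs 0.75: same code, different result
```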