Dwarkesh Patel

👤 Speaker

15656 total appearances

Podcast Appearances

Dwarkesh Podcast

Andrej Karpathy — AGI is still a decade away

Andrej, that was great.

8691.726

Yeah, thank you.

8692.887

Thanks.

8693.888

Hey, everybody.

8694.569

I hope you enjoyed that episode.

8696.131

If you did, the most helpful thing you can do is just share it with other people who you think might enjoy it.

8697.752

It's also helpful if you leave a rating or a comment on whatever platform you're listening on.

8703.017

If you're interested in sponsoring the podcast, you can reach out at dwarkesh.com/advertise.

8708.963

Otherwise, I'll see you in the next one.

8716.411

Some thoughts on the Sutton interview

Well, it's not saying that you just want to throw away as much compute as you possibly can.

26.451

The Bitter Lesson says that you want to come up with techniques which most effectively and scalably leverage compute.

31.881

Most of the compute that's spent on an LLM is used in running it during deployment.

38.573

And yet it's not learning anything during this entire period.

42.961

It's only learning during this special phase that we call training.

46.445

And so this is obviously not an effective use of compute.

49.489

And what's even worse is that this training period by itself is highly inefficient because these models are usually trained on the equivalent of tens of thousands of years of human experience.

52.512

And what's more, during this training phase,

63.325

all of their learning is coming straight from human data.

65.928

Now, this is an obvious point in the case of pre-training data, but it's even kind of true for the RLVR that we do with these LLMs.

68.593

These RL environments are human-furnished playgrounds to teach LLMs the specific skills that we have prescribed for them.

76.207