Andrej Karpathy

Speaker
3433 total appearances

Podcast Appearances

Dwarkesh Podcast
Andrej Karpathy - AGI is still a decade away

But it doesn't have esoteric knowledge, you know?

Yeah, I don't know that I have a super strong prediction.

I do think that the labs are just being practical.

They have a flops budget and a cost budget.

And it just turns out that pre-training is not where you want to put most of your flops or your cost.

So that's why the models have gotten smaller: the pre-training stage is smaller, et cetera, but they make it up in reinforcement learning, mid-training, and all this kind of stuff that follows.

So they're just being practical in terms of all the stages and how you get the most bang for the buck.

So I guess like forecasting that trend, I think, is quite hard.

I do still expect that there's so much low-hanging fruit.

That's my basic expectation.

Yeah.

And so I have a very wide distribution here.

Probably most part, yeah.

I expect the data sets to get much, much better because when you look at the average data sets, they're extremely terrible.

Like so bad that I don't even know how anything works, to be honest.

Like look at the average example in the training set.

Like factual mistakes, errors, nonsensical things.

Somehow when you do it at scale, the noise washes away and you're left with some of the signal.

So data sets will improve a ton. It's just that everything gets better: our hardware, and all the kernels for running the hardware and maximizing what you get with the hardware. NVIDIA is slowly tuning the actual hardware itself, tensor cores and so on. All that needs to happen and will continue to happen. All the kernels will get better and utilize the chip to the max extent. All the algorithms will probably improve over optimization, architecture, and just all the modeling components of how everything is done and what the algorithms are that we're even training with.

So I do kind of expect just everything to improve.