Dwarkesh Patel

👤 Person
12212 total appearances

Podcast Appearances

Dwarkesh Podcast
Andrej Karpathy — AGI is still a decade away

It's something much more deliberate and rich is happening.

What is the ML analogy and how does that compare to what we're doing with other ones right now?

But you're so good at coming up with evocative phrases.

Sucking supervision through a straw is, like, so good.

Why hasn't—so you're saying, like, your problem with outcome-based reward is that you have this huge trajectory, and then at the end, you're trying to learn every single possible thing about what you should do and what you should learn about the world from that one final bit.

Why hasn't—given the fact that this is obvious—why hasn't process-based supervision as an alternative been a successful way to make models more capable?

What has been preventing us from using this alternative paradigm?

You're basically training the LLM to be a prompt injection model.

So to the extent you think this is the bottleneck to making RL more functional, then that will require making LLMs better judges if you want to do this in an automated way.

And then so is it just going to be like some sort of GAN-like approach where you had to train models to be more robust?

Interesting.

Do you have some shape of what the other idea could be?

Yeah.

So I guess I see a very, not easy, but like I can conceptualize how you would be able to train on synthetic examples or synthetic problems that you have made for yourself.

But there seems to be another thing humans do, maybe sleep is this, maybe daydreaming is this, which is not necessarily come up with fake problems, but just like reflect.

Yeah.

And I'm not sure what the ML analogy for, you know, daydreaming or sleeping, but just like just reflecting.

I haven't come up with a new problem.

Yeah, yeah.