Dwarkesh Patel
And when intrepid researchers do try to piece together the implications from scarce public data points, they get pretty bearish results.
For example, Toby Ord has a great post where he cleverly connects the dots between the different O-series benchmarks.
And this suggested to him that, quote, we need something like a million X scale up in total RL compute to give a boost similar to a single GPT level, end quote.
So people have spent a lot of time talking about the possibility of a software-only singularity, where AI models will write the code that generates a smarter successor system, or a software plus hardware singularity, where AIs also improve their successor's computing hardware.
However, all these scenarios neglect what I think will be the main driver of further improvements atop AGI, continual learning.
Again, think about how humans become more capable at anything.
It's mostly from experience in the relevant domain.
In a conversation, Baron Millage made this interesting suggestion: the future might look like continual learning agents who all go out to do different jobs and generate value.
Then they bring all their learnings back to the hive mind model, which does some kind of batch distillation on all of these agents.
The agents themselves could be quite specialized, containing what Karpathy called the cognitive core, plus knowledge and skills relevant to the job they're being deployed to do.
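To make the hive-mind idea concrete, here is a toy sketch in Python. All the names (`Agent`, `HiveMind`, `batch_distill`) are hypothetical, and "knowledge" is just a dict rather than neural network weights; it only illustrates the loop of specialized agents learning on the job and a central model batch-distilling their experience back into a shared core.

```python
# Toy sketch (all names hypothetical) of the pattern described above:
# specialized agents = shared cognitive core + job-specific skills,
# and a central model that periodically batch-distills their learnings.
from dataclasses import dataclass, field


@dataclass
class Agent:
    job: str
    core: dict                                   # shared "cognitive core" (by reference)
    skills: dict = field(default_factory=dict)   # job-specific knowledge
    trajectory: list = field(default_factory=list)

    def work(self, task: str, outcome: str) -> None:
        # On-the-job experience: record what was tried and what happened.
        self.trajectory.append((task, outcome))
        self.skills[task] = outcome              # the agent also learns locally


class HiveMind:
    def __init__(self) -> None:
        self.core: dict = {}                     # distilled knowledge shared by all agents

    def spawn(self, job: str) -> Agent:
        return Agent(job=job, core=self.core)

    def batch_distill(self, agents: list) -> None:
        # Merge every agent's trajectory into the shared core, then clear
        # the trajectories (their experience has been "distilled").
        for agent in agents:
            for task, outcome in agent.trajectory:
                self.core[(agent.job, task)] = outcome
            agent.trajectory.clear()


hive = HiveMind()
coder = hive.spawn("coding")
lawyer = hive.spawn("law")
coder.work("fix flaky test", "pin the random seed")
lawyer.work("draft NDA", "use mutual confidentiality clause")
hive.batch_distill([coder, lawyer])
print(len(hive.core))  # prints 2: both agents' learnings are now shared
```

In a real system the distillation step would be a training run over agent trajectories, not a dict merge, but the data flow (deploy, specialize, collect, distill, redeploy) is the part the passage is describing.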
Solving continual learning won't be a singular one-and-done achievement.
Instead, it will feel like solving in-context learning.
Now, GPT-3 already demonstrated in-context learning could be very powerful in 2020.
Its in-context learning capabilities were so remarkable, the title of the GPT-3 paper was Language Models Are Few-Shot Learners.
But of course, we didn't solve in-context learning when GPT-3 came out.
And indeed, there's plenty of progress that still has to be made, from comprehension to context length.
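For readers who haven't seen it spelled out, "few-shot, in-context learning" means the task is demonstrated inside the prompt itself, with no weight updates. A minimal sketch of such a prompt (the translation examples are the kind used in the GPT-3 paper; no real model call is made here):

```python
# A few-shot prompt: a handful of demonstrations followed by a new query.
# A GPT-3-style model would infer the task (English -> French translation)
# purely from these in-context examples and complete the last line.
few_shot_prompt = """Translate English to French.
sea otter => loutre de mer
peppermint => menthe poivree
cheese =>"""

# Count the demonstration/query slots in the prompt.
print(few_shot_prompt.count("=>"))  # prints 3
```

The point of the passage stands: this capability was striking in 2020, yet it kept improving for years afterward, which is the trajectory to expect for continual learning too.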
I expect a similar progression with continual learning.
Labs will probably release something next year which they call continual learning, and which will in fact count as progress towards continual learning.
But human-level, on-the-job learning may take another five to ten years to iron out.
This is why I don't expect some kind of runaway gains from the first model that cracks continual learning, even as it gets more widely deployed and more capable.