Dwarkesh
It seems like humans have some solution, but I'm curious: how are they doing it?
And why is it so hard? How do we need to reconceptualize the way we're training models to make something like this possible?
You know, that is a great question to ask.
Nobody listens to this podcast, Ilya.
Yeah.
So I have to say that prepping for Ilya was pretty tough because neither I nor anybody else had any idea what he's working on and what SSI is trying to do.
I had no basis to come up with my questions.
And the only thing I could go off, honestly, was trying to think from first principles about what the bottlenecks to AGI are.
Because clearly Ilya is working on them in some way.
Part of this question involved thinking about RL scaling because everybody's asking how well RL will generalize and how we can make it generalize better.
As part of this, I was reading this paper that came out recently on RL scaling and it showed that actually the learning curve on RL looks like a sigmoid.
I found this very curious.
Why should it be a sigmoid where it learns very little for a long time and then it quickly learns a lot and then it asymptotes?
This is very different from the power law you see in pre-training where the model learns a bunch at the very beginning and then less and less over time.
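As a rough illustration of that contrast, the sketch below compares where each curve shape makes its largest per-step improvement. The functional forms and every parameter (`midpoint`, `steepness`, `scale`, `exponent`) are assumptions chosen for illustration only; they are not taken from the paper mentioned above.

```python
import math

def sigmoid_score(step, midpoint=50.0, steepness=0.2):
    """Illustrative RL-style curve: flat at first, then steep, then saturating."""
    return 1.0 / (1.0 + math.exp(-steepness * (step - midpoint)))

def power_law_loss(step, scale=1.0, exponent=0.5):
    """Illustrative pre-training-style loss: falls fast early, slower later."""
    return scale * step ** (-exponent)

def gains(values):
    """Absolute improvement between each pair of consecutive steps."""
    return [abs(b - a) for a, b in zip(values, values[1:])]

steps = list(range(1, 101))
sigmoid_gains = gains([sigmoid_score(s) for s in steps])
power_gains = gains([power_law_loss(s) for s in steps])

# The step at which each curve improves the most.
sigmoid_peak = steps[sigmoid_gains.index(max(sigmoid_gains))]
power_peak = steps[power_gains.index(max(power_gains))]

print("largest sigmoid gain near step", sigmoid_peak)
print("largest power-law gain near step", power_peak)
```

Under these assumed parameters, the power-law curve improves most at the very first step, while the sigmoid improves most around its midpoint, which matches the qualitative picture of learning little, then a lot, then leveling off.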
And it actually reminded me of a note that I had written down after a conversation with a researcher friend. He pointed out that the number of samples you need to take in order to find a correct answer scales exponentially with how different your current probability distribution is from the target probability distribution.
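One simple way to make the friend's point concrete, under a loudly stated assumption: suppose "how different" is measured as a gap in log probability, so the current model assigns probability p = exp(-gap) to a correct answer. Independent sampling then needs 1/p = exp(gap) draws on average (the mean of a geometric distribution). The names `expected_samples` and `simulate_samples` are hypothetical helpers, and this is only a sketch of the idea, not the formalization from the note or the paper.

```python
import math
import random

def expected_samples(log_prob_gap):
    """Mean number of draws until the first success when p = exp(-gap)."""
    return math.exp(log_prob_gap)

def simulate_samples(success_prob, rng, trials=2000):
    """Empirical mean number of draws until the first success."""
    total = 0
    for _ in range(trials):
        draws = 1
        while rng.random() >= success_prob:
            draws += 1
        total += draws
    return total / trials

rng = random.Random(0)
for gap in (1.0, 2.0, 3.0):
    p = math.exp(-gap)  # assumed probability of sampling a correct answer
    print(gap, round(expected_samples(gap), 2), round(simulate_samples(p, rng), 2))
```

Each unit increase in the assumed gap multiplies the expected number of samples by e, which is the exponential scaling described above.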
And I was thinking about how these two ideas are related.
I had this vague idea that they should be connected, but I really didn't know how.
I don't have a math background, so I couldn't really formalize it.
But I wondered if Gemini 3 could help me out here.
And so I took a picture of my notebook and I took the paper and I put them both in the context of Gemini 3 and I asked it to find the connection.