Sholto Douglas
So let me just ask you: if I have a thousand AI Sholtos or AI Trentons, do you think you get an intelligence explosion?
What does that look like to you?
So you used a couple of concepts together, and it's not self-evident to me that they're the same, but it seems like you were using them interchangeably, so I just want to... One was, well, to work on the Claude codebase and make more modules based on that, they need more context or something, where it seems like the codebase might already be able to fit in the context.
Do you mean, like, actual... Do you mean, like, the context window context or, like, more...
Yeah, the context window context.
So yeah, it seems like now it might just be able to fit.
The thing that's preventing it from making good modules is not the lack of being able to put the code base in there.
But it's not going to be as good as you at coming up with papers just because it can fit the codebase in there.
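(A rough back-of-envelope sketch of this point, with entirely made-up numbers for codebase size and context window length; nothing here comes from the conversation, it just illustrates that "the codebase fits in context" and "it comes up with good modules or papers" are separate questions.)

```python
# Illustrative back-of-envelope check: does a codebase fit in a context window?
# All figures below are assumptions for the sake of the sketch.

lines_of_code = 500_000            # assumed size of a large research codebase
tokens_per_line = 10               # rough average token count per line of source
context_window_tokens = 1_000_000  # assumed long-context model

codebase_tokens = lines_of_code * tokens_per_line
print(f"codebase ~{codebase_tokens:,} tokens vs. window {context_window_tokens:,}: "
      f"fits = {codebase_tokens <= context_window_tokens}")

# Fitting the code in context removes one bottleneck, but it says nothing about
# whether the model will then propose good new modules or research ideas.
```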
No, but it will speed up a lot of the engineering.
In a way that causes an intelligence explosion?
Yeah, I mean, for context, when you released your paper there was a lot of talk on Twitter: alignment is solved, guys, close the curtains.

Yeah, yeah. No, it keeps me up at night how quickly the models are becoming more capable, and just how poor our understanding still is of what's going on.
Yeah, I guess I'm still... Okay, so let's think through the specifics here.
By the time this is happening, we have models that are two to four orders of magnitude bigger, right?
Or at least two to four orders of magnitude bigger in effective compute.
And so this idea that, well, you can run experiments faster or something: in this version of the intelligence explosion, you're having to retrain that model.
Like the recursive self-improvement is different from what might have been imagined 20 years ago where you just rewrite the code.
You actually have to train a new model and that's really expensive.
Not only now, but especially in the future as you keep making these models orders of magnitude bigger.
Doesn't that dampen the possibility of a sort of recursive self-improvement type intelligence explosion?
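(A small illustration of the retraining-cost point, assuming, purely for the sake of argument, that each generation needs roughly 10x the effective compute of the last; the dollar figure is invented.)

```python
# Illustrative only: if each self-improvement step requires retraining a model
# that is ~10x more effective compute than the last, the bill grows by the same
# factor each generation. Both numbers below are assumptions, not real figures.

base_training_cost_usd = 100e6  # assumed cost to train the current generation
compute_multiplier = 10         # assumed per-generation growth in effective compute

for generation in range(1, 4):
    cost = base_training_cost_usd * compute_multiplier ** generation
    print(f"generation +{generation}: ~${cost:,.0f} to retrain")

# Unlike "just rewrite your own code", each recursive-improvement step here
# carries a large and growing training cost, which is the dampening effect
# being raised in the question.
```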
But with all due respect to software engineers, you are not writing a React front end, right?