Sholto Douglas
Podcast Appearances
MARK MANDELMANN- So all this time we're talking about, oh, it's only good at things it's been RL'd on.
Well, it's pretty good at that, because that is a mixture of science and understanding language and coding.
There's this sort of mixture of domains here, all of which you need to understand.
You need to be both a great software engineer and be able to think through language and state of mind, and almost philosophize in some respects, to be an interp agent.
And it is generalizing from the training to do that.
You can also come back to like the World War II question.
You can think of it as like a hierarchy of abstractions of trust here.
Where like let's say you want to go and talk to Churchill.
It helps a lot if you can verify that in that conversation, in that 10 minutes, he's being honest.
And this like enables you to construct better metanarratives of what's going on.
And so maybe particle physics wouldn't help you there.
But certainly, like, the neuroscience of Churchill's brain would help you verify that he was being trustworthy in that conversation and that the soldiers on the front lines were being honest in their description of what happened and this kind of stuff.
So long as you can verify parts of the tree up, then that massively helps you build confidence.
Or if they're like a thousand times less efficient than humans are at learning.
That's right.
And you just like deploy them even still.
And I do think it's worth pressing on that future of, there is this whole spectrum of crazy futures, but the one that I feel we're almost guaranteed to get, and this is almost a strong statement to make, is one where at the very least, you get a drop-in white-collar worker at some point in the next five years.