Sholto Douglas
And yeah.
Yeah.
Another dinner party question.
Should we be less worried about misalignment (and maybe that's not even the right word for what I'm referring to) but, like, just alienness and strangeness from these models, given that there is feature universality, and there are certain ways of thinking and of understanding the world that are instrumentally useful to different kinds of intelligences?
Should we just be less worried about, like, bizarro paperclip maximizers as a result?
It has a denser representation of regions that are particularly relevant to predicting the next token.
That particular example... I wonder if that implies that doing interpretability on smarter models will be harder, because it requires somebody with esoteric knowledge who just happened to notice that Base64 has, I don't know, whatever that distinction was.
Doesn't that imply that when you have a million-line pull request, no human is going to be able to decode, like, the two different reasons why the pull request exists, the two different features for this pull request?
Yeah, you know what I mean?
And that's when you type a comment like, "Small CLs, please."
...thing between models where you have millions of features, potentially, for GPT-6, and a bunch of models are just trying to figure out what each of these features means.
Does that sound right?
Yeah.
I want to talk more about the feature splitting because I think that's an interesting thing that has been under-explored.
First of all, how do we even think about it? Is it really just that you can keep going down and down, and there's no end to the number of features?
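For readers unfamiliar with the setup being discussed: feature splitting shows up in dictionary-learning interpretability, where a sparse autoencoder (SAE) maps a model's activations into a wider, sparse feature space, and widening the dictionary tends to split one coarse feature into several finer ones. Here is a minimal sketch of the SAE forward pass under that framing; all names, sizes, and the random weights are illustrative, not taken from any real model.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model = 8        # width of the activation vector being dictionary-learned (hypothetical)
n_features = 32    # dictionary size; retraining with a larger value is where splitting appears

# Random weights stand in for trained encoder/decoder matrices.
W_enc = rng.normal(size=(d_model, n_features))
b_enc = np.zeros(n_features)
W_dec = rng.normal(size=(n_features, d_model))
b_dec = np.zeros(d_model)

def sae_forward(x):
    """Encode an activation vector into sparse, nonnegative features, then reconstruct it."""
    f = np.maximum(0.0, x @ W_enc + b_enc)  # ReLU keeps feature activations nonnegative
    x_hat = f @ W_dec + b_dec               # reconstruction from the feature dictionary
    return f, x_hat

x = rng.normal(size=d_model)
features, reconstruction = sae_forward(x)
print(features.shape, reconstruction.shape)  # (32,) (8,)
```

The "going down and down" question then becomes: as `n_features` grows, do you keep uncovering genuinely finer-grained features, or is there a floor?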