Joe Carlsmith
Podcast Appearances
But I actually think it's possible we messed that up too.
You know, it's kind of an intense project, writing constitutions and structures of rules and stuff that are going to be robust to very intense forms of optimization.
So that's a final one that I'll just flag, which I think comes up even if you've sort of solved all these other problems.
Yeah, totally.
I'm not trying to say, like... Mostly the thing I wanted to do there was just give any... Sure.
Like, giving some sense of what the model's motivations might be.
Like, what are ways this could be?
I mean, as I said, my best guess is that it's partly the, like, alien thing.
And, you know, not necessarily, but insofar as you were also interested in what the model does later, and what sort of future you would expect if models did take over, then, yeah, I think it can at least be helpful to have some set of hypotheses on the table instead of just saying it has some set of motivations.
But in fact, a lot of the work here is being done by our ignorance about what those motivations are.
You know, my best guess when I really think about what I feel good about, and I think this is probably true of a lot of people, is that there's some sort of more organic, decentralized process of incremental civilizational growth.
The type of thing we trust most and the type of thing we have most experience with right now as a civilization is some sort of, okay, we change things a little bit.
There are a lot of processes of adjustment and reaction, and kind of a decentralized sense of what's changing.
You know, was that good?
Was that bad?