Joe Carlsmith
That's a key aspect of the scenario, right?
And sometimes people will say this stuff. They'll be like, well, there will be the good AIs, and they'll defeat the bad AIs.
But, you know, notice the assumption in there, which is that you sort of made it the case that you can control some of the AIs, right?
And you've got some good AIs.
And now it's a question of, like, are there enough of them?
And how are they working relative to the others?
And maybe, you know, I think it's possible that that is what happens: we know enough about alignment that some actors are able to do that.
And maybe some actors are less cautious or they are intentionally creating misaligned AIs or God knows what.
But if you don't have that, if everyone is, in some sense, unable to control their AIs, then the "good AIs help with the bad AIs" thing becomes more complicated, or maybe it just doesn't work, because there are no good AIs in this scenario.
If you say everyone is building their own superintelligence that they can't control, it's true that each one is now a check on the power of the other superintelligences. Now the other superintelligences need to deal with other actors, but none of them are necessarily working on behalf of a given set of human interests or anything like that.
So I do think that's a very important difficulty for the very simple thought of, ah, I know what we can do, let's just have lots and lots of AIs so that no single AI has a ton of power. I think that on its own is not enough.
Yeah, I think there are very notable and salient sources of correlation between failures across the different runs, namely that people didn't have a developed science of AI motivations.