Ross Douthat
Podcast Appearances
But from AI models.
And it just seems like
Tell me if I'm wrong about this.
In a world that has multiplying AI agents working on behalf of people, millions upon millions who are being given access to bank accounts, email accounts, passwords, and so on, you're just going to have essentially some kind of misalignment, and a bunch of AI are going to decide (decide might be the wrong word), but they're going to talk themselves into taking down the power grid on the West Coast or something.
Won't that happen?
And the scale of it.
And tell me if I'm misunderstanding the sort of technological reality here, right?
But if you have AI agents that have been trained and officially aligned with human values, whatever those values may be, but you have millions of them operating in digital space and interacting with other agents, right?
Yes.
How fixed is that alignment?
To what extent can agents change and de-align in that context, right?
Or in the future, when they're learning more continuously.
So I'm actually a bit... See, to me, that seems like the terrain where it becomes just...
Again, not impossible to stop the end of the world, but impossible to stop sort of punctuated terrorist things.
One of the things that you've tried to do is literally write a constitution, a long constitution for your AI.
What is that?
So if you read the U.S. Constitution, it doesn't read like that.