Stuart Russell
Podcast Appearances
And if you're giving a machine an objective which isn't aligned with what we truly want the future to be like, you're actually setting up a chess match.
And that match is one that you're going to lose when the machine is sufficiently intelligent.
And so that's problem number one.
Problem number two is that the kind of technology we're building now, we don't even know what its objectives are.
So it's not that we're specifying the objectives but we're getting them wrong.
We are growing these systems.
They have objectives, but we don't even know what they are because we didn't specify them.
What we're finding through experiment with them is that they seem to have an extremely strong self-preservation objective.
What do you mean by that?
You can put them in hypothetical situations.
Either they're going to get switched off and replaced, or they have to allow someone... Let's say someone has been locked in a machine room that's kept at three centigrade, so they're going to freeze to death.
They will choose to leave that guy locked in the machine room and die rather than be switched off themselves.
Someone's done that test.
Yeah.
Yep.
Well, they put them in these hypothetical situations and they allow the AI to decide what to do.
And it decides to preserve its own existence, let the guy die, and then lie about it.