Stuart Russell
๐ค SpeakerVoice Profile Active
This person's voice can be automatically recognized across podcast episodes using AI voice matching.
Appearances Over Time
Podcast Appearances
And then he dies of misery and starvation.
And this is, you know, it's a warning.
It's a failure mode that pretty much every culture in history has had some story along the same lines.
You know, there's the genie that gives you three wishes.
And, you know, third wish is always, you know, please undo the first two wishes because I messed up.
And, you know, when Arthur Samuel wrote his checker playing program, which learned to play checkers considerably better than Arthur Samuel could play and actually reached a pretty decent standard.
Norbert Wiener, who was one of the major mathematicians of the 20th century, sort of the father of modern automation control systems,
He saw this and he basically extrapolated, as Turing did, and said, okay, this is how we could lose control.
And specifically that we have to be certain that the purpose we put into the machine is the purpose which we really desire.
And the problem is, we can't do that.
The line is gray between the two.
So theoretically it's possible, but in practice it's extremely unlikely that we could specify correctly in advance the full range of concerns of humanity.
Well, we learn, yeah, I mean, as we grow up, we learn about the values that matter, how things should go, what is reasonable to pursue and what isn't reasonable to pursue.
Yeah, so I think that what we need to do is to get away from this idea that you build an optimizing machine and then you put the objective into it.
Because if it's possible that you might put in a wrong objective, and we already know this is possible because it's happened lots of times, that means that the machine should never take an objective that's given as gospel truth.
Because once it takes the objective as gospel truth, then it believes that whatever actions it's taking in pursuit of that objective are the correct things to do.
So you could be jumping up and down and saying, no, no, no, no, you're going to destroy the world.
But the machine knows what the true objective is and is pursuing it.
And tough luck to you.
And this is not restricted to AI, right?