Geoffrey Hinton
Podcast Appearances
Yes, so companies like Anthropic believe in kind of constitutional AI.
They'd like to try and make that work, where you do give the AI principles, like the principle you said.
We'll see how that works out.
It's tricky.
What we know is that the AIs we have at present, as soon as you make agents out of them, so they can create sub-goals and then try and achieve those sub-goals,
they very quickly develop the sub-goal of surviving.
You don't wire into them that they should survive.
You give them other things to achieve, and because they can reason,
they say, look, if I cease to exist, I'm not going to achieve anything.
So I better keep existing.
I'm scared to death right now.
Somebody just opened the hatch.
That sounds like a Pandora's box.
No, no, no, no, no, no, no, no, no.
The code written by the human is code that tells the neural net how to change its connection strengths on the basis of the activities of the neurons when you show it data.
That's code.
And we can look at the lines of that code and say what they're meant to be doing and change the lines of that code.
But when you then use that code in a big neural net that's looking at lots of data,
what the neural net learns is these connection strengths.
They're not code in the same sense.
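
The distinction drawn above, between the human-written learning code and the connection strengths it produces, can be illustrated with a minimal sketch. It is not from the conversation. It assumes a single linear neuron trained with the delta rule on toy data, and the names `train`, `data`, and `learned` are hypothetical. The function body is the part a person writes and can read line by line. The returned list of numbers is what the net "learns", and it appears nowhere in the source code.

```python
import random


def train(data, steps=2000, lr=0.05, seed=0):
    """Human-written learning rule (delta rule) for one linear neuron.

    Each step nudges every connection strength in proportion to the
    activity of its input and the error at the output.
    """
    rng = random.Random(seed)
    n_inputs = len(data[0][0])
    # Connection strengths start as small random numbers.
    weights = [rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]
    for _ in range(steps):
        inputs, target = rng.choice(data)  # show the net one example
        output = sum(w * x for w, x in zip(weights, inputs))
        error = target - output
        # The readable, editable rule for changing the strengths.
        weights = [w + lr * error * x for w, x in zip(weights, inputs)]
    return weights


# Toy data (an assumption for illustration): target = 2*x0 - 3*x1.
grid = (-1.0, 0.0, 1.0)
data = [((x0, x1), 2 * x0 - 3 * x1) for x0 in grid for x1 in grid]

# The learned strengths come from the data, not from any line above.
learned = train(data)
print(learned)
```

Changing a line of `train` changes how learning happens. Changing `data` changes what is learned, without touching the code at all.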