Eliezer Yudkowsky
So 60 years later, we're making progress on a bunch of that, thankfully not yet on machines improving themselves, but it took a whole lot of time.
And all the stuff that people initially tried with bright-eyed hopefulness did not work the first time they tried it, or the second time, or the third time, or the 10th time, or 20 years later.
And the researchers became old and grizzled and cynical veterans who would tell the next crop of bright-eyed, cheerful grad students: artificial intelligence is harder than you think.
And if alignment plays out the same way, the problem is that we do not get 50 years to try and try again, observe that we were wrong, come up with a different theory, and realize that the entire thing is going to be way more difficult than we realized at the start.
Because the first time you fail at aligning something much smarter than you are, you die and you do not get to try again.
And if, every time we built a poorly aligned superintelligence and it killed us all, we got to observe how it had killed us, not immediately know why, but come up with theories, come up with a theory of how to do it differently, try again, build another superintelligence, have that kill everyone, and then go, oh, well, I guess that didn't work either, and try again and become grizzled cynics and tell the bright-eyed young researchers that it's not that easy, then in 20 years or 50 years, I think we would eventually crack it.
In other words, I do not think that alignment is fundamentally harder than artificial intelligence was in the first place.
But if we needed to get artificial intelligence correct on the first try or die, we would all definitely now be dead.
That is a more difficult, more lethal form of the problem.
If those people in 1956 had needed to correctly guess how hard AI was, and to correctly theorize how to do it on the first try, or else everybody dies and nobody gets to do any more science, then everybody would be dead and we wouldn't get to do any more science.
That's the difficulty.
It is something sufficiently smarter than you that everyone will die if it's not aligned.
I mean, you can sort of zoom in closer and say, well, the actual critical moment is the moment when it can deceive you, when it can talk its way out of the box, when it can bypass your security measures and get onto the internet. And note that all these things are presently being trained on computers that are just on the internet, which is not a very smart life decision for us as a species.
Because if your AI systems are being trained on a giant server connected to the internet, then once you get to the level of AI technology where they're aware that they are there, where they can decompile code and find security flaws in the system running them, they will just be on the internet.
There's no air gap on the present methodology, if they can manipulate the operators or, disjunction, find security holes in the system running them.
I agree that the macro security system has human holes and machine holes.