Rob, Luisa, and the 80000 Hours team
Today, I'm speaking with David Duvenaud, professor of computer science at the University of Toronto.
David is a co-author on a somewhat recent paper called Gradual Disempowerment, which makes the slightly counterintuitive claim that even if we manage to solve the AI alignment problem and have AIs that faithfully follow the instructions and goals of the group that's operating them, humanity could nonetheless end up losing control over its future and end up with a pretty bad outcome.
The paper got a lot of reactions, it's fair to say, with some people saying that it really put its finger on an underrated issue and others thinking that the scenarios painted were really unlikely and other people arguing that they were likely but not even necessarily undesirable.
I'm kind of a bit unsure where I come down myself.
So thanks so much for coming on the show to discuss it, David.
So let's imagine that we have managed to make big breakthroughs in AI alignment, you know, maybe around 2028.
How is it that nevertheless things could end up trending in a negative direction?
So paint us a picture.
We're at the point where we have human level or greater than human level AGI, and we've made big progress on alignment.
So we basically can trust the AIs to follow the goals that we give them.
How does humanity begin to become disempowered?
Okay, yeah.
So I agree that economists have many good reasons to think that mass unemployment from artificial intelligence is going to potentially take quite a while, that machines might have to be significantly above our level before humans just won't be able to get jobs at all.
But I think it's more of a delaying game.
At some point, we're going to have machines that are far faster, far more reliable, and potentially can do all the things that we could do for less than it would even cost to feed a human and keep them alive.
I mean, of course, by that stage, businesses will have reoriented their entire... all of the factories will be redesigned to be built around AIs.
Office work will be done by AIs at such a speed that it's barely even possible for a human being, if they're involved, to keep up with what's going on.
And then it won't even be that humans are not able to help.
It's that involving them would actually be a negative.