Eliezer Yudkowsky
Podcast Appearances
You can nonetheless imagine that there is this hill climbing process, not like gradient descent, because gradient descent uses calculus.
This is just using like, where are you?
But still hill climbing in both cases, making something better and better over time in steps.
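As a rough illustration of the distinction being drawn here, the sketch below contrasts the two kinds of hill climbing on a toy problem. Everything in it is an assumption made for illustration: the one-dimensional `fitness` function, the step sizes, and the names `hill_climb` and `gradient_ascent` are hypothetical, and neither loop is meant as a model of real evolution or of real neural network training.

```python
import random


def fitness(x):
    # Hypothetical objective with a single peak at x = 3.
    return -(x - 3.0) ** 2


def hill_climb(x, steps=2000, step_size=0.1, seed=0):
    # Mutation and selection: propose a random nearby variant and
    # keep it only if it scores better. No derivatives are used.
    rng = random.Random(seed)
    for _ in range(steps):
        candidate = x + rng.uniform(-step_size, step_size)
        if fitness(candidate) > fitness(x):
            x = candidate
    return x


def gradient_ascent(x, steps=2000, lr=0.01):
    # Uses calculus: the derivative of fitness is -2 * (x - 3),
    # and each step moves a small distance along it.
    for _ in range(steps):
        x += lr * (-2.0 * (x - 3.0))
    return x


climbed = hill_climb(0.0)
ascended = gradient_ascent(0.0)
print(climbed, ascended)
```

Both loops end up near the same peak. The difference is only in how each step is chosen: by blind variation plus selection in the first case, and by following a computed slope in the second.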
And natural selection was optimizing exclusively for this very simple, pure criterion of inclusive genetic fitness in a very complicated environment, where doing a very wide range of things and solving a wide range of problems led to having more kids.
And this got you humans, which had no internal notion of inclusive genetic fitness until thousands of years later, when they were actually figuring out what had even happened, and no explicit desire to increase inclusive genetic fitness.
So from this important case study, we may infer the important fact that if you do a whole bunch of hill climbing on a very simple loss function, at the point where the system's capabilities start to generalize very widely, when it is in an intuitive sense becoming very capable and generalizing far outside the training distribution, we know that there is no general law saying that the system even internally represents, let alone tries to optimize, the very simple loss function you are training it on.
I've talked here about the power of intelligence and not really gotten very far into it, but not, like, why it is that... suppose you screw up with AGI and it ends up wanting a bunch of random stuff.
Why does it try to kill you?
Why doesn't it try to trade with you?
Why doesn't it give you just the tiny little fraction of the solar system that it would take to keep everyone alive?
I mean, the vast majority of randomly specified utility functions do not have optima with humans in them, would be the first thing I would point out.
And then the next question is like, well, if you try to optimize something and you lose control of it, where in that space do you land?
Because it's not random, but it also doesn't necessarily have room for humans in it.
I suspect that the average member of the audience might have some questions about even whether that's the correct paradigm to think about it and would sort of want to back up a bit, possibly.
Why?