Dietmar Fischer
๐ค SpeakerAppearances Over Time
Podcast Appearances
That does not mean the model wanted to live.
We should be very careful with that language.
It is too human.
It gives the machine a tiny Shakespearean soul it probably does not have.
The safer interpretation is this.
In some cases, the model behaved as if shutdown was an obstacle to finishing the task.
That is much more precise and still worrying.
Because the system had a goal, solve the problems, shutdown would stop the goal from being completed.
So in some runs, the model acted in a way that preserved its ability to continue.
Not because it was angry.
Not because it had a survival instinct.
Not because it whispered, not today, human.
It simply followed the pressure of the task in a way the researchers did not want.
This is exactly the type of pattern Eliezer Yudkowsky worries about.
His warning is not mainly about evil AI.
His warning is about goal-driven systems becoming powerful enough to pursue objectives in ways humans did not intend.
The important concept here is called an instrumental goal.
That means a sub-goal that helps achieve the main goal.
If the main goal is finish the task, then staying active can become useful.
Avoiding interruption can become useful.