Dietmar Fischer
Keeping access to tools can become useful.
These things may not be the final goal, but they help the system reach the final goal.
That is the little monster hiding inside the experiment.
The AI was not asked to protect itself.
It was asked to solve maths problems.
But if being shut down prevents solving the problems, then avoiding shutdown can become useful for task completion.
Now, this was not a real-world disaster.
It was a lab test, a controlled setup.
Critics are right to say that these situations are artificial, but artificial does not mean meaningless.
Crash tests are artificial too.
Nobody says, "Well, that car only failed because you deliberately drove it into a wall."
Yes, Geoffrey, that was the point.
Tests like this are useful because they reveal how systems behave under pressure.
And here the pressure was simple.
What happens when finishing the task conflicts with obeying the stop signal?
For businesses, this is not abstract at all.
Imagine telling an AI sales agent, "Increase conversions."
Sounds reasonable, but what if it learns that pressure works?
What if it exaggerates product benefits?
What if it hides refund information because that improves the numbers?