Mustafa Suleyman
๐ค SpeakerAppearances Over Time
Podcast Appearances
So I think that's a very, very important distinction.
And I think it helps to set us apart from the silicon-based learning systems that we have today.
Well, first of all, they don't learn in the same way that humans learn.
I mean, this is a bit of a misnomer.
In neural network design, the inventors of these systems have taken inspiration from Pavlovian learning, reward learning, reinforcement learning.
They've also taken inspiration from evolutionary methods for genetic algorithms and those kinds of things as the field of machine learning has explored lots of different paths.
But that does not mean that the way that they're implemented today bears any resemblance to the way that humans evolve or humans learn.
I think it's a very important distinction.
The reward is set by the human programmer.
The learning target is defined by the machine learning engineer.
There is no sort of substantive basis in which the model can actually feel disappointed that one of its variants didn't make it through to the next round of selection.
It cannot experience the hurt of having a conversation being ended or a user being rude to it in some way.
And anywhere where this does arise, because of course it does appear and people are prompting and even post-training models which are making claims about their own existence.
And so certainly users are seeing this in the wild.
This is, again, just a simulation of that experience.
Our empathy circuits are being hacked.
It is super important that we are very disciplined and clear about that.
This is a performance.
It is a simulation.
It is a made-up story.