Ilya Sutskever
Podcast Appearances
And that's very...
But what does it say about the role of our built-in emotions in making us like a viable agent, essentially?
And I guess to connect to your question about pre-training, it's like, maybe if you're good enough at getting everything out of pre-training, you could get that as well.
But that's the kind of thing which seems...
Well, it may or may not be possible to get that from pre-training.
It should be some kind of a value function thing.
Yeah.
But I don't think there is a great ML analogy because right now value functions don't play a very prominent role in the things people do.
I mean, certainly.
I'll be very happy to do that.
Right?
So...
So when people do reinforcement learning, the way reinforcement learning is done right now, how do people train those agents?
So you have a neural net, and you give it a problem.
And then you tell the model, go solve it.
And the model takes maybe thousands, hundreds of thousands of actions, or thoughts, or something, and then it produces a solution. The solution is graded.
And then the score is used to provide a training signal for every single action in your trajectory.
So that means that if you're doing something that goes for a long time, if you're training on a task that takes a long time to solve, you will do no learning at all until you come up with a proposed solution.
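The training setup described above can be sketched roughly as follows. This is a minimal illustration, assuming a single scalar score that arrives only once the trajectory is complete; `Trajectory` and `per_action_signal` are hypothetical names made up for this sketch and do not come from any library or from the speaker.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Trajectory:
    """One attempt at a problem: a sequence of actions plus an end-of-episode score."""
    actions: List[int] = field(default_factory=list)
    finished: bool = False   # True once a proposed solution exists and has been scored
    score: float = 0.0       # assumed: one scalar score for the whole solution


def per_action_signal(traj: Trajectory) -> List[float]:
    """Return one training signal per action in the trajectory.

    Until a proposed solution has been produced and scored, there is no
    signal at all. Afterwards, every action receives the same final score.
    """
    if not traj.finished:
        return []
    return [traj.score] * len(traj.actions)


# A long-running attempt: many actions, but no solution yet.
traj = Trajectory(actions=list(range(1000)))
signals_before = per_action_signal(traj)   # empty: nothing to learn from yet

# The attempt ends and the solution is scored.
traj.finished = True
traj.score = 1.0
signals_after = per_action_signal(traj)    # the same score for every action
```

In this sketch, a thousand actions produce no learning signal until the attempt finishes, which is the point being made about long tasks.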