Richard Sutton
Okay, so those two things.
And then there's also the perception component, which is the construction of your state representation, your sense of where you are now.
And the fourth one is what we're really getting at, most transparently anyway.
The fourth one is the transition model of the world.
That's why I am uncomfortable just calling everything models, because I want to talk about the model of the world.
The transition model of the world: your belief that if you do this, what will happen, what will be the consequences of what you do. So, your physics of the world. But it's not just physics, it's also abstract models, like your model of how you traveled from California up to Edmonton for this podcast. That was a model, and that's a transition model, and that would be learned. And it's not learned from reward; it's learned from: you did things, you saw what happened. Yeah, you made that model of the world.
That will be learned very richly from all the sensation that you receive, not just from the reward.
It has to include the reward as well, but that's a small part of the whole model, small crucial part of the whole model.
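The point that a transition model is learned from observed consequences rather than from reward can be sketched in code. This is a hypothetical minimal illustration, not Sutton's own formulation: a tabular model that counts observed (state, action, next-state) transitions and predicts outcome probabilities. The class name, state labels, and methods here are all illustrative assumptions.

```python
from collections import defaultdict

class TransitionModel:
    """Hypothetical sketch: a tabular transition model learned
    purely from experienced transitions -- no reward signal."""

    def __init__(self):
        # counts[(s, a)][s_next] = times s_next followed doing a in s
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, s, a, s_next):
        """Update the model from one experienced transition:
        you did something, you saw what happened."""
        self.counts[(s, a)][s_next] += 1

    def predict(self, s, a):
        """Return the estimated distribution P(s_next | s, a)."""
        outcomes = self.counts[(s, a)]
        total = sum(outcomes.values())
        if total == 0:
            return {}
        return {s_next: n / total for s_next, n in outcomes.items()}

model = TransitionModel()
model.observe("California", "travel_north", "Edmonton")
model.observe("California", "travel_north", "Edmonton")
print(model.predict("California", "travel_north"))
```

Nothing in the update rule touches a reward: the model is shaped entirely by which sensations followed which actions, which is the contrast being drawn with reward-driven learning of policies and value functions.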
The idea is totally general.
I use, all the time, as my canonical example, the idea that an AI agent is like a person.
And people, in some sense, they have just one world they live in.
And that world may involve chess and it may involve Atari games.
But those are not different tasks or different worlds.
Those are just different states that they encounter.
Right.
And so the general idea is not limited at all.
That's just how they set it up.
It was not their ambition to have one agent across those games.
If we want to talk about transfer, we should talk about transfer, not across games or across tasks, but transfer between states.
We're not seeing transfer anywhere.