Richard Sutton
We're not seeing general... Critical to good performance is that you can generalize well from one state to another.
We don't have any methods that are good at that.
What we have is people trying different things until they settle on a representation that transfers well, or that generalizes well.
But we have very few automated techniques to promote transfer.
And none of them are used in modern deep learning.
The researchers did it, because there's no other explanation.
Gradient descent will not make you generalize well.
It will make you solve the problem.
It will not make you generalize in a good way when you get new data.
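A minimal sketch of that point (my construction, not the speaker's): with more parameters than examples, plain gradient descent can drive training error to zero even when the targets are pure noise, so "solving the problem" on the training set guarantees nothing about new data.

```python
# Assumed toy setup: over-parameterized linear model, pure-noise targets.
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 200                       # far more parameters than examples
X_train = rng.normal(size=(n, d))
y_train = rng.normal(size=n)         # random targets: nothing real to learn

w = np.zeros(d)
lr = 0.01
for _ in range(5000):                # plain gradient descent on squared error
    grad = X_train.T @ (X_train @ w - y_train) / n
    w -= lr * grad

train_mse = np.mean((X_train @ w - y_train) ** 2)

X_test = rng.normal(size=(n, d))
y_test = rng.normal(size=n)
test_mse = np.mean((X_test @ w - y_test) ** 2)

print(f"train MSE: {train_mse:.4f}")  # near 0: the training problem is solved
print(f"test  MSE: {test_mse:.4f}")   # stays large: no generalization followed
```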
Generalization means that training on one thing affects what you do on other things.
So we know deep learning is really bad at this.
For example, we know that if you train on some new thing, it will often catastrophically interfere with all the old things that you knew.
So this is exactly bad generalization.
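A toy illustration of that interference (an assumed setup, not an experiment from the talk): fit a small network on one region of the input, then on a second region with no replay of the first, and the first task's error typically climbs back up.

```python
# Assumed toy setup: sequential training on two tasks, no replay.
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 1))

def fit(xs, ys, steps=2000):
    opt = torch.optim.Adam(net.parameters(), lr=1e-2)
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.mse_loss(net(xs), ys)
        loss.backward()
        opt.step()

# Task A: sin(x) on the left half of the input range
xa = torch.linspace(-3, 0, 50).unsqueeze(1)
ya = torch.sin(xa)
fit(xa, ya)
print(f"task A loss after A: {nn.functional.mse_loss(net(xa), ya).item():.4f}")

# Task B: a different function on the right half, with no replay of task A
xb = torch.linspace(0, 3, 50).unsqueeze(1)
yb = -torch.sin(xb)
fit(xb, yb)
print(f"task A loss after B: {nn.functional.mse_loss(net(xa), ya).item():.4f}")
# Task A's error typically jumps back up: training on B interfered with A.
```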
Right.
Generalization, as I said, is some kind of influence of training on one state on other states.
And generalization is not necessarily good or bad.
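One way to picture that neutrality (my framing, using hypothetical features): when states share features, updating one state's value estimate moves the estimates of other states, which helps or hurts depending on whether those states really belong together.

```python
# Assumed toy setup: states 0 and 1 share a feature; state 2 is distinct.
import numpy as np

phi = np.array([[1.0, 0.0],
                [1.0, 0.0],
                [0.0, 1.0]])
w = np.zeros(2)

def v(s):
    # Linear value estimate from shared features
    return phi[s] @ w

# One TD-style update toward a target of 1.0 for state 0 only
alpha, target = 0.5, 1.0
w += alpha * (target - v(0)) * phi[0]

print([round(float(v(s)), 2) for s in range(3)])   # [0.5, 0.5, 0.0]
# State 1's estimate moved too: good generalization if states 0 and 1
# are genuinely alike, bad interference if they are not.
```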