Geoffrey Hinton
Okay, go ahead.
So this kind of learning, where you back-propagate these forces and then change all the connection strengths so each neuron goes in the direction that the force is pulling it in, that's not reinforcement learning.
This is called supervised learning.
Reinforcement learning is something different.
So here, for example, we tell it what the right answer is.
If you've got a thousand categories, and you show it a bird, you tell it that was a bird.
There you go.
In reinforcement learning, it makes a guess, and you tell it whether it got the answer right.
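[Editor's sketch, not part of the conversation: the difference between the two kinds of feedback can be illustrated with a toy three-category classifier. Everything here is an assumption made for illustration: the function names, the three categories, the softmax output, and the REINFORCE-style reward update standing in for "you tell it whether it got the answer right". It is a minimal sketch, not the method used in any system discussed in the interview.]

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def supervised_feedback(probs, true_label):
    """Supervised learning: the teacher names the right answer.

    Returns the gradient of the cross-entropy loss with respect to the
    scores, which is probs minus the one-hot vector for the true label.
    """
    return [p - (1.0 if i == true_label else 0.0) for i, p in enumerate(probs)]


def reinforcement_feedback(probs, guess, true_label):
    """Reinforcement learning: the teacher only says right or wrong.

    Assumption: a REINFORCE-style signal, -reward * d(log probs[guess])/d(scores).
    The true label is used only to compute the reward, never shown to the learner.
    """
    reward = 1.0 if guess == true_label else 0.0
    grad = [-reward * ((1.0 if i == guess else 0.0) - p) for i, p in enumerate(probs)]
    return reward, grad


categories = ["bird", "cat", "dog"]  # hypothetical categories
scores = [0.0, 1.0, 0.0]             # the model currently favours "cat"
probs = softmax(scores)
true_label = categories.index("bird")
guess = probs.index(max(probs))      # the model's guess

sup_grad = supervised_feedback(probs, true_label)
reward, rl_grad = reinforcement_feedback(probs, guess, true_label)

print("guess:", categories[guess])
print("supervised gradient:", sup_grad)
print("reward:", reward, "reinforcement gradient:", rl_grad)
```

[In this sketch the model guesses wrong. The supervised gradient still says which score to raise (the entry for "bird" is negative, so a gradient-descent step increases that score), while the reward-only signal is zero and carries no information about which category was correct.]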
Okay, so in the mid-80s, we had the back-propagation algorithm working, and it could do some neat things.
It could recognize handwritten digits better than nearly any other technique, but it couldn't deal with real images very well.
It could do quite well at speech recognition, but not substantially better than the other technologies.
And we didn't understand at the time why this wasn't the magic answer to everything.
And it turns out it was the magic answer to everything if you have enough data and enough compute power.
Wow.
So that's what was really missing in the 80s.
Okay.
Well.
There are a lot of elements to thinking. For example, people often think using images.