Geoffrey Hinton
Well, there has to be someone saying what the right answer is.
That's called the supervisor, yes.
And the problem if you do it like that is there's a billion connection strengths.
Each of them has to be changed many times.
It's going to take like forever.
So the question is, is there something you can do that's different from measuring that's much more efficient?
And there is.
You can do something called computing.
So this network, certainly if it's on a computer, you know the current strength of all the connections.
So when you put in an image, there's nothing random about what... I mean, the connection strengths initially had random values, but when you put in an image, it's all deterministic what happens next.
The pixel intensities get multiplied by weights on connections to the first layer of neurons, their activities get multiplied by weights on connections to the second layer, and so on, and you get some activation levels of the output neurons.
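The forward pass just described can be sketched in a few lines. This is a minimal illustration, not the network from the talk: the layer sizes, the `forward` and `random_layer` helpers, and the logistic squashing function are all assumptions made for the example. The point it shows is that once the weights are fixed, the same image always produces the same output activations.

```python
import math
import random

def forward(pixels, layers):
    """Propagate activities layer by layer and return the output activities."""
    activities = pixels
    for weights in layers:
        # Each neuron sums its inputs times the weights on its incoming connections.
        totals = [sum(w * a for w, a in zip(row, activities)) for row in weights]
        # Logistic squashing is an assumption; the talk does not name a nonlinearity.
        activities = [1.0 / (1.0 + math.exp(-t)) for t in totals]
    return activities

def random_layer(n_out, n_in, rng):
    """Hypothetical helper: a layer of connection strengths with random initial values."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]

rng = random.Random(0)
# 4 pixels -> 3 hidden neurons -> 2 output neurons (sizes chosen for illustration)
layers = [random_layer(3, 4, rng), random_layer(2, 3, rng)]
pixels = [0.0, 0.5, 1.0, 0.25]

first = forward(pixels, layers)
second = forward(pixels, layers)
print(first == second)  # the weights started random, but the computation is deterministic
```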
So you could now ask the following question.
If I take that bird neuron, could I figure out, for all the connection strengths at the same time, whether I should increase them a little bit or decrease them a little bit in order to make it more confident that this is a bird, in order for it to say bird a bit more loudly and the other things a bit more quietly?
And you can do that with calculus.
You can send information backwards through the network saying, how do I make this more likely to say bird next time?
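To make the contrast between measuring and computing concrete, here is a small sketch under stated assumptions: a toy network with two inputs, two hidden neurons, and a single "bird" output, all using logistic units. The function names (`forward`, `backward`, `measured`) and the numbers are hypothetical. `backward` uses the chain rule to get the effect of every weight on the bird activity in one pass, while `measured` nudges one weight at a time and re-runs the network, which is the approach that would take forever with a billion connections.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def forward(x, W1, w2):
    """Return the hidden activities and the activity of the bird output neuron."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    y = sigmoid(sum(w * hi for w, hi in zip(w2, h)))
    return h, y

def backward(x, W1, w2):
    """One backward pass: d(bird activity)/d(weight) for every weight at once."""
    h, y = forward(x, W1, w2)
    dy_dtotal = y * (1.0 - y)                 # slope of the output sigmoid
    grad_w2 = [dy_dtotal * hi for hi in h]    # weights into the output neuron
    # Send the signal back through each hidden neuron (chain rule).
    dh = [dy_dtotal * w * hi * (1.0 - hi) for w, hi in zip(w2, h)]
    grad_W1 = [[d * xi for xi in x] for d in dh]
    return grad_W1, grad_w2

def measured(x, W1, w2, eps=1e-6):
    """The slow alternative: perturb one first-layer weight at a time and re-run."""
    grad_W1 = [[0.0] * len(x) for _ in W1]
    for i in range(len(W1)):
        for j in range(len(x)):
            old = W1[i][j]
            W1[i][j] = old + eps
            up = forward(x, W1, w2)[1]
            W1[i][j] = old - eps
            down = forward(x, W1, w2)[1]
            W1[i][j] = old                    # restore the weight
            grad_W1[i][j] = (up - down) / (2 * eps)
    return grad_W1

x = [0.5, -1.0]
W1 = [[0.1, -0.2], [0.4, 0.3]]
w2 = [0.7, -0.5]

computed_W1, computed_w2 = backward(x, W1, w2)
measured_W1 = measured(x, W1, w2)
max_gap = max(abs(c - m)
              for crow, mrow in zip(computed_W1, measured_W1)
              for c, m in zip(crow, mrow))
print(max_gap < 1e-6)  # both approaches agree; only the cost differs
```

The measured version needs two forward passes per weight, so its cost grows with the number of connections times the cost of a forward pass. The computed version costs roughly one forward and one backward pass, regardless of how many weights there are.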
And because you have a lot of physicists in the audience, I'm going to try and give you a physical intuition for this.
You put in bird, an image of a bird, and with the initial weights, the bird output neuron only gets very slightly active.
And so what you do now is you attach a piece of elastic of zero rest length, connecting the activity level of the bird output neuron to the value you want, which is, say, 1.
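One way to read the elastic picture, assuming an ideal spring that obeys Hooke's law: with zero rest length, the stretch is simply the gap between the bird neuron's activity and the value you want. The symbols below ($y$ for the activity, $t$ for the target, $k$ for the stiffness) are introduced here for illustration and are not from the talk.

```latex
% Energy stored in an ideal elastic of zero rest length stretched from y to t
E = \tfrac{1}{2}\,k\,(y - t)^2
% Force the elastic exerts on the activity level y
F = -\frac{\partial E}{\partial y} = k\,(t - y)
% With the target t = 1 and a weakly active bird neuron (y close to 0),
% the force is positive: it pulls the activity up toward 1.
```

Under this reading, the stored energy is a squared-error measure of how wrong the output is, and the force is the signal that says which way the activity should move.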