Zach Furman
A natural starting point is to ask what individual neurons are doing.
Suppose we take a neuron somewhere in the network.
We can find images that make it activate strongly by either searching through a dataset or optimizing an input to maximize activation.
If we collect images that strongly activate a given neuron, do they have anything in common?
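The dataset-search version of this procedure is simple to sketch. The following is a minimal, hypothetical illustration using numpy: the "neuron" is stood in for by a hand-written 3x3 vertical-edge filter (a real network's filter would be learned, and activation would come from a forward pass), and the "dataset" is random patches. The point is just the ranking step: score every image by how strongly it activates the unit, then keep the top few.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "dataset": 100 random grayscale 8x8 patches (stand-in for real images).
images = rng.normal(size=(100, 8, 8))

# Stand-in for one neuron: a hand-written vertical-edge filter.
# In a real network this would be a learned filter from an early layer.
filt = np.array([[-1.0, 0.0, 1.0],
                 [-2.0, 0.0, 2.0],
                 [-1.0, 0.0, 1.0]])

def activation(img, filt):
    """Max response of the filter over all valid positions in the image."""
    h, w = filt.shape
    H, W = img.shape
    best = -np.inf
    for i in range(H - h + 1):
        for j in range(W - w + 1):
            best = max(best, float((img[i:i + h, j:j + w] * filt).sum()))
    return best

# Score every image, then take the indices of the 5 strongest activators.
acts = np.array([activation(img, filt) for img in images])
top5 = np.argsort(acts)[-5:][::-1]
```

The optimization variant replaces the dataset loop with gradient ascent on the input pixels, but the final step is the same: inspect what the highest-scoring inputs have in common.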
In early layers, they do, and the patterns we find are simple.
Neurons in the first few layers respond to edges at particular orientations, small patches of texture, transitions between colors.
Different neurons respond to different orientations or textures, but many are selective for something visually recognizable.
In later layers, the patterns we find become more complex.
Neurons respond to curves, corners, or repeating patterns.
Deeper still, neurons respond to things like eyes, wheels, or windows: object parts rather than geometric primitives.
This already suggests a hierarchy: simple features early, complex features later.
But the more striking finding is about how the complex features are built.
Olah et al. do not just visualize what neurons respond to.
They trace the connections between layers, examining the weights that connect one layer's neurons to the next, identifying which earlier features contribute to which later ones.
What they find is that later features are composed from earlier ones in interpretable ways.
There is, for instance, a neuron in InceptionV1 that they identify as responding to dog heads.
If we trace its inputs by looking at which neurons from the previous layer connect to it with strong weights, we find it receives input from neurons that detect eyes, snout, fur, and tongue.
The dog head detector is built from the outputs of simpler detectors.
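The weight-tracing step can also be sketched mechanically. In a convolutional layer, the weights form a tensor of shape (output channels, input channels, kernel height, kernel width), so the overall strength of one earlier feature's connection to a later neuron can be summarized as the norm of the corresponding kernel slice. Below is a minimal sketch with a random weight tensor; the channel count, the unit index, and the "top 4 inputs" cutoff are all hypothetical, chosen only to show the mechanics of ranking input features by connection strength.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical conv weight tensor connecting 16 earlier features to 8
# later features: shape (out_channels, in_channels, kH, kW).
W = rng.normal(size=(8, 16, 5, 5))

# Hypothetical index of the later-layer unit we want to trace (e.g. a
# "dog head" neuron in a real network).
unit = 3

# Summarize each input feature's connection strength to this unit as the
# L2 norm of its 5x5 kernel slice.
contrib = np.linalg.norm(W[unit].reshape(16, -1), axis=1)

# Indices of the 4 most strongly connected earlier features, strongest first.
top_inputs = np.argsort(contrib)[-4:][::-1]
```

In a real analysis the signs of the weights matter too (a feature can inhibit rather than excite the later unit), and one would look at the spatial structure of each kernel slice, not just its norm; this sketch only captures the ranking step.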