Zach Furman
It is not detecting dog heads from scratch.
It is checking whether the right combination of simpler features is present in the right spatial arrangement.
We find the same pattern throughout the network.
A neuron that detects car windows is connected to neurons that detect rectangular shapes with reflective textures.
A neuron that detects car bodies is connected to neurons that detect smooth, curved surfaces.
And a neuron that detects cars as a whole is connected to neurons that detect wheels, windows, and car bodies arranged in the spatial configuration we would expect for a car.
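This "parts in the right spatial arrangement" idea can be sketched in a few lines. The following is a toy illustration, not the actual InceptionV1 weights: the activation maps and weight templates are made-up placeholders, and a real car neuron would be one channel of a learned convolution.

```python
import numpy as np

# Hypothetical 5x5 activation maps from earlier-layer part detectors
# (toy placeholders; real values would come from the network).
wheels  = np.zeros((5, 5)); wheels[4, 1] = wheels[4, 3] = 1.0  # wheels at the bottom
windows = np.zeros((5, 5)); windows[1, 2] = 1.0                # window near the top
body    = np.zeros((5, 5)); body[2:4, 1:4] = 1.0               # body in the middle

# A toy "car" neuron: a weighted sum over part detectors, where the
# weight templates encode the expected spatial arrangement of the parts.
w_wheels  = np.zeros((5, 5)); w_wheels[4, :]    = 1.0  # expects wheels low
w_windows = np.zeros((5, 5)); w_windows[0:2, :] = 1.0  # expects windows high
w_body    = np.zeros((5, 5)); w_body[2:4, :]    = 1.0  # expects body in between

activation = ((w_wheels * wheels).sum()
              + (w_windows * windows).sum()
              + (w_body * body).sum())
print(activation)  # large only when each part appears where its template expects it
```

Shift the wheels to the top of the image and the same computation produces a much smaller activation: the neuron is not matching "car pixels," it is matching an arrangement of parts.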
Olah et al.
call these pathways circuits, and the term is meaningful.
The structure is genuinely circuit-like.
There are inputs, intermediate computations, and outputs connected by weighted edges that determine how features combine.
In their words, you can literally read meaningful algorithms off of the weights.
And the components are reused.
The same edge detectors that contribute to wheel detection also contribute to face detection, to building detection, to many other things.
The network has not built a separate feature set for each of the thousand categories it recognizes.
It has built a shared vocabulary of parts: edges, textures, curves, object components, and so on, combined differently for different recognition tasks.
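The reuse pattern is easy to caricature in code. In this hypothetical sketch, one bank of shared low-level features feeds several category scores; each category differs only in how it weights the same vocabulary. The feature values and weights are random stand-ins, not anything from the real network.

```python
import numpy as np

rng = np.random.default_rng(0)

# One shared bank of low-level feature activations (edges, curves, textures).
shared_features = rng.random(16)

# Per-category weight vectors that recombine the SAME shared features;
# no category gets its own private copy of the edge detectors.
category_weights = {
    "dog":  rng.random(16),
    "car":  rng.random(16),
    "face": rng.random(16),
}

# Each category score is just a different weighting of one shared vocabulary.
scores = {name: float(shared_features @ w) for name, w in category_weights.items()}
print(scores)
```

The economy here is the point: adding a new category means adding one new weight vector, not a whole new stack of detectors.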
We might find this structure reminiscent of something.
A Boolean circuit is a composition of simple gates, each taking a few bits as input, outputting one bit, wired together to compute something complex.
A program is a composition of simple operations, each doing something small, arranged to accomplish something larger.
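Both analogies can be made concrete with a standard textbook example (my choice of illustration, not one from the paper): a full adder built from a handful of reusable gates. The same XOR and AND gates appear in both stages, much as the same edge detectors feed many higher-level features.

```python
# Simple one-bit gates, the "vocabulary" of the circuit.
def AND(a, b): return a & b
def OR(a, b):  return a | b
def XOR(a, b): return a ^ b

def full_adder(a, b, carry_in):
    # Compose small gates into a larger computation.
    # XOR and AND are each reused across both stages of the adder.
    partial = XOR(a, b)
    total = XOR(partial, carry_in)
    carry_out = OR(AND(a, b), AND(partial, carry_in))
    return total, carry_out

print(full_adder(1, 1, 0))  # → (0, 1): one plus one is binary 10
```

Inputs, intermediate values, outputs, and wiring that determines how they combine: structurally, this is the same shape as the feature circuits described above.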
What Olah et al. found in InceptionV1 has the same shape.
Small computations, composed hierarchically, with components shared and reused across different pathways.