Zach Furman
So what is this structure doing?
Following it through the network reveals something unexpected.
The network has learned an algorithm for modular addition based on trigonometry.
[Image shown on screen.]
The algorithm exploits how angles add.
If you represent a number as a position on a circle, then adding two numbers corresponds to adding their angles.
The network's embedding layer implements this representation.
Its middle layers then combine the sine and cosine values of the two inputs using trigonometric identities.
These operations are implemented in the weights of the attention and MLP layers.
One can read off coefficients that correspond to the terms in these identities.
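As a concrete sketch of that combination step (the modulus and frequency below are illustrative, not read off a trained network), the angle-addition identities let the cosine and sine of a + b be computed purely from the cosines and sines of a and b:

```python
import math

p = 113                   # modulus (illustrative choice)
w = 2 * math.pi * 5 / p   # one "key frequency"; k = 5 is arbitrary

a, b = 40, 90

# Embedding: represent each input by its position on a circle.
ca, sa = math.cos(w * a), math.sin(w * a)
cb, sb = math.cos(w * b), math.sin(w * b)

# Angle-addition identities:
#   cos(w(a+b)) = cos(wa)cos(wb) - sin(wa)sin(wb)
#   sin(w(a+b)) = sin(wa)cos(wb) + cos(wa)sin(wb)
cos_sum = ca * cb - sa * sb
sin_sum = sa * cb + ca * sb

# The combined values match cos/sin of a + b computed directly.
assert abs(cos_sum - math.cos(w * (a + b))) < 1e-9
assert abs(sin_sum - math.sin(w * (a + b))) < 1e-9
```

In the network itself these products are spread across the attention and MLP weights rather than written out this explicitly.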
Finally, the network needs to convert back to a discrete answer.
It does this by checking, for each possible output c, how well c matches the sum it computed.
Specifically, the logit for output c depends on the sum over the key frequencies w_k of cos(w_k(a + b − c)).
This quantity is maximized when c equals a + b mod p, the correct answer.
At that point the cosines at all the different frequencies equal 1 and add constructively.