Jacob Hilton
Note that our analysis here was pretty brute force.
We essentially checked each linear region of delta one by one, with a little work up front to reduce the total number of checks required.
Even though we consider this to constitute a full understanding in this case, we would not draw the same conclusion for much deeper models.
This is because the number of regions would grow exponentially with depth, making the number of bits of surprise far larger than the number of bits taken up by the weights of the model, which is an upper bound on the number of bits of optimization used to select the model.
The same exponential blow-up would also prevent us from matching the efficiency of sampling at any reasonable computational budget.
Finally, it is interesting to note that our analysis allows us to construct a model by hand that gets exactly 100% accuracy by taking complex formula omitted from the narration.
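As a concrete illustration of the brute-force strategy described above, here is a toy sketch of ours (not the model or code from the original analysis) that enumerates the linear regions of a small ReLU network by iterating over activation patterns and recovers the affine map on each one:

```python
# Toy sketch: enumerate the linear regions of a small one-hidden-layer ReLU
# network by iterating over its activation patterns.  All sizes and weights
# here are illustrative, not the model analyzed in the text.
import itertools
import numpy as np

rng = np.random.default_rng(0)

d_in, d_hidden = 2, 4
W1 = rng.normal(size=(d_hidden, d_in))
b1 = rng.normal(size=d_hidden)
W2 = rng.normal(size=(1, d_hidden))
b2 = rng.normal(size=1)

regions = []
for pattern in itertools.product([0, 1], repeat=d_hidden):
    # On the region where exactly these hidden units are active, the network
    # computes the affine map x -> A @ x + c.
    mask = np.array(pattern, dtype=float)
    A = W2 @ (mask[:, None] * W1)
    c = W2 @ (mask * b1) + b2

    # Crude nonemptiness check by sampling; an exact analysis would use a
    # linear program or direct reasoning about the weights instead.
    samples = rng.normal(size=(10_000, d_in))
    pre_act = samples @ W1.T + b1
    if np.any(np.all((pre_act > 0) == (mask > 0), axis=1)):
        regions.append((pattern, A, c))

print(f"{len(regions)} nonempty regions out of {2 ** d_hidden} activation patterns")
```

Each hidden ReLU doubles the number of candidate activation patterns, so a fully brute-force check like this scales exponentially in the size of the network, which is the blow-up referred to above.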
Hidden size 4, sequence length 3
The model, complex formula omitted from the narration, can be loaded in ALGZU using the following.
Code block omitted from the narration.
It has 32 parameters and an accuracy of 98.5%.
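To put the earlier point about bits in concrete terms for this model: assuming the 32 parameters are stored as standard 32-bit floats (an assumption on our part about the precision), the weights take up

32 parameters × 32 bits per parameter = 1,024 bits,

so an explanation whose surprise was much larger than roughly a thousand bits would run into the objection described above.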
Our analysis of complex formula omitted from the narration is broadly similar to our analysis of complex formula omitted from the narration, but the model is already deep enough that we would not consider a fully brute-force explanation to be adequate.
To deal with this, we exploit various approximate symmetries in the model to reduce the total number of computational operations as well as the surprise of the explanation.
Our full analysis can be found in the following sets of notes:
Symmetric RNNs by George Robinson.
Heuristic explanations for second argmax models by Jacob Hilton.
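To give a flavor of the kind of saving an exploited symmetry can buy (a toy sketch of ours, not the construction used in the notes above): if some property of the model only depends on an input sequence up to reordering, then it suffices to check one representative per orbit rather than every sequence.

```python
# Toy sketch: using an input-permutation symmetry to reduce the number of
# checks.  The alphabet size is made up; the sequence length matches the text.
import itertools

VOCAB_SIZE = 5
SEQ_LEN = 3

all_inputs = list(itertools.product(range(VOCAB_SIZE), repeat=SEQ_LEN))

# If the quantity being checked is invariant under reordering the sequence,
# sorting gives a canonical representative for each orbit.
representatives = {tuple(sorted(x)) for x in all_inputs}

print(f"naive checks:          {len(all_inputs)}")      # 5**3 = 125
print(f"checks after symmetry: {len(representatives)}") # multisets: 35
```

For an approximate symmetry, one would additionally need to bound the error from treating symmetry-related inputs as interchangeable.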
In the second set of notes, we provide two different mechanistic estimates for the model's accuracy that use different amounts of compute, depending on which approximate symmetries are exploited.
We analyze both estimates according to our two metrics.
We find that we are able to roughly match the computational efficiency of sampling, and we believe we have more or less achieved a full understanding, although this second claim is less clear-cut.
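As a rough point of reference for what matching the efficiency of sampling entails (a back-of-the-envelope calculation of ours, not from the notes): estimating an accuracy of around p = 0.985 by Monte Carlo sampling to a standard error of ε takes about n ≈ p(1 − p)/ε² forward passes, e.g. roughly 0.985 × 0.015 / (0.001)² ≈ 15,000 forward passes for ε = 0.001. A mechanistic estimate matches the efficiency of sampling if it reaches comparable error with a comparable amount of model-evaluation compute.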