Jacob Hilton
…steps. Neuron 6 is approximately [complex formula omitted from the narration], and neuron 7 is approximately [complex formula omitted from the narration].
We can keep going and add neuron 1 to the subcircuit: [complex formula omitted from the narration].
Hence, after unrolling the RNN [complex formula omitted from the narration] steps, neuron 1 is approximately [complex formula omitted from the narration], forming another leave-one-out maximum feature minus the most recent input.
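The recurrences themselves are omitted from the narration, but the kind of feature being described, a maximum built up one input at a time, can be sketched directly. The sketch below relies only on the identity max(a, b) = a + relu(b - a); the weights are exact and purely illustrative, whereas the trained model's neurons are described as realizing such a running maximum only approximately.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def running_max_via_relu(xs):
    """Illustrative ReLU recurrence for a running maximum.

    Uses the identity a + relu(b - a) == max(a, b), so a single unit updated
    step by step tracks the maximum of the inputs seen so far -- the kind of
    feature the unrolled neurons are described as approximating. The exact
    weights and neuron indices of the trained RNN are not reproduced here.
    """
    xs = list(xs)
    h = xs[0]
    trace = [h]
    for x in xs[1:]:
        h = h + relu(x - h)  # equals max(h, x)
        trace.append(h)
    return trace

xs = np.random.rand(10)
assert np.isclose(running_max_via_relu(xs)[-1], xs.max())
```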
In fact, by generalizing this idea, we can construct a model by hand that uses 22 hidden neurons to form all 10 leave-one-out maximum features and leverage these to achieve an accuracy of 99%.
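The leave-one-out maximum features themselves are easy to state, even though the 22-neuron construction is not spelled out in the narration. Here is a minimal sketch, assuming length-10 input sequences as in the text; how the handcrafted neurons realize these features, and how the features are combined into the final output, is not shown.

```python
import numpy as np

def leave_one_out_maxima(xs):
    """Feature i is the maximum of the sequence with position i excluded.

    For a length-10 sequence this yields the 10 leave-one-out maximum
    features the handcrafted model is described as forming; the mapping
    from these features to the model's output is not reproduced here.
    """
    xs = np.asarray(xs, dtype=float)
    return np.array([np.max(np.delete(xs, i)) for i in range(len(xs))])

xs = np.random.rand(10)
print(leave_one_out_maxima(xs))
```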
Unfortunately, however, it is challenging to go much further than this.
We have exploited the approximate weight sparsity of 5 of the hidden neurons, but most of the remaining 11 hidden neurons are more densely connected.
We have produced a handcrafted model with high accuracy, but we have not produced a correspondence between most of the hidden neurons of the trained model and the hidden neurons of the handcrafted model.
We have used approximations in our analysis, but have not dealt with the approximation error, which gets increasingly significant as we consider more complex neurons.
Fundamentally, even though we have some understanding of the model, our explanation is incomplete because we have not turned this understanding into an adequate mechanistic estimate of the model's accuracy.
Ultimately, to produce a mechanistic estimate for the accuracy of this model that is competitive with sampling, or that constitutes a full understanding, we expect we would have to somehow combine this kind of feature analysis with elements of the brute-force-after-exploiting-symmetries approach used for the models discussed earlier.
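For concreteness, the sampling baseline that a mechanistic estimate would need to be competitive with is just a Monte Carlo estimate of accuracy. The sketch below is hypothetical: `model`, `label_fn`, and `sample_input` stand in for the trained RNN, the ground-truth labelling, and the input distribution, none of which are specified here.

```python
import numpy as np

def sampling_accuracy_estimate(model, label_fn, sample_input,
                               n_samples=100_000, seed=0):
    """Monte Carlo accuracy estimate (the baseline for a mechanistic estimate).

    `model`, `label_fn`, and `sample_input` are hypothetical stand-ins for the
    trained RNN, the ground-truth labels, and the input distribution.
    """
    rng = np.random.default_rng(seed)
    correct = 0
    for _ in range(n_samples):
        x = sample_input(rng)
        correct += int(model(x) == label_fn(x))
    return correct / n_samples

# Illustration only, with placeholder functions:
# acc = sampling_accuracy_estimate(lambda x: x[-1] >= x.max(),
#                                  lambda x: x[-1] >= x.max(),
#                                  lambda rng: rng.random(10))
```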