Jacob Hilton

Speaker
204 total appearances

Podcast Appearances

LessWrong (Curated & Popular)
"AlgZoo: uninterpreted models with fewer than 1,500 parameters" by Jacob_Hilton

Note that our analysis here was pretty brute force.

We essentially checked each linear region of delta one by one, with a little work up front to reduce the total number of checks required.

Even though we consider this to constitute a full understanding in this case, we would not draw the same conclusion for much deeper models.

This is because the number of regions would grow exponentially with depth, making the number of bits of surprise far larger than the number of bits taken up by the weights of the model, which is an upper bound on the number of bits of optimization used to select the model.

The same exponential blow-up would also prevent us from matching the efficiency of sampling at reasonable computational budgets.
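
To make the blow-up concrete, here is a rough sketch that counts an upper bound on the number of ReLU activation patterns as depth grows. It assumes, and this is not stated in the narration, that each ReLU unit contributes at most one on/off bit, so a network with N ReLU units has at most 2^N patterns. The function name and the width of 4 are illustrative choices, and the true number of linear regions may be much smaller than this bound.

```python
def max_activation_patterns(units_per_layer: int, depth: int) -> int:
    """Upper bound on ReLU on/off patterns: one bit per unit (assumed model)."""
    total_units = units_per_layer * depth
    return 2 ** total_units


# The bound is squared each time the depth doubles at a fixed width.
for depth in (1, 2, 4, 8):
    print(depth, max_activation_patterns(units_per_layer=4, depth=depth))
```

A check that visits regions one by one therefore scales with this bound in the worst case, which is the sense in which brute force stops being practical for deeper models.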

Finally, it is interesting to note that our analysis allows us to construct a model by hand that gets exactly 100% accuracy by taking [complex formula omitted from the narration].

Subheading: Hidden size 4, sequence length 3.

The model [complex formula omitted from the narration] can be loaded in AlgZoo using:

There's a code block here in the text.

It has 32 parameters and an accuracy of 98.5%.
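
As a sanity check on the figure of 32, the sketch below counts parameters for one architecture that happens to reproduce it at hidden size 4 and sequence length 3. The architecture is an assumption, not something the narration states: a bias-free RNN that reads one scalar per step and emits one logit per sequence position. The function name is hypothetical.

```python
def rnn_param_count(hidden_size: int, seq_len: int) -> int:
    """Parameter count for an assumed bias-free RNN with scalar inputs."""
    input_to_hidden = hidden_size * 1             # one scalar input per step (assumed)
    hidden_to_hidden = hidden_size * hidden_size  # recurrent weight matrix
    hidden_to_output = hidden_size * seq_len      # one logit per position (assumed)
    return input_to_hidden + hidden_to_hidden + hidden_to_output


print(rnn_param_count(hidden_size=4, seq_len=3))  # 4 + 16 + 12 = 32
```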

Our analysis of [complex formula omitted from the narration] is broadly similar to our analysis of [complex formula omitted from the narration], but the model is already deep enough that we wouldn't consider a fully brute-force explanation to be adequate.

To deal with this, we exploit various approximate symmetries in the model to reduce the total number of computational operations as well as the surprise of the explanation.

Our full analysis can be found in these sets of notes:

Symmetric RNNs, by George Robinson.

Heuristic explanations for second argmax models, by Jacob Hilton.

In the second set of notes, we provide two different mechanistic estimates for the model's accuracy that use different amounts of compute, depending on which approximate symmetries are exploited.

We analyze both estimates according to our two metrics.

We find that we are able to roughly match the computational efficiency of sampling, and we think we more or less have a full understanding, although this is less clear.
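
For reference, the sampling baseline that a mechanistic estimate is compared against can be sketched as a plain Monte Carlo estimate of accuracy. Everything specific here is an assumption made for illustration: the task definition (index of the second-largest input), the standard Gaussian input distribution, and the stand-in `toy_model`, which is not any AlgZoo model.

```python
import math
import random


def second_argmax(xs):
    """Index of the second-largest entry (assumed task definition)."""
    order = sorted(range(len(xs)), key=lambda i: xs[i], reverse=True)
    return order[1]


def estimate_accuracy(model, seq_len, num_samples, seed=0):
    """Estimate accuracy by sampling inputs; returns (accuracy, standard error)."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(num_samples):
        xs = [rng.gauss(0.0, 1.0) for _ in range(seq_len)]  # assumed input distribution
        correct += int(model(xs) == second_argmax(xs))
    acc = correct / num_samples
    stderr = math.sqrt(acc * (1.0 - acc) / num_samples)
    return acc, stderr


def toy_model(xs):
    """Hypothetical stand-in model: always guesses the middle position."""
    return len(xs) // 2


acc, stderr = estimate_accuracy(toy_model, seq_len=3, num_samples=10_000)
print(f"accuracy ~ {acc:.3f} +/- {stderr:.3f}")
```

The standard error shrinks as one over the square root of the number of samples, so the cost of sampling to a given precision is the budget a mechanistic estimate would need to match.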