Jacob Hilton

"AlgZoo: uninterpreted models with fewer than 1,500 parameters" by Jacob_Hilton

Note that our analysis here was pretty brute force.

We essentially checked each linear region of delta one by one, with a little work up front to reduce the total number of checks required.

LessWrong (Curated & Popular)

Even though we consider this to constitute a full understanding in this case, we would not draw the same conclusion for much deeper models.

This is because the number of regions would grow exponentially with depth, making the number of bits of surprise far larger than the number of bits taken up by the weights of the model, which is an upper bound on the number of bits of optimization used to select the model.

The same exponential blow-up would also prevent us from matching the efficiency of sampling at reasonable computational budgets.
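To make the blow-up concrete, here is a rough back-of-the-envelope sketch. It is not from the post: it assumes a plain ReLU network with `width` units per layer and unshared `width x width` weight matrices, uses the number of on/off ReLU patterns as a crude upper bound on the number of linear regions, and assumes 32 bits per weight. The function names are hypothetical.

```python
def max_activation_patterns(width: int, depth: int) -> int:
    """Crude upper bound on linear regions: every ReLU is either on or off."""
    return 2 ** (width * depth)


def weight_bits(width: int, depth: int, bits_per_weight: int = 32) -> int:
    """Bits needed to store the weights (assumed: one width x width matrix per layer)."""
    return depth * width * width * bits_per_weight


# Compare how the two quantities grow as the assumed network gets deeper.
for depth in (3, 6, 12, 24):
    patterns = max_activation_patterns(width=4, depth=depth)
    bits = weight_bits(width=4, depth=depth)
    print(f"depth={depth:2d}  patterns <= 2^{4 * depth} = {patterns}  weight bits = {bits}")
```

Under these assumptions the pattern bound doubles with every added ReLU, while the weight bits grow only linearly with depth, which is the mismatch the passage above describes.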

Finally, it is interesting to note that our analysis allows us to construct a model by hand that gets exactly 100% accuracy by taking [complex formula omitted from the narration].

Subheading: Hidden size 4, sequence length 3.

The model, [complex formula omitted from the narration], can be loaded in AlgZoo using the code block that appears here in the text.

It has 32 parameters and an accuracy of 98.5%.
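The parameter count can be sanity-checked with a little arithmetic. The sketch below assumes a bias-free single-layer RNN that reads one scalar per step and emits one logit per input position; that layout is an assumption, not something stated in the narration, and the function name is hypothetical.

```python
def rnn_param_count(hidden_size: int, seq_len: int) -> int:
    """Count weights in the assumed bias-free RNN.

    Assumed layout:
      input-to-hidden:  hidden_size x 1
      hidden-to-hidden: hidden_size x hidden_size
      hidden-to-output: seq_len x hidden_size (one logit per input position)
    """
    input_to_hidden = hidden_size * 1
    hidden_to_hidden = hidden_size * hidden_size
    hidden_to_output = seq_len * hidden_size
    return input_to_hidden + hidden_to_hidden + hidden_to_output


# Hidden size 4, sequence length 3: 4 + 16 + 12
print(rnn_param_count(hidden_size=4, seq_len=3))
```

With these assumptions the total comes to 4 + 16 + 12 = 32, which agrees with the figure quoted above.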

Our analysis of [complex formula omitted from the narration] is broadly similar to our analysis of [complex formula omitted from the narration], but the model is already deep enough that we wouldn't consider a fully brute-force explanation to be adequate.

To deal with this, we exploit various approximate symmetries in the model to reduce the total number of computational operations as well as the surprise of the explanation.

Our full analysis can be found in these sets of notes.

Symmetric RNNs by George Robinson.

Heuristic explanations for second argmax models by Jacob Hilton.

In the second set of notes, we provide two different mechanistic estimates for the model's accuracy that use different amounts of compute, depending on which approximate symmetries are exploited.

We analyze both estimates according to our two metrics.

We find that we are able to roughly match the computational efficiency of sampling, and we think we more or less have a full understanding, although this is less clear.
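For reference, the sampling baseline that a mechanistic estimate is compared against can be sketched as a plain Monte Carlo accuracy estimate. Everything here is illustrative: the standard Gaussian inputs, the bias-free ReLU RNN, the random weights, and all function names are assumptions rather than AlgZoo's actual API or the trained model.

```python
import math
import random


def second_argmax(xs):
    """Index of the second-largest entry of xs."""
    order = sorted(range(len(xs)), key=lambda i: xs[i], reverse=True)
    return order[1]


def rnn_predict(xs, w_in, w_hh, w_out):
    """Run an assumed bias-free ReLU RNN and return the argmax of its logits."""
    h = [0.0] * len(w_in)
    for x in xs:
        h = [
            max(0.0, w_in[i] * x + sum(w_hh[i][j] * h[j] for j in range(len(h))))
            for i in range(len(h))
        ]
    logits = [sum(row[i] * h[i] for i in range(len(h))) for row in w_out]
    return max(range(len(logits)), key=lambda k: logits[k])


def sampled_accuracy(predict, seq_len, num_samples, seed=0):
    """Estimate accuracy by drawing inputs (assumed i.i.d. standard Gaussian)."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(num_samples):
        xs = [rng.gauss(0.0, 1.0) for _ in range(seq_len)]
        correct += predict(xs) == second_argmax(xs)
    return correct / num_samples


# Hypothetical random weights standing in for a trained model.
rng = random.Random(1)
hidden, seq_len = 4, 3
w_in = [rng.gauss(0, 1) for _ in range(hidden)]
w_hh = [[rng.gauss(0, 1) for _ in range(hidden)] for _ in range(hidden)]
w_out = [[rng.gauss(0, 1) for _ in range(hidden)] for _ in range(seq_len)]

n = 2000
acc = sampled_accuracy(lambda xs: rnn_predict(xs, w_in, w_hh, w_out), seq_len, n)
stderr = math.sqrt(acc * (1 - acc) / n)  # binomial standard error of the estimate
print(f"estimated accuracy = {acc:.3f} +/- {stderr:.3f}")
```

The standard error shrinks like one over the square root of the number of samples, so the cost of reaching a given precision by sampling is the budget a mechanistic estimate has to match.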
