Jacob Hilton

In fact, by generalizing this idea, we can construct a model by hand that uses 22 hidden neurons to form all 10 leave-one-out maximum features and leverage these to achieve an accuracy of 99%.

1094.258
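To make the term concrete, here is a minimal sketch of what "leave-one-out maximum" features could look like, assuming the phrase means the maximum over all inputs except one. The function name `leave_one_out_max` is hypothetical. The sketch only illustrates the feature definition, not how the handcrafted model's neurons compute it.

```python
def leave_one_out_max(xs):
    """For each position i, return the maximum of xs with xs[i] left out.

    Assumption: this is what "leave-one-out maximum" refers to; the
    transcript does not spell out the definition.
    """
    n = len(xs)
    neg_inf = float("-inf")
    # prefix[i] = max(xs[:i]) and suffix[i] = max(xs[i+1:]),
    # with -inf standing in for the maximum of an empty slice.
    prefix = [neg_inf] * n
    suffix = [neg_inf] * n
    for i in range(1, n):
        prefix[i] = max(prefix[i - 1], xs[i - 1])
    for i in range(n - 2, -1, -1):
        suffix[i] = max(suffix[i + 1], xs[i + 1])
    return [max(p, s) for p, s in zip(prefix, suffix)]


features = leave_one_out_max([3, 1, 4, 1, 5])
```

For an input of length 10 this yields the 10 features mentioned above. The features take at most two distinct values: the overall maximum at every position except the position of the maximum, and the runner-up at that position.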

LessWrong (Curated & Popular)

"AlgZoo: uninterpreted models with fewer than 1,500 parameters" by Jacob_Hilton

Unfortunately, however, it is challenging to go much further than this.

1107.399

We have exploited the approximate weight sparsity of 5 of the hidden neurons, but most of the remaining 11 hidden neurons are more densely connected.

1112.532

We have produced a handcrafted model with high accuracy, but we have not produced a correspondence between most of the hidden neurons of the trained model and the hidden neurons of the handcrafted model.

1121.802

We have used approximations in our analysis, but have not dealt with the approximation error, which gets increasingly significant as we consider more complex neurons.

1132.154

Fundamentally, even though we have some understanding of the model, our explanation is incomplete because we have not turned this understanding into an adequate mechanistic estimate of the model's accuracy.

1142.047

Ultimately, to produce a mechanistic estimate for the accuracy of this model that is competitive with sampling, or that constitutes a full understanding, we expect we would have to somehow combine this kind of feature analysis with elements of the "brute force after exploiting symmetries" approach used for the models.

1153.123
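For context on what a mechanistic estimate is competing against, here is a minimal sketch of the sampling baseline: estimate accuracy by evaluating the model on random inputs and report the binomial standard error. Everything named here is an assumption for illustration: `sampled_accuracy` is a hypothetical helper, `second_argmax` is a stand-in task that this excerpt does not name, and standard Gaussian inputs of length 10 are assumed.

```python
import math
import random


def second_argmax(xs):
    """Index of the second-largest entry (hypothetical stand-in task)."""
    order = sorted(range(len(xs)), key=lambda i: xs[i], reverse=True)
    return order[1]


def sampled_accuracy(model, target, n_samples, seq_len=10, seed=0):
    """Monte Carlo accuracy estimate with its binomial standard error."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_samples):
        # Assumed input distribution: i.i.d. standard Gaussian entries.
        xs = [rng.gauss(0.0, 1.0) for _ in range(seq_len)]
        correct += model(xs) == target(xs)
    acc = correct / n_samples
    stderr = math.sqrt(acc * (1.0 - acc) / n_samples)
    return acc, stderr


# A model that always answers position 0, used only to exercise the estimator.
acc, stderr = sampled_accuracy(lambda xs: 0, second_argmax, n_samples=2000)
```

The standard error shrinks in proportion to one over the square root of the number of samples, so each additional digit of precision costs roughly 100 times more samples. That cost is the bar a mechanistic estimate would need to beat to be competitive.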
