Zach Furman
And if so, what does that tell us about what deep learning is actually doing?
It's worth noting what was and wasn't in the training data.
The data contained input-output pairs: 32 and 41 give 73, and so on.
It contained nothing about how to compute them.
The network arrived at a method on its own.
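For concreteness, here is a sketch of that trigonometric algorithm as it is usually described in the grokking literature: each residue becomes an angle on a circle, the angle-addition identities combine the two inputs, and the answer is whichever residue's angle best matches the sum. The modulus and frequency below are illustrative choices, not taken from the transcript.

```python
import numpy as np

P = 97  # modulus; grokking experiments typically use a small prime like this

# Training data: every (a, b) -> (a + b) mod P pair,
# with nothing about *how* to compute the answer.
pairs = [((a, b), (a + b) % P) for a in range(P) for b in range(P)]

def trig_add(a, b, freq=1):
    """Modular addition via angles, the 'trigonometric algorithm'."""
    w = 2 * np.pi * freq / P
    # cos/sin of the summed angle, built from the angle-addition identities
    c = np.cos(w * a) * np.cos(w * b) - np.sin(w * a) * np.sin(w * b)
    s = np.sin(w * a) * np.cos(w * b) + np.cos(w * a) * np.sin(w * b)
    # score each candidate answer: this equals cos(w * (a + b - candidate)),
    # which peaks exactly when candidate == (a + b) mod P
    candidates = np.arange(P)
    scores = c * np.cos(w * candidates) + s * np.sin(w * candidates)
    return int(np.argmax(scores))

trig_add(32, 41)  # → 73, matching the example pair above
```

The procedure never stores any individual pair; it recovers every answer from the circle geometry alone, which is why it generalizes to held-out pairs.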
And both solutions, the lookup table and the trigonometric algorithm, fit the training data equally well.
The network's loss was already near minimal during the memorization phase.
Whatever caused it to keep searching, to eventually settle on the generalizing algorithm instead, it wasn't that the generalizing algorithm fit the data better.
It was something else, some property of the learning process that favored one kind of solution over another.
The generalizing algorithm is, in a sense, simpler.
It compresses what would otherwise be thousands of stored associations into a compact procedure.
Whether that's the right way to think about what happened here, whether simplicity is really what the training process favors, is not obvious.
But something made the network prefer a mechanistic solution that generalized over one that didn't, and it wasn't the training data alone.
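To make the compression claim concrete, compare the two solutions side by side for modular addition (the modulus here is an illustrative choice): the memorizing solution stores one answer per input pair, while the generalizing one is a single short rule covering all of them.

```python
P = 97

# Memorizing solution: one stored association per input pair
lookup = {(a, b): (a + b) % P for a in range(P) for b in range(P)}
len(lookup)  # → 9409 stored entries

# Generalizing solution: a compact procedure, no table at all
def add_mod(a, b):
    return (a + b) % P

# Both fit the training data equally well
all(add_mod(a, b) == y for (a, b), y in lookup.items())  # → True
```

Nothing in the training loss distinguishes the two; the difference shows up only in description length and in behavior on inputs the table never saw.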
Vision circuits.
Grokking is a controlled setting, a small network, a simple task, designed to be fully interpretable.
Does the same kind of structure appear in realistic models solving realistic problems?
Olah et al. (2020) studied InceptionV1, an image classification network trained on ImageNet, a dataset of over a million photographs labeled with object categories.
The network takes in an image and outputs a probability distribution over a thousand possible labels: car, dog, coffee mug, and so on.
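That final step, turning the network's raw per-class scores into a probability distribution, can be sketched in a few lines. The logits below are random placeholders; only the class count (1,000, ImageNet's label set) comes from the text.

```python
import numpy as np

def softmax(logits):
    """Map raw scores to a probability distribution."""
    z = logits - logits.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Placeholder scores standing in for a classifier's final-layer output
rng = np.random.default_rng(0)
logits = rng.normal(size=1000)

probs = softmax(logits)          # one probability per label
top = int(np.argmax(probs))      # index of the predicted class
```

Every entry of `probs` is positive and the entries sum to one, which is what lets the output be read as the network's confidence across the thousand labels.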
Can we understand this more realistic setting?