Zach Furman

👤 Speaker
696 total appearances

Podcast Appearances

LessWrong (Curated & Popular)
"Deep learning as program synthesis" by Zach Furman

You don't do something stupid like condition them badly numerically.

And they wanna learn.

They'll do it.

Dario Amodei.

End quote.

I remember when I trained my first neural network, there was something almost miraculous about it.

It could solve problems which I had absolutely no idea how to code myself, for example, how to distinguish a cat from a dog, and in a completely opaque way, such that even after it had solved the problem, I had no better picture of how to solve it myself than I did beforehand.

Moreover, it was remarkably resilient, despite obvious problems with the optimizer, or bugs in the code, or bad training data, unlike any other engineered system I had ever built, almost reminiscent of something biological in its robustness.

My impression is that this sense of magic is a common, if often unspoken, experience among practitioners.

Many simply learn to accept the mystery and get on with the work.

But there is nothing virtuous about confusion; it just suggests that your understanding is incomplete, that you are ignorant of the real mechanisms underlying the phenomenon.

Our practical success with deep learning has outpaced our theoretical understanding.

This has led to a proliferation of explanations that often feel ad hoc and local, tailor-made to account for a specific empirical finding without connecting to other observations or any larger framework.

For instance, the theory of double descent provides a narrative for the U-shaped test-loss curve, but it is a self-contained story.

It does not, for example, share a conceptual foundation with the theories we have for how induction heads form in transformers.

Each new discovery seems to require a new, bespoke theory.

One naturally worries that we are juggling epicycles.

This sense of theoretical fragility is compounded by a second problem.

For any single one of these phenomena, we often lack consensus, entertaining multiple, competing hypotheses.

Consider the core question of why neural networks generalize.