
Zach Furman

Speaker
696 total appearances


Podcast Appearances

LessWrong (Curated & Popular)
"Deep learning as program synthesis" by Zach Furman

Stochastic gradient descent is not a neutral explorer.

The implicit biases of stochastic optimization when navigating a highly overparameterized loss landscape may create powerful channels that funnel the learning process toward a specific kind of simple, compositional solution.
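A minimal sketch of this idea (my own toy illustration, not from the episode): in overparameterized linear regression, infinitely many weight vectors fit the data exactly, yet full-batch gradient descent started from zero converges to one particular solution, the minimum-norm interpolant. This is a simple, well-known instance of an optimizer's implicit bias funneling learning toward a specific kind of solution.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 100                        # 20 data points, 100 parameters
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

w = np.zeros(d)                       # zero initialization matters here
lr = 0.01
for _ in range(20_000):
    w -= lr * X.T @ (X @ w - y) / n   # full-batch gradient step

# The minimum-norm least-squares solution, via the pseudoinverse.
w_min_norm = np.linalg.pinv(X) @ y

print(np.max(np.abs(X @ w - y)))      # training error: ~0 (interpolates)
print(np.linalg.norm(w - w_min_norm)) # distance to min-norm solution: ~0
```

The gradient always lies in the row space of `X`, so starting from zero the iterates never leave it, and the unique interpolant in that subspace is exactly the minimum-norm solution. Nothing in the loss prefers it; the dynamics do.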

Perhaps all roads do not lead to Rome, but the roads to Rome are the fastest.

The convergence could therefore be a clue about the nature of our learning dynamics themselves: that they possess a strong, intrinsic preference for a particular class of solutions.

Viewed together, these observations suggest that the space of effective solutions for real-world tasks is far smaller and more structured than the space of possible models.

The phenomenon of convergence indicates that our models are finding this structure.

The bitter lesson suggests that our learning methods are general enough to do so.

The remaining questions point us toward the precise nature of that structure and the mechanisms by which our learning algorithms are so remarkably good at finding it.

The Path Forward

If you followed the argument this far, you might already sense where it becomes difficult to make precise.

The mechanistic interpretability evidence shows that networks can implement compositional algorithms.

The indirect evidence suggests this connects to why they generalize, scale, and converge.

But "connects" is doing a lot of work.

What would it actually mean to say that deep learning is some form of program synthesis?

Trying to answer this carefully leads to two problems.

The claim "neural networks learn programs" seems to require saying what a program even is in a space of continuous parameters.

It also requires explaining how gradient descent could find such programs efficiently, given what we know about the intractability of program search.
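For intuition on the intractability point (a toy illustration of my own, not from the episode): the space of candidate programs grows exponentially with program length, so exhaustive search is hopeless even for a tiny hypothetical language.

```python
# Exhaustive program search is exponential: with a vocabulary of k
# tokens, there are k**n candidate programs of length n to enumerate.
ALPHABET_SIZE = 50   # hypothetical token vocabulary for a tiny language

for length in (5, 10, 20, 40):
    count = ALPHABET_SIZE ** length
    print(f"programs of length {length:>2}: {count:.2e} candidates")
```

Already at length 10 there are roughly 10^17 candidates, far beyond brute force, which is what makes gradient descent's apparent success at finding program-like solutions so puzzling.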

These are the kinds of problems where the difficulty itself is informative.

Each has a specific shape: what you need to think about, and what a resolution would need to provide.

I focus on them deliberately.

That shape is what eventually pointed me towards specific mathematical tools I wouldn't have considered otherwise.