Zach Furman

"Deep learning as program synthesis" by Zach Furman

Stochastic gradient descent is not a neutral explorer.

The implicit biases of stochastic optimization when navigating a highly overparameterized loss landscape may create powerful channels that funnel the learning process toward a specific kind of simple, compositional solution.

LessWrong (Curated & Popular)

Perhaps all roads do not lead to Rome, but the roads to Rome are the fastest.

The convergence could therefore be a clue about the nature of our learning dynamics themselves: that they possess a strong, intrinsic preference for a particular class of solutions.

Viewed together, these observations suggest that the space of effective solutions for real-world tasks is far smaller and more structured than the space of possible models.

The phenomenon of convergence indicates that our models are finding this structure.

The Bitter Lesson suggests that our learning methods are general enough to do so.

The remaining questions point us toward the precise nature of that structure and the mechanisms by which our learning algorithms are so remarkably good at finding it.

The Path Forward

If you followed the argument this far, you might already sense where it becomes difficult to make precise.

The mechanistic interpretability evidence shows that networks can implement compositional algorithms.

The indirect evidence suggests this connects to why they generalize, scale, and converge.

But "connects to" is doing a lot of work.

What would it actually mean to say that deep learning is some form of program synthesis?

Trying to answer this carefully leads to two problems.

The claim "neural networks learn programs" seems to require saying what a program even is in a space of continuous parameters.

It also requires explaining how gradient descent could find such programs efficiently, given what we know about the intractability of program search.

These are the kinds of problems where the difficulty itself is informative.

Each has a specific shape: what you need to think about, and what a resolution would need to provide.

I focus on them deliberately.

That shape is what eventually pointed me towards specific mathematical tools I wouldn't have considered otherwise.
