Zach Furman

👤 Speaker
696 total appearances

Podcast Appearances

LessWrong (Curated & Popular)
"Deep learning as program synthesis" by Zach Furman

A network that has perfectly memorized a lookup table for modular addition computes the same function on a finite domain as a network that has learned the general trigonometric algorithm. Yet we would want to say, emphatically, that they have learned different programs.
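To make the contrast concrete, here is a minimal sketch in Python (the modulus, function names, and implementation details are illustrative assumptions, not the post's actual networks): a memorized lookup table and the trigonometric "clock" algorithm agree on every input in the finite domain, yet they are plainly different programs.

```python
import math

p = 7  # illustrative modulus, not taken from the post

# Program A: a memorized lookup table covering the whole finite domain.
lookup = {(a, b): (a + b) % p for a in range(p) for b in range(p)}

def add_by_lookup(a, b):
    return lookup[(a, b)]

# Program B: the trigonometric "clock" algorithm - encode each input as an
# angle on the unit circle, compose the two rotations via the angle-addition
# identities, then read off which residue the combined angle lands on.
def add_by_trig(a, b):
    ta, tb = 2 * math.pi * a / p, 2 * math.pi * b / p
    cos_sum = math.cos(ta) * math.cos(tb) - math.sin(ta) * math.sin(tb)
    sin_sum = math.sin(ta) * math.cos(tb) + math.cos(ta) * math.sin(tb)
    angle = math.atan2(sin_sum, cos_sum) % (2 * math.pi)
    return round(angle * p / (2 * math.pi)) % p

# Extensionally identical on the finite domain, mechanistically different.
assert all(add_by_lookup(a, b) == add_by_trig(a, b)
           for a in range(p) for b in range(p))
```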

The program is not just the function. It is the underlying mechanism. Thus the notion of a program must depend on parameters, and not just functions, presenting a further conceptual barrier.

To formalize the notion of mechanism, a natural first thought might be to partition the continuous parameter space into discrete regions. In this picture, all the parameter vectors within a region W_A would correspond to the same program A, while vectors in a different region W_B would correspond to program B.
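To make the naive partition picture concrete, here is a minimal sketch (the network, the rule, and the threshold are hypothetical illustrations, not anything from the post): label a parameter setting as belonging to region W_A or W_B according to whether an "extra subroutine", here a single hidden unit, is switched on.

```python
import numpy as np

THRESHOLD = 0.25  # arbitrary illustrative cutoff, not from the post

def program_region(params):
    """Naive partition rule: put parameters in region W_A or W_B depending
    on whether the 'extra' second hidden unit contributes at all."""
    W, v = params
    return "A" if abs(v[1]) > THRESHOLD else "B"

# Two hand-picked settings of a tiny one-hidden-layer network
# f(x) = v . relu(W x): in params_A the second unit is active,
# in params_B it has been zeroed out.
params_A = (np.array([[1.0], [1.0]]), np.array([1.0, 0.5]))
params_B = (np.array([[1.0], [0.0]]), np.array([1.0, 0.0]))

print(program_region(params_A))  # "A"
print(program_region(params_B))  # "B"
```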

But this simple picture runs into a subtle and fatal problem. The very smoothness that makes gradient descent possible works to dissolve any sharp boundaries between programs.

Imagine a continuous path in parameter space from a point that clearly implements program A to a point that clearly implements program B (the narration omits the formulas for these two points). Imagine, say, that A has some extra subroutine that B does not. Because the map from parameters to the function is smooth, the network's behavior must change continuously along this path.
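Continuing the hypothetical toy setup from the sketch above (again an illustration, not the post's example): linearly interpolate the weights between the two settings and watch both the network's output and the threshold-based region label along the path.

```python
import numpy as np

THRESHOLD = 0.25  # same arbitrary cutoff as in the earlier sketch

def network(x, params):
    """Tiny one-hidden-layer network: f(x) = v . relu(W x)."""
    W, v = params
    return float(v @ np.maximum(W @ x, 0.0))

def program_region(params):
    W, v = params
    return "A" if abs(v[1]) > THRESHOLD else "B"

# Endpoints: params_A uses the second hidden unit (the "extra subroutine"),
# params_B has it zeroed out.
params_A = (np.array([[1.0], [1.0]]), np.array([1.0, 0.5]))
params_B = (np.array([[1.0], [0.0]]), np.array([1.0, 0.0]))

x = np.array([1.0])

# Walk a straight line in parameter space from params_A to params_B.
for t in np.linspace(0.0, 1.0, 11):
    params_t = tuple((1 - t) * a + t * b for a, b in zip(params_A, params_B))
    print(f"t = {t:.1f}  f(x) = {network(x, params_t):.3f}  region {program_region(params_t)}")

# The output varies smoothly with t; the region label flips abruptly at
# whatever point the arbitrary threshold happens to single out.
```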

At what exact point on this path did the mechanism switch from A to B? Where did the new subroutine get added? There is no canonical place to draw a line. A sharp boundary would imply a discontinuity that the smoothness of the map from parameters to function seems to forbid.
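One way to state the obstruction in standard topological terms (a sketch with notation introduced here, not taken from the post):

```latex
% W: parameter space; \Phi : W \to \mathcal{F} the smooth (hence continuous)
% map from parameters to implemented functions; \gamma : [0,1] \to W a
% continuous path with \gamma(0) implementing program A and \gamma(1) program B.
%
% Suppose a program label \pi : W \to \{A, B\} were locally constant
% (equivalently, continuous into the discrete set \{A, B\}). Then
\[
  \pi \circ \gamma : [0,1] \longrightarrow \{A, B\}
\]
% would be a continuous map from a connected interval to a discrete set, hence
% constant -- contradicting \pi(\gamma(0)) = A and \pi(\gamma(1)) = B. So any
% sharp partition must cut the path somewhere, while \Phi(\gamma(t)) varies
% continuously and offers no canonical place for that cut.
```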

This is not a simple problem, and it is worth spending some time thinking about how you might try to resolve it in order to appreciate just how hard it is.

What this suggests, then, is that for the program synthesis hypothesis to be a coherent scientific claim, it requires something that does not yet exist: a formal, geometric notion of a space of programs.

This is a rather large gap to fill, and in some ways, this entire post is my long-winded way of justifying such an ambitious mathematical goal.

I won't pretend that my collaborators and I don't have our own ideas about how to resolve this, but the mathematical sophistication required jumps substantially, and those ideas would probably need a full-length post of their own to do them justice.

For now, I will just gesture at some clues which I think point in the right direction.