Zach Furman

"Deep learning as program synthesis" by Zach Furman

A network that has perfectly memorized a lookup table for modular addition computes the same function on a finite domain as a network that has learned the general trigonometric algorithm.

LessWrong (Curated & Popular)

Yet we would want to say, emphatically, that they have learned different programs.

The program is not just the function.

It is the underlying mechanism.

Thus the notion of a program must depend on the parameters, and not just on the function they compute, which presents a further conceptual barrier.
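
To make the distinction concrete, here is a minimal sketch in plain Python rather than in an actual network. The modulus `P`, the frequency set `freqs`, and the function names are all assumptions chosen for illustration; the trigonometric routine is a simplified stand-in for the kind of algorithm described above, not a reconstruction of any trained model.

```python
import math

P = 7  # small prime modulus, chosen only for illustration

# Program 1: a memorized lookup table over the finite domain {0..P-1}^2.
LOOKUP = {(a, b): (a + b) % P for a in range(P) for b in range(P)}

def add_by_lookup(a, b):
    # Pure retrieval: no arithmetic happens at query time.
    return LOOKUP[(a, b)]

# Program 2: a trigonometric algorithm (hypothetical simplification).
# Each candidate c is scored by sum_k cos(2*pi*k*(a + b - c) / P),
# which is largest exactly when c == (a + b) mod P.
def add_by_trig(a, b, freqs=(1, 2, 3)):
    def score(c):
        return sum(math.cos(2 * math.pi * k * (a + b - c) / P) for k in freqs)
    return max(range(P), key=score)

# On the finite domain the two programs compute the same function.
same_function = all(
    add_by_lookup(a, b) == add_by_trig(a, b)
    for a in range(P)
    for b in range(P)
)
```

One way to see that the mechanisms differ, even though `same_function` holds, is to step outside the memorized domain: `add_by_trig(10, 12)` still returns a residue mod `P`, while `add_by_lookup(10, 12)` has no entry to retrieve. That is only one symptom of the difference, and the point of the passage is that the difference exists even when no such input is ever tested.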

To formalize the notion of mechanism, a natural first thought might be to partition the continuous parameter space into discrete regions.

In this picture, all the parameter vectors within a region W_A would correspond to the same program A, while vectors in a different region W_B would correspond to program B. But this simple picture runs into a subtle and fatal problem.

The very smoothness that makes gradient descent possible works to dissolve any sharp boundaries between programs.

Imagine a continuous path in parameter space from a point [complex formula omitted from the narration] which clearly implements program A to a point [complex formula omitted from the narration] which clearly implements program B. Imagine, say, that A has some extra subroutine that B does not.

Because the map from parameters to the function is smooth, the network's behavior must change continuously along this path.

At what exact point on this path did the mechanism switch from A to B?

Where did the new subroutine get added?

There is no canonical place to draw a line.

A sharp boundary would imply a discontinuity that the smoothness of the map from parameters to function seems to forbid.
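
The path argument can be illustrated with a toy sketch. Everything here is an assumption made for the example: the three-parameter `model`, the endpoints `W_A` and `W_B`, and the straight-line path between them. In this toy, "program A" has an extra tanh term standing in for the extra subroutine, and "program B" has that term switched off.

```python
import math

def model(w, x):
    # w = (a, b, c): a linear part a*x plus an optional "subroutine" b*tanh(c*x).
    a, b, c = w
    return a * x + b * math.tanh(c * x)

W_A = (1.0, 2.0, 3.0)  # hypothetical parameters where the tanh term is active
W_B = (1.0, 0.0, 3.0)  # hypothetical parameters where the tanh term is off (b = 0)

def interpolate(t):
    # Straight-line path in parameter space from W_A (t = 0) to W_B (t = 1).
    return tuple((1 - t) * p + t * q for p, q in zip(W_A, W_B))

XS = [i / 10 for i in range(-20, 21)]  # fixed probe inputs in [-2, 2]

def largest_step_change(n_steps):
    # Largest change in output, over the probe inputs, between
    # consecutive points when the path is cut into n_steps pieces.
    worst = 0.0
    for i in range(n_steps):
        w0 = interpolate(i / n_steps)
        w1 = interpolate((i + 1) / n_steps)
        change = max(abs(model(w1, x) - model(w0, x)) for x in XS)
        worst = max(worst, change)
    return worst
```

Calling `largest_step_change` with more steps gives a smaller value: the finer the path is sampled, the smaller every change in behavior becomes, so no single step stands out as the moment the subroutine disappears. The toy only shows this for one smooth model and one path; it does not settle whether some other criterion could pick out a boundary.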

This is not so simple a problem, and it is worth spending some time thinking about how you might try to resolve it to appreciate that.

What this suggests, then, is that for the program synthesis hypothesis to be a coherent scientific claim, it requires something that does not yet exist: a formal, geometric notion of a space of programs.

This is a rather large gap to fill, and in some ways, this entire post is my long-winded way of justifying such an ambitious mathematical goal.

I won't pretend that my collaborators and I don't have our own ideas about how to resolve this, but the mathematical sophistication required jumps substantially, and those ideas would probably require their own full-length post to do them justice.

For now, I will just gesture at some clues which I think point in the right direction.
