
Zach Furman

Podcast Appearances

LessWrong (Curated & Popular)
"Deep learning as program synthesis" by Zach Furman

The parameter-function map is precisely the same object responsible for the mystery discussed in the representation section. This is not an airtight argument. It depends on the empirical question of whether optimization details other than the loss function can be ignored or treated as second-order effects, and on whether the hand-wavy argument for the importance of the parameter-function map over the function-space loss is solid. Even if one assumes this argument is valid, we have merely located the mystery, not resolved it. The question remains: what properties of the parameter-function map make targets learnable?
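
To make the object concrete, here is a minimal sketch of a parameter-function map, assuming a toy two-layer ReLU network (the architecture, dimensions, and names are illustrative choices, not from the talk):

```python
import numpy as np

def parameter_function_map(theta, d_in=2, d_hidden=3):
    """Map a flat parameter vector to the function a tiny ReLU MLP computes.

    Illustrative only: a two-layer, scalar-output network,
    f_theta(x) = w2 @ relu(W1 @ x + b1) + b2.
    """
    # Unpack the flat parameter vector into weights and biases.
    i = 0
    W1 = theta[i:i + d_hidden * d_in].reshape(d_hidden, d_in); i += d_hidden * d_in
    b1 = theta[i:i + d_hidden]; i += d_hidden
    w2 = theta[i:i + d_hidden]; i += d_hidden
    b2 = theta[i]

    def f(x):
        return w2 @ np.maximum(W1 @ x + b1, 0.0) + b2

    return f  # a point in parameter space picks out a point in function space

rng = np.random.default_rng(0)
theta = rng.normal(size=2 * 3 + 3 + 3 + 1)  # 13 parameters for this toy shape
f = parameter_function_map(theta)
print(f(np.array([1.0, -0.5])))
```

Note that distinct parameter vectors can map to the same function (permuting hidden units, for instance), which is part of what makes this map nontrivial to reason about.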

At this point the reasoning becomes more speculative, but I will sketch some ideas. The representation section concerned what structure the map encodes at each point in parameter space. Learnability appears to depend on something further: the structure of paths between points. Convexity of the function-space loss implies that paths which are sufficiently straight in function space are barrier-free; roughly, if the endpoint has lower loss, the entire path is downhill.
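
The convexity step follows in one line. For a convex loss L on function space and the straight path f_t = (1 - t) f_0 + t f_1 (notation mine, not from the talk):

```latex
L(f_t) = L\bigl((1-t)\,f_0 + t\,f_1\bigr)
       \le (1-t)\,L(f_0) + t\,L(f_1)
       \le L(f_0)
\qquad \text{whenever } L(f_1) \le L(f_0),
```

so the loss along the straight path never rises above its starting value: no barrier.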

So the question becomes: which function-space paths does the map provide?
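
One way this could be probed, sketched here under the same toy setup as above (the target, probe inputs, and mean-squared-error loss are all invented for illustration): walk a straight line in parameter space and record the loss of the induced function-space path.

```python
import numpy as np

rng = np.random.default_rng(1)
xs = rng.normal(size=(32, 2))           # probe inputs
y_star = np.maximum(xs[:, 0], 0.0)      # hypothetical target f*(x) = relu(x_1)

def f_outputs(theta):
    """Outputs of the toy ReLU MLP from the earlier sketch on the probe inputs;
    this is the induced point in function space, seen through finitely many probes."""
    W1 = theta[:6].reshape(3, 2); b1 = theta[6:9]; w2 = theta[9:12]; b2 = theta[12]
    return np.maximum(xs @ W1.T + b1, 0.0) @ w2 + b2

def loss(theta):
    return float(np.mean((f_outputs(theta) - y_star) ** 2))

theta0, theta1 = rng.normal(size=13), rng.normal(size=13)
ts = np.linspace(0.0, 1.0, 11)
losses = [loss((1 - t) * theta0 + t * theta1) for t in ts]

# Any value above max(losses[0], losses[-1]) is a barrier: the straight
# parameter-space segment maps to a function-space path that is not
# straight enough to inherit the convexity guarantee.
print([round(l, 3) for l in losses])
```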

The same architectures successfully learn many diverse real-world targets. Whatever property of the map enables this, it must be relatively universal, not tailored to specific targets. This naturally leads us to ask in which cases the parameter-function map provides direct enough paths to targets with a given structure, and to characterize what "direct enough" means.

This connects back to the representation problem. If the map encodes some notion of program structure, then path structure in parameter space induces relationships between programs: which programs are adjacent, which are reachable from which. The representation section asks how programs are encoded as points; learnability asks how they are connected as paths. These are different aspects of the same object.
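
Purely as an illustration of the adjacency/reachability framing (the programs and edges below are invented, not from the talk), one could treat programs as nodes, link two when a barrier-free parameter path connects their implementations, and ask what is reachable from what:

```python
from collections import deque

# Hypothetical adjacency: program A is linked to program B if some
# barrier-free parameter path connects implementations of A and B
# (e.g. as judged by a path-loss check like the one sketched above).
adjacent = {
    "identity": {"copy-first-coord"},
    "copy-first-coord": {"identity", "relu-of-first-coord"},
    "relu-of-first-coord": {"copy-first-coord"},
    "parity": set(),  # isolated: no barrier-free path found to it
}

def reachable(start):
    """Programs reachable from `start` by chaining barrier-free paths."""
    seen, frontier = {start}, deque([start])
    while frontier:
        for nxt in adjacent[frontier.popleft()]:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen

print(reachable("identity"))  # identity, copy-first-coord, relu-of-first-coord
print(reachable("parity"))    # only parity itself
```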

One hypothesis.