This is also where the post will shift register.
The remaining sections sketch the structure of these problems and gesture at why certain mathematical frameworks (singular learning theory, algebraic geometry, and so on) might become relevant.
I won't develop these fully here; that requires machinery far beyond the scope of a single blog post. But I want to show why you'd need to leave shore at all, and what you might find out in open water.
The Representation Problem
The program synthesis hypothesis posits a relationship between two fundamentally different kinds of mathematical objects.
On one hand, we have programs.
A program is a discrete and symbolic object.
Its identity is defined by its compositional structure, a graph of distinct operations.
A small change to this structure, like flipping a comparison or replacing an addition with a subtraction, can create a completely different program with discontinuous, global changes in behavior.
The space of programs is discrete.
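To make this discreteness concrete, here is a toy sketch (the functions and names are my own illustration, not drawn from any real system): two programs that differ by a single flipped comparison, yet compute globally different functions.

```python
def clamp_below(x, threshold=0.0):
    # Return x unless it falls below the threshold.
    return x if x > threshold else threshold

def clamp_above(x, threshold=0.0):
    # Structurally identical, except the comparison is flipped.
    return x if x < threshold else threshold

print(clamp_below(5.0))  # 5.0
print(clamp_above(5.0))  # 0.0: one flipped token, qualitatively different behavior
```

There is no path of intermediate programs between these two; the change is all-or-nothing.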
On the other hand, we have neural networks.
A neural network is defined by its parameter space, a continuous vector space of real valued weights.
The function a network computes is a smooth, or at least piecewise smooth, function of these parameters.
This smoothness is the essential property that allows for learning via gradient descent, a process of infinitesimal steps along a continuous loss landscape.
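By contrast, here is a minimal sketch of the continuous side (a made-up two-layer network, purely illustrative): nudging a single weight changes the output by an amount that shrinks with the size of the nudge.

```python
import numpy as np

def tiny_net(x, W1, W2):
    # One hidden layer with a smooth activation, one linear output.
    return W2 @ np.tanh(W1 @ x)

rng = np.random.default_rng(0)
x = rng.normal(size=3)
W1 = rng.normal(size=(4, 3))
W2 = rng.normal(size=(1, 4))

for eps in (1e-1, 1e-2, 1e-3):
    W1_nudged = W1.copy()
    W1_nudged[0, 0] += eps  # perturb a single parameter
    delta = tiny_net(x, W1_nudged, W2) - tiny_net(x, W1, W2)
    print(eps, abs(delta[0]))  # the output change shrinks with the perturbation
```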
This presents an apparent type mismatch.
How can a continuous process in a continuous parameter space give rise to a discrete, structured program?
The problem is deeper than it first appears.
To see why, we must first be precise about what we mean when we say a network has learned a program.
It cannot simply be about the input-output function the network computes.
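One way to see why: two programs can compute exactly the same input-output function while having completely different internal structure. The pair below is a hypothetical example, but the point is general.

```python
def fib_recursive(n):
    # Fibonacci by direct recursion on the defining relation.
    return n if n < 2 else fib_recursive(n - 1) + fib_recursive(n - 2)

def fib_iterative(n):
    # The same function, computed by iterating a pair of accumulators.
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

# Identical input-output behavior, very different programs.
assert all(fib_recursive(n) == fib_iterative(n) for n in range(15))
```

If learning a program meant only matching the input-output function, these two would be indistinguishable.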