Zach Furman
The majority of the ideas in this post are not individually novel.
I see the core value proposition as synthesizing them together in one place.
The ideas I express here are, in my experience, very common among researchers at frontier labs, in mechanistic interpretability, in the science of deep learning, and elsewhere.
In particular, the core hypothesis that deep learning is performing some tractable version of Solomonoff induction is not new, and has been written about many times.
However, I would not consider it to be a popular or accepted opinion within the machine learning field at large.
Personally, I have considered a version of this hypothesis for around three years.
With this post, I aim to share a more comprehensive synthesis of the evidence for this hypothesis as well as point to specific research directions for formalizing this idea.
Below is an incomplete list of what is known and published in various areas.
Existing comparisons between deep learning and program synthesis
The ideas surrounding Solomonoff induction have been highly motivating for many early AGI-focused researchers.
Shane Legg, DeepMind co-founder, wrote his PhD thesis on Solomonoff induction.
John Schulman, OpenAI co-founder, discusses the connection to deep learning explicitly here.
Ilya Sutskever, OpenAI co-founder, has been giving talks on related ideas.
There are a handful of places one can find a hypothesized connection between deep learning and Solomonoff induction stated explicitly, though I do not believe any of these were the first to do so.
My personal experience is that such intuitions are fairly common among, for example, people working at frontier labs, even if they are not published in writing.
I am not sure who had the idea first, and suspect it was arrived at independently multiple times.
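To make the hypothesis concrete, it helps to recall what Solomonoff induction actually does: it predicts the next symbol of a sequence by averaging over all programs consistent with the data so far, weighting each program by 2^(-length). True Solomonoff induction is incomputable, but a crude, computable caricature can be sketched in a few lines. The toy below is entirely my own illustration (the restriction to repeating-pattern "programs" and the `max_len` cutoff are simplifying assumptions, not part of any formal definition):

```python
from itertools import product

def hypotheses(max_len=8):
    # Toy "programs": each hypothesis is a finite bit pattern repeated forever.
    # Its description length is just the number of bits in the pattern.
    for length in range(1, max_len + 1):
        for pattern in product("01", repeat=length):
            yield "".join(pattern)

def predict_next(observed, max_len=8):
    # Weight each hypothesis consistent with the observed prefix by the
    # Solomonoff-style prior 2^(-length), then sum the weight each
    # hypothesis assigns to the next bit.
    weights = {"0": 0.0, "1": 0.0}
    for pat in hypotheses(max_len):
        # Unroll the pattern far enough to cover the prefix plus one bit.
        stream = (pat * (len(observed) // len(pat) + 2))[: len(observed) + 1]
        if stream.startswith(observed):
            weights[stream[len(observed)]] += 2.0 ** (-len(pat))
    total = weights["0"] + weights["1"]
    return {bit: w / total for bit, w in weights.items()}

probs = predict_next("010101")
# The shortest consistent pattern, "01", dominates the posterior,
# so the predictor strongly favors "0" as the next bit.
```

The point of the sketch is only the shape of the inference: a simplicity prior over programs, conditioned on data. The hypothesis discussed in this post is that deep learning implements something functionally analogous, with the network's parameter space standing in for the space of programs.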
Feature Learning
It would not be accurate to say that the average ML researcher views deep learning as a complete black-box algorithm.
It is well accepted and uncontroversial that deep neural networks extract features from their training data, which they then use to perform the task well.
However, it is a further step to claim that these features are actually extracted and composed in some mechanistic fashion resembling a computer program.