Zach Furman
The field's response was pragmatic.
Scale the methods that work.
Stop trying to understand why they work.
This attitude was partly earned.
For decades, hand-engineered systems encoding human knowledge about vision or language had lost to generic architectures trained on data.
Human intuitions about what mattered kept being wrong.
But the pragmatic stance hardened into something stronger: a tacit assumption that trained networks were intrinsically opaque, and that asking what the weights meant was a category error.
At first glance, this assumption seemed to have some theoretical basis.
If neural networks were best understood as just curve-fitting function approximators, then there was no obvious reason to expect the learned parameters to mean anything in particular.
They were solutions to an optimization problem, not representations.
And when researchers did look inside, they found dense matrices of floating point numbers with no obvious organization.
But a lens that predicts opacity makes the same prediction whether structure is absent or merely invisible.
Some researchers kept looking.
Power et al. (2022): train a small transformer on modular addition.
Given two numbers, output their sum mod 113.
Only a fraction of the possible input pairs are used for training, say, 30%, with the rest held out for testing.
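The setup just described can be sketched in a few lines. This is a minimal illustration of the dataset and split, not the original experiment's code; the modulus 113 and the 30% training fraction come from the text, while names like `TRAIN_FRAC` and the random seed are illustrative choices.

```python
import itertools
import random

P = 113          # modulus from the task: output (a + b) mod P
TRAIN_FRAC = 0.3 # fraction of pairs used for training (from the text)

# Enumerate every possible input pair with its label.
pairs = [((a, b), (a + b) % P) for a, b in itertools.product(range(P), repeat=2)]

# Randomly split into a training set and a held-out test set.
random.seed(0)  # illustrative seed for reproducibility
random.shuffle(pairs)
split = int(TRAIN_FRAC * len(pairs))
train, test = pairs[:split], pairs[split:]

print(len(pairs))  # 113**2 = 12769 total pairs
```

The network only ever sees the `train` pairs; generalization is measured on `test`, the pairs it was never shown.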
The network memorizes the training pairs quickly, getting them all correct.
But on pairs it hasn't seen, it does no better than chance.
This is unsurprising.
With enough parameters, a network can simply store input output associations without extracting any rule.