Gwern Branwen

👤 Person
855 total appearances

Podcast Appearances

Dwarkesh Podcast
Gwern Branwen - How an Anonymous Researcher Predicted AI's Trajectory

Then it turns out that everywhere you go, compute and data and trial and error and serendipity just play enormous roles in how things actually happened.

Once you understand that, then you understand why compute comes first.

You can't do trial and error and serendipity without it.

You can write down all these beautiful ideas, but you just can't test them out.

So even a small difference in hyperparameters or a small choice of architecture can make a huge difference to the results.

But when you can only do a few instances, you would typically end up finding that it just doesn't work, or maybe you would give up and you would go away and do something else.

Whereas if you had more compute power, you can just keep trying.

Eventually you hit something that works great.

And once you have a working solution, you can kind of simplify it and improve it and figure out why it worked and get a nice robust solution that would work no matter what you did to it.

But until then you're stuck and you're just kind of like flailing around in this regime where nothing works.

You know, you can have this horrible experience now where you go back through the old deep learning literature and see all these sorts of contemporary ideas that people had back then, which were completely correct, but they didn't have the compute to train what you know would have worked.

You know, and it's tremendously tragic, right?

You go back and you can look at things like ResNets being published back in 1988 instead of 2015.

And it would have worked.

It did work, but at such a small scale that it was irrelevant.

You couldn't use it for anything real, and it just got forgotten.

So you have to wait until 2015 for ResNets to actually come along and be a revolution in deep learning.

So that's kind of the double bias of why you would believe that scaling was not going to work.

Because you didn't notice the results that were key in retrospect, like BigGAN scaling to 300 million images.

I think there's still people today who would tell you with a straight face that GANs can't scale past millions of images.