Scott Alexander

Speaker
4620 total appearances

Podcast Appearances

Dwarkesh Podcast
2027 Intelligence Explosion: Month-by-Month Model — Scott Alexander & Daniel Kokotajlo

You were saying something like 10 years, and we're saying something like one year.

But we are imagining this broad diffusion through the economy, lots of different experiments happening.

Right, so the crucial turning point is mid-2027 when they've basically fully automated the AI R&D process and they've got this corporation within a corporation

you know, the army of geniuses that are like autonomously doing all this research and they're continually being trained to improve their skills, blah, blah, blah, blah, blah.

And they discover concerning evidence that they are misaligned and that they're not actually perfectly loyal to the company and have all the goals that the company wanted them to have, but instead have like various misaligned goals that they must have developed in the course of training.

This evidence, however, is very speculative and inconclusive.

It's stuff like lie detectors going off a bunch.

But maybe the lie detectors are false positives, you know?

So they have some combination of evidence that's concerning, but not by itself a smoking gun.

And then that's our branch point.

So in one of these scenarios, they take that evidence very seriously.

They basically roll back to an earlier version of the model that was a bit dumber and easier to control.

And they build up again from there, but with basically faithful chain-of-thought techniques so that they can, like, watch and see the misalignments.

And then in the other branch of the scenario, they don't do that.

They do some sort of shallow patch that makes the warning signs go away, and then they proceed.

And so what ends up happening is that in one branch, they do end up solving alignment and getting AIs that are actually loyal to them.

It just takes a couple months longer.

And then in the other branch, they sort of go "whee" and end up with AIs that seem to be perfectly aligned to them, but are superintelligent and misaligned and just pretending.

And then in both scenarios, there's then the race with China.

And there's this crazy arms buildup throughout the economy in 2028 as both sides, you know, rapidly try to industrialize basically.