Scott Alexander

Speaker
4620 total appearances

Podcast Appearances

Dwarkesh Podcast
2027 Intelligence Explosion: Month-by-Month Model - Scott Alexander & Daniel Kokotajlo

And you're also slapping on some sort of alignment training as well.

We don't know what actual goals will end up inside the AIs and what the sort of internal structure of that will be like, what goals will be instrumental versus terminal.

We have a couple different hypotheses and we like picked one for purposes of telling the story.

I'm happy to go into more detail if you want about like the mechanistic details of the particular hypothesis we picked or like the different alternative hypotheses that we didn't depict in the story that like also seem plausible to us.

Things like this do happen pretty frequently, so...

OpenAI just also had a paper about the hacking stuff where it's literally in the chain of thought, like, let's hack, you know?

And also, anecdotally, me and a bunch of friends have found that the models often seem to just double down on their BS.

There's a mounting pile of evidence that at least some of the time they are just actually lying.

They know that what they're doing is not what you wanted and they're doing it anyway.

I think there's a mounting pile of evidence that that does happen.

I think I'd also mention the homogeneity point.

Like, you know, any group of humans, even if they're all, like, exact same race and gender, is, like, going to be much more diverse than the army of AIs on the data center because they'll be mostly, like, literal copies of each other, you know?

And I think that counts for a lot.

Another thing I was going to mention is that, like, and our scenario doesn't really explore this.

I think in our scenario they're more of, like, a monolith, but...

Historically, a lot of crazy conquests happened from groups that were not at all monoliths.

And I've been heavily influenced by reading the history of the conquistadors, which you may know about.

But did you know that when Cortes took over Mexico, he had to pause halfway through, go back to the coast, and fight off a larger Spanish expedition that was sent to arrest him?

So, like, the Spanish were fighting each other in the middle of the conquest of Mexico.

Similarly, in the conquest of Peru, Pizarro was replicating Cortes' strategy, which, by the way, was go get a meeting with the emperor and then kidnap the emperor and force him at sword point to...