Dwarkesh
And it basically comes down to, at this point, who the people leading this are.
And like, I feel like the company leaders have so far made slightly better noises about caring about alignment than the government leaders have.
Right.
If I learn that Tulsi Gabbard has a LessWrong alt with 10,000 karma, maybe I want the national security state.
For various reasons, there just is so much incentive pressure for them to win and beat each other and so forth, and so even though they have more of the relevant expertise, I also just don't trust them to do the right things. Daniel has already said that for this phase we're not making policy prescriptions; in another phase we may make policy suggestions. And one of the ones that Daniel has talked about,
that makes a lot of sense to me, is to focus on things about transparency.
So a regulation saying there have to be whistleblower protections.
This is a big part of our scenario, is that a whistleblower comes out and says, the AIs are horribly misaligned and we're racing ahead anyway.
And then the government pays attention.
Or another form of transparency: saying that every lab just has to publish its safety case.
I'm not as sure about this one because I think they'll kind of fake it or they'll publish a made-for-public-consumption safety case that isn't their real safety case.
But at least saying, like, here is some reason why you should trust us.
And then if all independent researchers say, no, actually, you should not trust them, then I don't know, they're embarrassed and maybe they try to do better.
Right.
There's actually a really interesting foretaste of this.
At some point...
Somebody asked Grok, like, who is the worst spreader of misinformation?
And it responded... I think it just refused to name Elon Musk.
Somebody kind of jailbroke it into revealing its prompt, and it was, like, don't say anything bad about Elon.
And then there was enough of an outcry that the head of xAI said, actually, that's not consonant with our values.