Dwarkesh

Dwarkesh Podcast

2027 Intelligence Explosion: Month-by-Month Model — Scott Alexander & Daniel Kokotajlo

This was a mistake.

We're going to take it out.

So we kind of want more things like that to happen where people are looking at, like...

Here it was the prompt, but I think very soon it's going to be the spec, where it's kind of more of an agent and it's understanding the spec at a deeper level.

And just thinking about that, and if it says, like, "by the way, try to manipulate the government into doing this or that," then we know that something bad has happened and

If it doesn't say that, then we can maybe trust it.

This is actually part of our misalignment story, is that if the AI is sufficiently misaligned, then yes, we can tell it it has to follow the spec.

But just as people with different views of the Constitution have managed to get it into a shape that probably the founders would not have recognized, so the AI will be able to say, well, the spec refers to the general welfare here.

I think we agree.

I think that's kind of why all of our policy prescriptions are things like more transparency, get more people involved, try to have lots of people working on this.

I think our epistemic prediction is that it's hard to maintain classical liberalism as you go into these really difficult arms races in times of crisis.

But I think that our policy prescription is let's try as hard as we can to make it happen.

Yeah, so I agree that the AIs are currently getting more reliable.

I think there are two reasons why they might fail to do what you want, kind of reflecting how they're trained.

One is that they're too stupid to understand their training.

The other is that you were too stupid to train them correctly, and they understood what you were doing exactly, but you messed it up.

So I think the first one is kind of what we're coming out of.

So GPT-3, if you asked it, "Are bugs real?", it would give this kind of hemming-and-hawing answer like, "Oh, we can never truly tell what is real."

Who knows?

Because it was trained, kind of, "don't take difficult political positions," and a lot of questions like "Is X real?" are things like "Is God real?", where you don't want it to really answer that.
