I'm sure if it were Daniel or Eli, they would have already made like five supplements on this.
But for me, I'm just kind of agnostic as to whether we get to that alignment solution. In our scenario, I think we focus on mechanistic interpretability.
Once we can really understand the weights of an AI on a deep level, then we have a lot of alignment techniques open up to us.
I don't really have a great sense of whether we get that before or after the AI has become completely uncontrollable.
I mean, a big part of that relies on the things we're talking about.
How smart are the labs?
How carefully do they work on controlling the AI?
How long do they spend making sure the AI is actually under control and the alignment plan it gave us is actually correct rather than something it's trying to use to deceive us?
All of those things I'm completely agnostic on, but that leaves a pretty big chunk of probability space where we just do okay.
And I admit that my p(doom) is literally just p(doom) and not p(doom or oligarchy).
So that 80% of scenarios where we survive contains a lot of really bad things that I'm not happy about, but I do think that we have a pretty good chance of surviving.
We expect that as the AI labs become more capable, they tell the government about this because they want government contracts and government support.
Eventually it reaches the point where the government is extremely impressed.
In our scenario, that starts with cyber warfare.
The government sees that these AIs are now as capable as the best human hackers and can be deployed at huge, humongous scale.
So they become extremely interested and they discuss nationalizing the AI companies.