I don't think it necessarily bites harder for getting the AIs to do alignment research than for getting the AIs to do anything else helpful.
Because if they have it out for you, they don't necessarily want to help you shore up your civilization's defenses.
So if you're imagining trying to get a hardened, misaligned AI to help you with biodefense, and it, for example, wants to keep the option of threatening you with a bioweapon in the future, it would have just as much incentive to do a bad job at that as it would to do a bad job at alignment research.
In general, I think there's one big concern, which is: will the AIs that we're trying to use at that point in time have motivations that give them incentives to undermine the work we're trying to get them to do?
And I think they certainly would have incentives to undermine alignment research if they were misaligned.
But I think they would also have incentives to undermine efforts to make ourselves more rational and thoughtful, such as AI for epistemics. Because if we're more rational and thoughtful, then maybe we'll realize they're probably misaligned, and that would be bad for them.
They would also have an incentive to undermine our d/acc-style defensive efforts, because that would make it harder for them to take over.
So I do think that if you're not worried about alignment at this early stage, everything becomes easier. The whole plan becomes an even more attractive strategy and path.
But I think the canonical "using AI for AI safety" or "using AI for defense" plan does imagine that we're not sure at the beginning that they're aligned.
We may not be highly confident that they're extremely misaligned and fully power-seeking and looking to take over at every opportunity, but we're not imagining that we know with confidence we can trust them.
So figuring out how to create a setup where we use control techniques, alignment techniques, interpretability, and whatever other tools we have at our disposal to get to the point where we feel good about relying on their outputs is a crucial step.
Because either it bottlenecks our progress, because we're checking everything all the time and slowing things down, or it doesn't bottleneck our progress, but we've handed the AIs the power to take over.
Yeah, so one obvious one is just AI alignment.
How can we ensure that the whole chain, from these AIs we're using to help us right now, to the future generations of AIs they help us create, to the generations those AIs help create in turn, is aligned: motivated to help humans, honest, basically doing what we say, and steerable?