
Rob Wiblin

👤 Speaker
1910 total appearances

Podcast Appearances

80,000 Hours Podcast
Every AI Company's Safety Plan is 'Use AI to Make AI Safe'. Is That Crazy? | Ajeya Cotra

it necessarily bites harder for getting the AIs to do alignment research than for getting the AIs to do anything else helpful.

Because if they have it out for you, they don't necessarily want to help you shore up your civilization's defenses.

So if you're imagining trying to get a hardened, misaligned AI to help you with biodefense: if it's misaligned and it, for example, wants the option of threatening you with a bioweapon in its arsenal in the future, it would have a similar incentive to do a bad job at that as it would to do a bad job at alignment research.

In general, I think there's one big concern, which is, will the AIs that we're trying to use at that point in time have motivations that give them incentives to undermine the work we're trying to get them to do?

And I think they certainly would have incentives to undermine alignment research if they were misaligned.

But I think they would also have incentives to undermine efforts to make ourselves more rational and thoughtful, such as AI for epistemics.

Because if we're more rational and thoughtful, then maybe we'll realize they're probably misaligned and that would be bad for them.

They would also have an incentive to undermine our d/acc-style defensive efforts, because those would make it harder for them to take over.

So I do think that if you are not worried about alignment at this early stage, everything becomes easier.

It becomes an even more attractive strategy and path.

But I think the canonical "using AI for AI safety" or "using AI for defense" plan does imagine that, at the beginning, we're not sure that they're aligned.

We may not be highly confident that they're extremely misaligned and fully power-seeking and looking to take over at every opportunity, but we're not imagining that we know with confidence we can trust them.

So figuring out how to create a setup where we use control techniques, alignment techniques, interpretability, and whatever other tools are at our disposal to get to the point where we feel good about relying on their outputs is a crucial step.

Because either it bottlenecks our progress, since we're checking everything all the time and slowing things down, or it doesn't bottleneck our progress but we hand the AIs the power to take over.

Yeah, so one obvious one is just AI alignment.

How can we ensure that either these AIs that we're using to help us right now, or future generations of AIs that they help us create, and the generations those AIs help us create in turn, that that whole chain is aligned: motivated to help humans, honest, basically doing what we say, and steerable.