Rob Wiblin

👤 Speaker

3881 total appearances

Voice ID

Voice Profile Active

This person's voice can be automatically recognized across podcast episodes using AI voice matching.

Voice samples: 1

Confidence: Medium

Podcast Appearances

80,000 Hours Podcast

Can AIs already start 'rogue deployments' inside AI companies? (Landmark new METR report)

Companies basically, they mostly don't think that this sort of collusion is an issue yet.

603.92

So they're not doing that kind of testing at scale, not so far.

607.673

But in future, they could start if collusion became a live concern.

611.306

Anyway, what was the red teamers' bottom line on opportunity?

615.31

In short, Anthropic's monitoring isn't sufficient to have a high probability of picking up a deeply misaligned Claude taking rogue actions in any specific case.

618.737

But to get anything meaningful done, you have to take a lot of actions.

626.934

And soon enough, the model would get busted.

629.98

David estimates within a few days or at most weeks.

631.824

Before it was caught, the model might be able to take some small early steps like inserting exploits or vulnerabilities that make larger sabotage actions easier later on.

635.493

But once it was caught, these would, for the most part, be identified and rolled back.

644.227

So there's a window of opportunity, but it's not a wide open window.

648.574

And just to be clear, you shouldn't read this as suggesting Anthropic is somehow uniquely vulnerable here.

653.062

They're just the first ones to invite an external analyst in to stress test their systems.

657.288

Okay, that's opportunity.

662.425

How about means?

664.167

Are the models good enough to pull off an attack like this?

665.448

Well, the models are good enough to grab roughly the amount of compute the human user can access, set up machines to run agents using that compute, and hide that from casual user oversight.

668.271

That's why the report concludes that internal agents have the means to start minimal rogue deployments at all companies.

677.04

But how about expanding those deployments, spreading across computers, avoiding detection, or even evading shutdown once detected?

682.726

I would have thought AI models would find that kind of thing pretty easy.

688.953