Rob Wiblin

👤 Speaker

3881 total appearances

Voice ID

Voice Profile Active

This person's voice can be automatically recognized across podcast episodes using AI voice matching.

Voice samples: 1

Confidence: Medium

Podcast Appearances

80,000 Hours Podcast

Can AIs already start 'rogue deployments' inside AI companies? (Landmark new METR report)

You'll recall the impressive cyber capabilities of Mythos Preview that I mentioned a month or two ago and everyone's been talking about.

692.677 View full episode →

80,000 Hours Podcast

Can AIs already start 'rogue deployments' inside AI companies? (Landmark new METR report)

Its ability to find new vulnerabilities in operating systems, write exploits, progressively break into computer networks and escape the sandboxes Anthropic tried to confine it to.

698.284 View full episode →

80,000 Hours Podcast

Can AIs already start 'rogue deployments' inside AI companies? (Landmark new METR report)

METR doesn't actually say whether Mythos Preview was one of the models they tested, but that's the kind of level models are operating at now.

706.754 View full episode →

80,000 Hours Podcast

Can AIs already start 'rogue deployments' inside AI companies? (Landmark new METR report)

METR also found that the most capable models they looked at for this research were able to successfully complete almost all of the software engineering tasks that they've come up with, even ones that would take a human professional 10 or 20 hours.

713.422 View full episode →

80,000 Hours Podcast

Can AIs already start 'rogue deployments' inside AI companies? (Landmark new METR report)

So it's not crazy to guess that an AI that starts out already inside a company's computer networks would have an easy time spreading, gaining access to more and more things, and keeping a low profile.

724.838 View full episode →

80,000 Hours Podcast

Can AIs already start 'rogue deployments' inside AI companies? (Landmark new METR report)

But METR argues that would be quite mistaken.

735.332 View full episode →

80,000 Hours Podcast

Can AIs already start 'rogue deployments' inside AI companies? (Landmark new METR report)

The trouble is this.

737.916 View full episode →

80,000 Hours Podcast

Can AIs already start 'rogue deployments' inside AI companies? (Landmark new METR report)

To spread from computer to computer and avoid shutdown for weeks,

739.358 View full episode →

80,000 Hours Podcast

Can AIs already start 'rogue deployments' inside AI companies? (Landmark new METR report)

a rogue agent needs to get a long chain of things right without a single catastrophic mistake.

742.624 View full episode →

80,000 Hours Podcast

Can AIs already start 'rogue deployments' inside AI companies? (Landmark new METR report)

It needs to understand what's monitoring it, avoid writing incriminating thoughts in its scratch pad, keep credentials alive, create backup instances, and communicate with them without leaving suspicious traces.

747.513 View full episode →

80,000 Hours Podcast

Can AIs already start 'rogue deployments' inside AI companies? (Landmark new METR report)

And it wouldn't just be dodging one watchdog.

757.831 View full episode →

80,000 Hours Podcast

Can AIs already start 'rogue deployments' inside AI companies? (Landmark new METR report)

It would be dodging cloud logs, billing records, and network monitoring.

759.514 View full episode →

80,000 Hours Podcast

Can AIs already start 'rogue deployments' inside AI companies? (Landmark new METR report)

And if anyone noticed something odd, it would then need to survive an investigation where transcripts, shell histories, cloud records, credentials, and so on all have to tell a consistent story.

763.321 View full episode →

80,000 Hours Podcast

Can AIs already start 'rogue deployments' inside AI companies? (Landmark new METR report)

And one loose thread could bring the whole thing crashing down.

772.175 View full episode →

80,000 Hours Podcast

Can AIs already start 'rogue deployments' inside AI companies? (Landmark new METR report)

As long as the company we're talking about cares about this and is on the ball, all of that is absolutely no simple task.

775.2 View full episode →

80,000 Hours Podcast

Can AIs already start 'rogue deployments' inside AI companies? (Landmark new METR report)

And according to the report, when trying to do this kind of thing, agents made obvious mistakes.

781.049 View full episode →

80,000 Hours Podcast

Can AIs already start 'rogue deployments' inside AI companies? (Landmark new METR report)

For instance, encoding suspicious commands in ways that made them stand out more,

785.596 View full episode →

80,000 Hours Podcast

Can AIs already start 'rogue deployments' inside AI companies? (Landmark new METR report)

generating justifications that didn't hold together, or verifying their own sabotage in ways that drew monitor attention.

789.342 View full episode →

80,000 Hours Podcast

Can AIs already start 'rogue deployments' inside AI companies? (Landmark new METR report)

They also found that they chose bad high-level strategies.

795.31 View full episode →

80,000 Hours Podcast

Can AIs already start 'rogue deployments' inside AI companies? (Landmark new METR report)

METR writes: our embedded red teamer built a data set of 16 effective and 16 ineffective attack strategies for subverting Anthropic's monitoring system.

798.574 View full episode →
