Rob Wiblin

Speaker
3881 total appearances

Podcast Appearances

80,000 Hours Podcast
Can AIs already start 'rogue deployments' inside AI companies? (Landmark new METR report)

The best internal Anthropic models scored near chance on this data set.

And that's just distinguishing good and bad strategies that someone else came up with.

Presumably they would be worse still at coming up with their own approaches from scratch.

This combination of strengths and weaknesses is repeated elsewhere.

METR staff couldn't be described as especially skeptical about AI capabilities in general, but this report definitely incidentally threw some cold water on the notion that AI agents are going to take all of our office jobs any minute now.

METR finds that while they can be extremely strong where progress on a task is easy to measure accurately, they remain weak, sometimes surprisingly so, when it comes to judgment-heavy, messy, self-directed work.

In one recent case, Opus 4.7 was given four days, internet access, and $5,000, and told to make as much money as possible.

Across four runs, it made precisely $0.

In an experiment run by another organization, Andon Labs, an AI was put in charge of a retail shop in San Francisco, where, among other things, it ordered 1,000 toilet seat covers for the employee bathroom and then listed them as merchandise, while scheduling blunders forced the shop to be shut entirely for three days.

That is why the report's conclusion is a narrow one.

A model might be able to start a small rogue deployment if it inherits enough permissions and finds a monitoring gap.

But keeping that deployment hidden from an investigation, that's the kind of thing that requires careful, coordinated, long-horizon operational security.

Exactly the kind of thing current agents are pretty damn bad at.

The report observes that agents seemed much weaker in domains that made hill climbing difficult or risky, often making critical judgment errors that competent humans would have been unlikely to make.

Hill climbing here means that you can make lots of small attempts, get fast feedback about whether each attempt worked, and keep improving from there.

Current agents are much better when they can try something, get a clean signal about how it's going, and have another crack.

Where hill climbing isn't possible, it's much harder to train them, and they perform far worse in deployment as well.

So this is our key source of protection.

How long can we hope that it will remain the case?

Well, the companies really, really want to make their AIs better at forming high-level strategic plans, improving them, noticing where they're failing, adjusting, and so on.