Menu
Sign In Search Podcasts Charts People & Topics Add Podcast API Pricing
Podcast Image

BlueDot Narrated

AI Control: Improving Safety Despite Intentional Subversion

04 Jan 2025

Description

We’ve released a paper, AI Control: Improving Safety Despite Intentional Subversion. This paper explores techniques that prevent AI catastrophes even if AI instances are colluding to subvert the safety techniques. In this post:We summarize the paper;We compare our methodology to the methodology of other safety papers.Source:https://www.alignmentforum.org/posts/d9FJHawgkiMSPjagR/ai-control-improving-safety-despite-intentional-subversionNarrated for AI Safety Fundamentals by Perrin WalkerA podcast by BlueDot Impact.Learn more on the AI Safety Fundamentals website.

Audio
Featured in this Episode

No persons identified in this episode.

Transcription

This episode hasn't been transcribed yet

Help us prioritize this episode for transcription by upvoting it.

0 upvotes
🗳️ Sign in to Upvote

Popular episodes get transcribed faster

Comments

There are no comments yet.

Please log in to write the first comment.