Doom Debates

We Found AI's Preferences — What David Shapiro MISSED in this bombshell Center for AI Safety paper

21 Feb 2025

Description

The Center for AI Safety just dropped a fascinating paper: they discovered that today’s AIs like GPT-4 and Claude have preferences! As in, coherent utility functions. We knew this was inevitable, but we didn’t know it was already happening.

This episode has two parts:

In Part I (48 minutes), I react to David Shapiro’s coverage of the paper and push back on many of his points.

In Part II (60 minutes), I explain the paper myself.

00:00 Episode Introduction
05:25 PART I: REACTING TO DAVID SHAPIRO
10:06 Critique of David Shapiro's Analysis
19:19 Reproducing the Experiment
35:50 David's Definition of Coherence
37:14 Does AI have “Temporal Urgency”?
40:32 Universal Values and AI Alignment
49:13 PART II: EXPLAINING THE PAPER
51:37 How The Experiment Works
01:11:33 Instrumental Values and Coherence in AI
01:13:04 Exchange Rates and AI Biases
01:17:10 Temporal Discounting in AI Models
01:19:55 Power Seeking, Fitness Maximization, and Corrigibility
01:20:20 Utility Control and Bias Mitigation
01:21:17 Implicit Association Test
01:28:01 Emailing with the Paper’s Authors
01:43:23 My Takeaway

Show Notes

David’s source video: https://www.youtube.com/watch?v=XGu6ejtRz-0

The research paper: http://emergent-values.ai

Watch the Lethal Intelligence Guide, the ultimate introduction to AI x-risk! https://www.youtube.com/@lethal-intelligence

PauseAI, the volunteer organization I’m part of: https://pauseai.info

Join the PauseAI Discord (https://discord.gg/2XXWXvErfA) and say hi to me in the #doom-debates-podcast channel!

Doom Debates’ Mission is to raise mainstream awareness of imminent extinction from AGI and build the social infrastructure for high-quality debate.

Support the mission by subscribing to my Substack at https://doomdebates.com and to https://youtube.com/@DoomDebates

Get full access to Doom Debates at lironshapira.substack.com/subscribe
