18 - Concept Extrapolation with Stuart Armstrong - AXRP - the AI X-risk Research Podcast

Description

Concept extrapolation is the idea of taking concepts an AI has about the world - say, "mass" or "does this picture contain a hot dog" - and extending them sensibly to situations where things are different - like learning that the world works via special relativity, or seeing a picture of a novel sausage-bread combination. For a while, Stuart Armstrong has been thinking about concept extrapolation and how it relates to AI alignment. In this episode, we discuss where his thoughts are at on this topic, what the relationship to AI alignment is, and what the open questions are.

Topics we discuss, and timestamps:
- 00:00:44 - What is concept extrapolation
- 00:15:25 - When is concept extrapolation possible
- 00:30:44 - A toy formalism
- 00:37:25 - Uniqueness of extrapolations
- 00:48:34 - Unity of concept extrapolation methods
- 00:53:25 - Concept extrapolation and corrigibility
- 00:59:51 - Is concept extrapolation possible?
- 01:37:05 - Misunderstandings of Stuart's approach
- 01:44:13 - Following Stuart's work

The transcript: axrp.net/episode/2022/09/03/episode-18-concept-extrapolation-stuart-armstrong.html

Stuart's startup, Aligned AI: aligned-ai.com

Research we discuss:
- The Concept Extrapolation sequence: alignmentforum.org/s/u9uawicHx7Ng7vwxA
- The HappyFaces benchmark: github.com/alignedai/HappyFaces
- Goal Misgeneralization in Deep Reinforcement Learning: arxiv.org/abs/2105.14111
