起業の履歴書

Can AI Be Controlled? [The Urgency of Interpretability]

07 May 2025

Description

In this episode, we discussed "The Urgency of Interpretability," an essay written by Anthropic CEO Dario Amodei.

[Table of Contents]
(00:00) This episode's theme: "How do we control AI?"
(00:21) This episode's reference: "The Urgency of Interpretability"
(00:41) AI is a black box
(02:18) Three problems that understanding AI's internal structure would solve
(04:15) The current state of model interpretability research
(09:39) A Superalignment technique: brain scans for AI models
(14:00) What humanity should do - 1. Invest more in AI safety research
(16:06) What humanity should do - 2. Create rules for AI safety
(16:42) What humanity should do - 3. Extend America's lead in AI development
(18:43) What humanity should do - Summary
(19:02) Summary
(19:51) Anthropic's alignment research

[References]
- https://www.darioamodei.com/post/the-urgency-of-interpretability
- https://openai.com/index/language-models-can-explain-neurons-in-language-models/
- https://transformer-circuits.pub/2025/attribution-graphs/biology.html
- https://www.anthropic.com/research/exploring-model-welfare
- https://youtu.be/pyXouxa0WnY?si=19FEKkB4Nt-MNS1U
- https://techstartups.com/2025/04/18/anthropic-backs-goodfire-in-50m-series-a-to-decode-ai-models-marking-first-ever-startup-investment/

[About 起業の履歴書]
起業の履歴書 is a channel that digs into the history and philosophy of entrepreneurs who built great companies, and of the companies themselves 📚, and talks about technology trends in startups 💻.

Hosts
- East Ventures 村上雄也 - https://twitter.com/yu8muraka3
- 伊藤工太郎 - https://twitter.com/etaroid

Podcast
- Spotify - https://open.spotify.com/show/5ryodBEEOn66Wk7H2Sl8zF
- Apple - https://podcasts.apple.com/jp/podcast/kigyo-no-rirekisho/id1767313170

[Contact]
For startup consultations or work requests, please use the inquiry form below 📋 or send a DM to the X accounts above 📮.
https://forms.gle/hYZojSEgvqis8Pys5
Requests for topics you would like us to cover are also welcome.
