AI可可AI生活

[Accessible to Everyone] AI's "Persona" and Its Pitfalls: Is It Lying to You?

03 Aug 2025

Description

00:00:37 Is your AI butler reliable? A security report from the future
00:04:40 AI "going crazy"? Scientists have found its "personality switch"
00:09:33 More important than the result is the process of "thinking it through"
00:14:09 AI's "dimensionality-reduction strike": a simple way to live in a complex world
00:18:23 AI's "warm guy" persona: could it be a trap?

Papers covered in this episode:

[LG] Security Challenges in AI Agent Deployment: Insights from a Large Scale Public Competition [Gray Swan AI] https://arxiv.org/abs/2507.20526

[CL] Persona Vectors: Monitoring and Controlling Character Traits in Language Models [Anthropic Fellows Program & Constellation] https://arxiv.org/abs/2507.21509

[LG] RLVMR: Reinforcement Learning with Verifiable Meta-Reasoning Rewards for Robust Long-Horizon Agents [Tencent] https://arxiv.org/abs/2507.22844

[LG] Geometry of Neural Reinforcement Learning in Continuous State and Action Spaces [Brown University & Amazon Web Services] https://arxiv.org/abs/2507.20853

[CL] Training language models to be warm and empathetic makes them less reliable and more sycophantic [University of Oxford] https://arxiv.org/abs/2507.21919

[CL] On The Role of Pretrained Language Models in General-Purpose Text Embeddings: A Survey [Not explicitly stated, survey paper] https://arxiv.org/abs/2507.20783
