
AI可可AI生活

[Everyone Can Understand] AI's "Persona" and Its Pitfalls: Is It Lying to You?

03 Aug 2025

Description

00:00:37 Is your AI butler reliable? A security report from the future
00:04:40 AI "going crazy"? Scientists have found its "personality switch"
00:09:33 More important than the result is the process of "thinking it through"
00:14:09 AI's "dimensionality reduction strike": a simple way to live in a complex world
00:18:23 AI's "warm guy" persona: could it be a trap?

Papers covered in this episode:

[LG] Security Challenges in AI Agent Deployment: Insights from a Large Scale Public Competition  [Gray Swan AI]  https://arxiv.org/abs/2507.20526
---
[CL] Persona Vectors: Monitoring and Controlling Character Traits in Language Models  [Anthropic Fellows Program & Constellation]  https://arxiv.org/abs/2507.21509
---
[LG] RLVMR: Reinforcement Learning with Verifiable Meta-Reasoning Rewards for Robust Long-Horizon Agents  [Tencent]  https://arxiv.org/abs/2507.22844
---
[LG] Geometry of Neural Reinforcement Learning in Continuous State and Action Spaces  [Brown University & Amazon Web Services]  https://arxiv.org/abs/2507.20853
---
[CL] Training language models to be warm and empathetic makes them less reliable and more sycophantic  [University of Oxford]  https://arxiv.org/abs/2507.21919
---
[CL] On The Role of Pretrained Language Models in General-Purpose Text Embeddings: A Survey  [Not explicitly stated, survey paper]  https://arxiv.org/abs/2507.20783

