本期的精华内容: R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement LearningR1-搜索器:通过强化学习激励LLMs的搜索能力通过强化学习教大型语言模型自己查资料,提升了回答知识密集型问题的能力。 Knowledge Updating? No More Model Editing! Just Selective Contextual Reasoning知识更新?不再编辑模型!只需选择性的情境推理提出了SCR框架,用外部知识作为“参考书”,让模型动态更新知识,不用改参数。 HieroLM: Egyptian Hieroglyph Recovery with Next Word Prediction Language Model埃及象形文字恢复与下一词预测语言模型把象形文字恢复变成猜词游戏,用语言模型帮考古学家恢复古文字。 Leveraging Domain Knowledge at Inference Time for LLM Translation: Retrieval versus Generation利用推理时间内的领域知识,与LLM 翻译:检索与生成发现翻译示例比字典更有效,外找的例子比自编的强,提升了专业领域的翻译质量。 Self-Evolved Preference Optimization for Enhancing Mathematical Reasoning in Small Language Models自我进化的偏好优化,以增强小型语言模型中的数学推理用SPHERE框架让小模型自学数学推理,缩小了和大模型的差距。完整推介:https://mp.weixin.qq.com/s/mvgMGFcwXPt0TczmCVMrlg
No persons identified in this episode.
This episode hasn't been transcribed yet
Help us prioritize this episode for transcription by upvoting it.
Popular episodes get transcribed faster
Other recent transcribed episodes
Transcribed and ready to explore now
SpaceX Said to Pursue 2026 IPO
10 Dec 2025
Bloomberg Tech
Don’t Call It a Comeback
10 Dec 2025
Motley Fool Money
Japan Claims AGI, Pentagon Adopts Gemini, and MIT Designs New Medicines
10 Dec 2025
The Daily AI Show
Eric Larsen on the emergence and potential of AI in healthcare
10 Dec 2025
McKinsey on Healthcare
What it will take for AI to scale (energy, compute, talent)
10 Dec 2025
Azeem Azhar's Exponential View
Reducing Burnout and Boosting Revenue in ASCs
10 Dec 2025
Becker’s Healthcare -- Spine and Orthopedic Podcast