AI可可AI生活
Episodes
AI Frontiers: From Simulating Prompting with Gradient Descent to a Data-Efficiency Revolution
28 Jun 2025
Contributed by Lukas
[CL] Can Gradient Descent Simulate Prompting?[MIT CSAIL]https://arxiv.org/abs/2506.20989---[CL] Potemkin Understanding in Large Language Models[MIT &a...
[Reflection] Two Rulers, Two Kinds of Life
27 Jun 2025
Contributed by Lukas
What determines the quality of your life is not how the world sees you, but how you see your own world.
Embroidering on an Elephant: Easier with a Different Stitch?
27 Jun 2025
Contributed by Lukas
[LG] Orthogonal Finetuning Made Scalable[Max Planck Institute for Intelligent Systems & University of Cambridge]arxiv.org
The "Layoff" Secret at AI Companies: Who Are the Real Key Players?
27 Jun 2025
Contributed by Lukas
[LG] Who Does What in Deep Learning? Multidimensional Game-Theoretic Attribution of Function of Neural Units[University Medical Center Eppendorf &...
AI Brain Swapping: How Do You Teach a Model New Skills in One Click?
27 Jun 2025
Contributed by Lukas
[LG] Command-V: Pasting LLM Behaviors via Activation Profiles[CMU]https://arxiv.org/abs/2506.19140
AI's Secret: Still Recognizable When Broken to Pieces?
27 Jun 2025
Contributed by Lukas
[CL] Broken Tokens? Your Language Model can Secretly Handle Non-Canonical Tokenizations[University of Washington]https://arxiv.org/abs/2506.19004
Masters at Work: Why Does a Relay Race Beat a Gathering of Stars?
27 Jun 2025
Contributed by Lukas
[LG] Chain-of-Experts: Unlocking the Communication Power of Mixture-of-Experts Models[Northwestern University]https://arxiv.org/abs/2506.18945
AI Frontiers: From Code Generation to Automated Research
27 Jun 2025
Contributed by Lukas
[CL] DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation[Apple]https://arxiv.org/abs/2506.20639---[LG] Language Model...
[Reflection] Your World Is Only as Big as Your "Prison"
26 Jun 2025
Contributed by Lukas
The biggest "prison" in this world is not built from steel and concrete; it is a product of our own thinking.
AI's "Model Student": Not Only Answers the Questions, but Also Writes the References
26 Jun 2025
Contributed by Lukas
[CL] Cite Pretrain: Retrieval-Free Knowledge Attribution for Large Language Models[Duke University & Meta]https://arxiv.org/abs/2506.17585
The "Dumb Diligence" of Smart People, and the Smarts of the "Dumb"
26 Jun 2025
Contributed by Lukas
[LG] In-Context Learning Strategies Emerge Rationally[Stanford University & Harvard University]https://arxiv.org/abs/2506.17859
The Secret to Smarter AI: Hire a Good Butler, Don't Just Add Staff
26 Jun 2025
Contributed by Lukas
[LG] Routing Mamba: Scaling State Space Models with Mixture-of-Experts Projection[Microsoft]https://arxiv.org/abs/2506.18145
The Secret to Smarter AI: Refresh the "Brain Circuits," Don't Expand the "Hard Drive"
26 Jun 2025
Contributed by Lukas
[LG] The 4th Dimension for Scaling Model Size[University of Illinois at Urbana-Champaign & University of Toronto]https://arxiv.org/abs/2506.18233
Large AI Models: Can Brute Force Really Work Miracles?
26 Jun 2025
Contributed by Lukas
[LG] These are Not All the Features You are Looking For: A Fundamental Bottleneck In Supervised Pretraining[Facebook AI Research (FAIR) at Meta & ...
[Reflection] Your "Bad Luck" Is the World's Private Lesson for You
25 Jun 2025
Contributed by Lukas
The moment you give a "new answer," you will find that the problem that tormented you over and over simply vanishes.
How Do We Get AI to "Tell Right from Wrong" Instead of "Gaming the System"?
25 Jun 2025
Contributed by Lukas
[LG] Robust Reward Modeling via Causal Rubrics[Google DeepMind]https://arxiv.org/abs/2506.16507
How Do We Know Whether an AI Has Truly "Understood"?
25 Jun 2025
Contributed by Lukas
[LG] Latent Concept Disentanglement in Transformer-based Language Models[Purdue University & University of Southern California]https://arxiv.org/a...
Is There Really Strength in Numbers?
25 Jun 2025
Contributed by Lukas
[CL] When Does Divide and Conquer Work for Long Context LLM? A Noise Decomposition Framework[University of Chicago & Together AI]https://arxiv.org...
Giving AI a "CT Scan": An Operating Manual from Scientists
25 Jun 2025
Contributed by Lukas
[LG] On the Theoretical Understanding of Identifiable Sparse Autoencoders and Beyond[Peking University & MIT]https://arxiv.org/abs/2506.15963
Raising an "AI Prodigy" Turns Out to Be Painstaking Work
25 Jun 2025
Contributed by Lukas
[CL] EvoLM: In Search of Lost Language Model Training Dynamics[Harvard & Stanford & EPFL]https://arxiv.org/abs/2506.16029
[Reflection] Letting Go of "Self-Justification": An Adult's Highest Form of Freedom
24 Jun 2025
Contributed by Lukas
True freedom begins with no longer explaining yourself. Only when you no longer need to prove anything to the world do you truly begin to own your own life.
AI Image Generation: How Can It Be Both Fast and Good?
24 Jun 2025
Contributed by Lukas
[CV] Align Your Flow: Scaling Continuous-Time Flow Map Distillation [NVIDIA] https://arxiv.org/abs/2506.14603
AI "Mind Reading": How Can We Trust a "Clever Brain"?
24 Jun 2025
Contributed by Lukas
[LG] Taming Polysemanticity in LLMs: Provable Feature Recovery via Sparse Autoencoders [Yale University & Shanghai Jiao Tong University]
Your AI Is Secretly "Cultivating": A Flat Highway to Infinity
24 Jun 2025
Contributed by Lukas
[LG] Flat Channels to Infinity in Neural Loss Landscapes [EPFL & Flatiron Institute] https://arxiv.org/abs/2506.14951
From Rote Memorization to True Understanding: AI's Secret to "Getting It"
24 Jun 2025
Contributed by Lukas
[LG] GrokAlign: Geometric Characterisation and Acceleration of Grokking [Rice University & Brown University] https://arxiv.org/abs/250...
[Reflection] Masters at Work: How to Cut Out the Inner "I Want It All"?
23 Jun 2025
Contributed by Lukas
True freedom has never meant having unlimited choices.
AI Parenting: Are Star Students Made by "Drilling Problems" or by "Trial and Error"?
23 Jun 2025
Contributed by Lukas
[CL] AceReason-Nemotron 1.1: Advancing Math and Code Reasoning through SFT and RL Synergy [NVIDIA] https://arxiv.org/abs/2506.13284
A New Map of the AI Black Box
23 Jun 2025
Contributed by Lukas
[LG] Random Matrix Theory for Deep Learning: Beyond Eigenvalues of Linear Models [Huazhong University of Science and Technology & UC Berkeley...
AI Programmers "Ascending to the Pantheon"? Hold On, Check the "Olympiad Champions'" Physical Exam First
23 Jun 2025
Contributed by Lukas
[LG] LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming? [New York University & Princeton University] ...
When Top Players Spar, Why Is the "Clumsy Method" Sometimes More Effective?
23 Jun 2025
Contributed by Lukas
[LG] Is your batch size the problem? Revisiting the Adam-SGD gap in language modeling [Max Planck Institute for Intelligent Systems] https...
[Reflection] How Do You Kill Off the Self That Says "I'll Do It Tomorrow"?
22 Jun 2025
Contributed by Lukas
You will find that what decides who you become in five or ten years is not at all those grand plans written on paper or stored in your phone, but what you...
An AI Black Box? Just Have a Chat with It
22 Jun 2025
Contributed by Lukas
[LG] Because we have LLMs, we Can and Should Pursue Agentic Interpretability[Google DeepMind]https://arxiv.org/abs/2506.12152
Making Tools by Talking: AI Customization Enters the "Plain-Speech Era"
22 Jun 2025
Contributed by Lukas
[LG] Text-to-LoRA: Instant Transformer Adaption[Sakana AI]https://arxiv.org/abs/2506.06105
The Secret of AI Image Generation: Does Creativity Come from "Failing to Learn"?
22 Jun 2025
Contributed by Lukas
[LG] On the Closed-Form of Flow Matching: Generalization Does Not Arise from Target Stochasticity[CNRS]https://arxiv.org/abs/2506.03719
AI's Memory Revolution: How to Remember Only the Key Points and Forget the Rest?
22 Jun 2025
Contributed by Lukas
[CL] Don't Pay Attention[Avey AI]https://arxiv.org/abs/2506.11305
[Reflection] Stop "Collecting"! Turn Your Notes into a Printing Press for Ideas!
21 Jun 2025
Contributed by Lukas
Real knowledge management is not collecting but creating.
What Happens When AI Sets Its Own Rules?
21 Jun 2025
Contributed by Lukas
[LG] AutoRule: Reasoning Chain-of-thought Extracted Rule-based Rewards Improve Preference Learning[CMU]https://arxiv.org/abs/2506.15651
Your "Homegrown Tricks" Are Outdated: AI Is Building a "Strategy Toolbox"
21 Jun 2025
Contributed by Lukas
[LG] HeurAgenix: Leveraging LLMs for Solving Complex Combinatorial Optimization Challenges[Microsoft Research Asia]https://arxiv.org/abs/2506.15196
The "Jack-of-All-Trades" Neurons in the AI Brain: Bugs or Treasures?
21 Jun 2025
Contributed by Lukas
[LG] Dense SAE Latents Are Features, Not Bugs[MIT & ETH Zürich]https://arxiv.org/abs/2506.156
AI's "Hyperparameter Alchemy": A Forgotten Knob
21 Jun 2025
Contributed by Lukas
[LG] Optimal Embedding Learning Rate in LLMs: The Effect of Vocabulary Size[UC Berkeley & Microsoft Research]https://arxiv.org/abs/2506.15025
AI "Mind Reading": Will a Model's Brain Betray Its Secrets?
21 Jun 2025
Contributed by Lukas
[CL] Approximating Language Model Training Data from Weights[Cornell University]https://arxiv.org/abs/2506.155
[Reflection] How to Become a Skilled "Deliberate Chooser"?
20 Jun 2025
Contributed by Lukas
Not swept along by the noise of the outside world, not driven by inner inertia; knowing with full clarity what you want and what you don't, then going all in, becoming a clear...
AI's "Compulsion": Why Does It "Go Crazy" When You Stop Mid-Sentence?
19 Jun 2025
Contributed by Lukas
[CL] Sampling from Your Language Model One Byte at a Time[University of Washington]https://arxiv.org/abs/2506.14123
Why Does "Grasping the Key Points" Help AI Learn Faster?
19 Jun 2025
Contributed by Lukas
[LG] Transformers Learn Faster with Semantic Focus[IBM Research]https://arxiv.org/abs/2506.14095
AI's "Literacy" Revolution: One Step Closer to "Reading" the World?
19 Jun 2025
Contributed by Lukas
[CL] From Bytes to Ideas: Language Modeling with Autoregressive U-Nets[FAIR at Meta]https://arxiv.org/abs/2506.14761
AI's "Choice Paralysis": A Secret Passage to Higher Intelligence
19 Jun 2025
Contributed by Lukas
[CL] Reasoning with Exploration: An Entropy Perspective[RUC & MSRA & SJTU]https://arxiv.org/abs/2506.14758
The "70% Full" Wisdom of AI Training
19 Jun 2025
Contributed by Lukas
[LG] Less is More: Undertraining Experts Improves Model Upcycling[Université de Montréal & Concordia University]https://arxiv.org/abs/2506.14126
[Reflection] Your Efficiency Hides in Your Deadlines
19 Jun 2025
Contributed by Lukas
Work expands like a gas to fill whatever container of time it is given, so deliberately setting tight deadlines sparks peak efficiency and lets you become the master of your time...
Why Do You Always Feel That No One Truly Understands You?
18 Jun 2025
Contributed by Lukas
[LG] Wanting to Be Understood Explains the Meta-Problem of Consciousness[Google DeepMind]https://arxiv.org/abs/2506.12086
AI's "Clone Technique": Why Your Do-It-All Assistant Won't Easily "Forget Things" Anymore
18 Jun 2025
Contributed by Lukas
[CL] Multipole Attention for Efficient Long Context Reasoning[UC Berkeley]https://arxiv.org/abs/2506.13059
AI's Self-Cultivation: How Do We Give Machines the Ability to "Reflect"?
18 Jun 2025
Contributed by Lukas
[CL] Direct Reasoning Optimization: LLMs Can Reward And Refine Their Own Reasoning for Open-Ended Tasks[Microsoft]https://arxiv.org/abs/2506.13351
Are the Tools We Use to Spot AI Lies Themselves Reliable?
18 Jun 2025
Contributed by Lukas
[LG] Verifying the Verifiers: Unveiling Pitfalls and Potentials in Fact Verifiers[Yonsei University & Stanford University & University of Was...
Hiring AI a Good Tutor: The Less-Is-More Wisdom of Learning
18 Jun 2025
Contributed by Lukas
[CL] Refract ICL: Rethinking Example Selection in the Era of Million-Token Models[Google DeepMind]https://arxiv.org/abs/2506.12346
[Reflection] Just How Many "Possibilities" Does Your Life Have?
18 Jun 2025
Contributed by Lukas
Life's wisdom lies not in chasing a perfect, one-shot answer, but in always preserving more options and possibilities for yourself.
Glimpsing a Leopard Through a Tube, Yet Knowing the Whole Leopard?
17 Jun 2025
Contributed by Lukas
[LG] Spectral Estimation with Free Decompression[UC Berkeley & University of Melbourne]https://arxiv.org/abs/2506.11994
AI Evolution: How Do We Make Machines Think Like Masters?
17 Jun 2025
Contributed by Lukas
[LG] TreeRL: LLM Reinforcement Learning with On-Policy Tree Search[Tsinghua University & California Institute of Technology]https://arxiv.org/abs/...
AI Suddenly "Gets It"? Easy Now, This May Not Be the Intelligence You Think
17 Jun 2025
Contributed by Lukas
[CL] Large Language Models and Emergence: A Complex Systems Perspective[Santa Fe Institute]https://arxiv.org/abs/2506.11135
Giving AI a "Human Heart": How Do Machines Learn Humanity's "Biases"?
17 Jun 2025
Contributed by Lukas
[LG] Tversky Neural Networks: Psychologically Plausible Deep Learning with Differentiable Tversky Similarity[Stanford University]https://arxiv.org/abs...
AI's "One Killer Move": Train Once, and the Model Masters Everything?
17 Jun 2025
Contributed by Lukas
[CL] You Only Fine-tune Once: Many-Shot In-Context Fine-Tuning for Large Language Model[Google & University of Florida]https://arxiv.org/abs/2506...
[Reflection] Whose Frame of Reference Are You Living In?
17 Jun 2025
Contributed by Lukas
Much of modern anxiety comes from constant comparison. We are used to measuring ourselves by other people's achievements, and this "comparison" is like a thief of happiness, stealing what...
The Art of AI Laziness: Using a "Draft" as a Lever on a "Large Model"
17 Jun 2025
Contributed by Lukas
[CL] Draft-based Approximate Inference for LLMs[FuriosaAI & UW-Madison]https://arxiv.org/abs/2506.08373
The AI World's "Social Dilemma": Knowing How to Block People Isn't Enough
17 Jun 2025
Contributed by Lukas
[LG] On the Similarities of Embeddings in Contrastive Learning[Yonsei University]https://arxiv.org/abs/2506.09781
AI's "Fish or Bear Paw" Dilemma: Run Fast and Learn Well at the Same Time?
17 Jun 2025
Contributed by Lukas
[LG] Sequential-Parallel Duality in Prefix Scannable Models[MIT CSAIL & Technical University of Munich]https://arxiv.org/abs/2506.10918
AI's Ultimate Evolution: When Machines Learn to "Highlight" for Themselves
17 Jun 2025
Contributed by Lukas
[LG] Self-Adapting Language Models[MIT]https://arxiv.org/abs/2506.10943
AI's Secret to "Getting It": Not Being Smarter, but Knowing How to "Cut Corners"
17 Jun 2025
Contributed by Lukas
[LG] CoRT: Code-integrated Reasoning within Thinking[University of Science and Technology of China & Qwen Team & The Chinese University of Hon...
[Paper Reading] Intention-Conditioned Flow Occupancy Models
16 Jun 2025
Contributed by Lukas
[LG] Intention-Conditioned Flow Occupancy Models C Zheng, S Park, S Levine, B Eysenbach [Princeton University & UC Berkeley] This...
[Paper Reading] Solving Inequality Proofs with Large Language Models
16 Jun 2025
Contributed by Lukas
[LG] Solving Inequality Proofs with Large Language Models J Sheng, L Lyu, J Jin, T Xia... [Stanford University & UC Berkeley] This...
[Paper Reading] Reinforcement Learning Teachers for Test-Time Scaling
16 Jun 2025
Contributed by Lukas
[LG] Reinforcement Learning Teachers of Test Time Scaling E Cetin, T Zhao, Y Tang [Sakana AI] This paper, by proposing Reinforcement Learning Teachers (...
[Paper Reading] Branched Schrödinger Bridge Matching
16 Jun 2025
Contributed by Lukas
[LG] Branched Schrödinger Bridge Matching S Tang, Y Zhang, A Tong, P Chatterjee [Duke-NUS Medical School & Quebec AI Institute]  ...
[Paper Reading] The Diffusion Duality
15 Jun 2025
Contributed by Lukas
[LG] The Diffusion Duality S S Sahoo, J Deschenaux, A Gokaslan, G Wang, J Chiu, V Kuleshov [Cornell Tech & EPFL Lausanne] This paper...
AI Frontiers: From Parallel Thinking to Ultra-Fast Planning
13 Jun 2025
Contributed by Lukas
This episode of "TAI快报" dives into four frontier AI studies, offering fresh insights: 1. "Multiverse: Your Language Models Secretly Decid...
AI Frontiers: From Reasoning Extrapolation to Medical Imaging
12 Jun 2025
Contributed by Lukas
This episode of "TAI快报" dives into five frontier AI studies, offering fresh insights: "e3: Learning to Explore Enables Extrapolation o...
AI Frontiers: From Neural Network Learning to Agent Action
11 Jun 2025
Contributed by Lukas
This episode of "TAI快报" dives into five frontier AI studies, covering neural network learning theory, language model training, optimization techniques, model efficiency...
AI Frontiers: From Self-Adapting Models to the Future of Relational Reasoning
10 Jun 2025
Contributed by Lukas
This episode of "TAI快报" dives into five frontier AI studies, revealing new breakthroughs in model self-adaptation, efficiency gains, and complex data processing. Including:...
AI Frontiers: From Model Thinking to Code Evolution
09 Jun 2025
Contributed by Lukas
This episode of "TAI快报" dives into key advances from five frontier AI studies: 1. "When Does Closeness in Distribution Imply Representational Similarity? ...
AI Frontiers: Illusions and Breakthroughs in AI Reasoning
08 Jun 2025
Contributed by Lukas
This episode of "TAI快报" dives into five frontier AI studies, revealing the true face of AI reasoning and strategies for optimizing it, and looking ahead to robotics...
AI Frontiers: From Model Grafting to the Mystery of Forgetting
07 Jun 2025
Contributed by Lukas
This episode of "TAI快报" dives into the key content of five frontier AI papers: 1. "Exploring Diffusion Transformer Designs via Grafting" proposes the "grafting...
AI Frontiers: From Long-Text Generation to Autonomous Driving
06 Jun 2025
Contributed by Lukas
This episode of "TAI快报" dives into five frontier AI studies: 1. "Rectified Sparse Attention" uses periodic calibration to address long-text generation's...
AI Frontiers: From Language Bias to Training Mysteries
05 Jun 2025
Contributed by Lukas
This episode of "TAI快报" dives into five frontier AI studies, revealing hidden challenges and innovative breakthroughs in AI model design and training. First...
AI Frontiers: From Language Adaptation to Agent Thinking
04 Jun 2025
Contributed by Lukas
This episode of "TAI快报" dives into five major frontier AI studies, covering language model adaptation, capability evaluation, agent thinking, training strategies...
AI Frontiers: From Minutes-Scale Game Learning to the Memory Mystery of Language Models
03 Jun 2025
Contributed by Lukas
This episode of "TAI快报" dives into five of the latest AI research results, bringing frontier insights: 1. "Test-Time Training Done Right" uses large-chunk...
AI Frontiers: From Adam to the Performance Mystery of Preference Learning
02 Jun 2025
Contributed by Lukas
This episode of "TAI快报" dives into five frontier AI studies, covering optimization algorithms, reinforcement learning, academic tools, brain-inspired computing, and preference learning...
AI Frontiers: From Distribution Comparison to the "Wandering" Nature of Language Models
01 Jun 2025
Contributed by Lukas
This episode of "TAI快报" dives into five frontier AI papers, revealing several key advances: "Kernel Quantile Embeddings and Associated Proba...
AI Frontiers: Hidden Dynamics and Efficient Learning
31 May 2025
Contributed by Lukas
This episode of "TAI快报" dives into five frontier AI papers, revealing new perspectives on models' internal mechanisms and optimization strategies. Including: via dynamical systems...
AI Frontiers: From Entropy Management to the Secrets of Long Chains of Thought
30 May 2025
Contributed by Lukas
This episode of "TAI快报" dives into five major frontier studies on AI reasoning, revealing new paths to stronger AI "thinking." Including: using entropy management to...
AI Frontiers: From Language Steering to Self-Reflection
29 May 2025
Contributed by Lukas
This episode of "TAI快报" dives into five frontier AI papers, revealing advances in language model steering, reasoning reflection, self-training, and multilingual capability enhancement...
AI Frontiers: Small Models Can Be Smart Too; Data Selection Is the Trick
28 May 2025
Contributed by Lukas
This episode of "TAI快报" dives into five frontier AI studies: 1. "Small Models, Smarter Learning: The Power of Joint Task Training" reveals joint...
AI Frontiers: From Data Contamination Detection to Efficient Reasoning
27 May 2025
Contributed by Lukas
This episode of "TAI快报" dives into five frontier AI studies: 1. "How Can I Publish My LLM Benchmark Without Giving the True Answers Away?" proposes...
AI Frontiers: From Geometric Symmetry to Reasoning Control
26 May 2025
Contributed by Lukas
This episode of "TAI快报" dives into five frontier AI papers, bringing the following key insights: AdS-GNN - a Conformally Equivariant Graph Neural Netw...
AI Frontiers: From Adaptive Thinking to Pixel Reasoning
25 May 2025
Contributed by Lukas
This episode of "TAI快报" dives into five frontier AI studies: 1. "Thinkless: LLM Learns When to Think" proposes an adaptive reasoning framework that lets a language model...
AI Frontiers: From "Thinking, Fast and Slow" to the Frontiers of Text Watermarking
24 May 2025
Contributed by Lukas
This episode of "TAI快报" dives into five frontier AI studies: 1. "Incentivizing Dual Process Thinking for Efficient Large Language Model Reasoning" reveals...
AI Frontiers: From Soft Reasoning to Breakthroughs in Self-Designing Agents
23 May 2025
Contributed by Lukas
This episode of "TAI快报" dives into five frontier AI studies, covering text generation, reasoning optimization, learning from user feedback, training curriculum design, and multi-agent...
AI Frontiers: From Slow Thinking to Efficient Reasoning
22 May 2025
Contributed by Lukas
This episode of "TAI快报" focuses on the art of AI "thinking," diving into five frontier studies: 1. "Reward Reasoning Model" proposes that AI "think" before evaluating...
AI Frontiers: From Fractured Representations to Breakthroughs in Efficient Computation
21 May 2025
Contributed by Lukas
This episode of "TAI快报" dives into five frontier AI papers, offering fresh insights. First, "Questioning Representational Optimism in...
AI Frontiers: From Language Alignment to Game Modeling
20 May 2025
Contributed by Lukas
This episode of "TAI快报" dives into five frontier AI studies, spanning language model alignment, code optimization, image generation, game world modeling, and more...
AI Frontiers: How AI Pushes the Boundaries of Multilinguality, Sparse Learning, and Drug Design
19 May 2025
Contributed by Lukas
This episode of "TAI快报" dives into four frontier AI papers, covering directions including multilingual multimodality, sparse function learning, molecular design, and time awareness...
AI Frontiers: How AI Is Upending Math, Music, and Economics
18 May 2025
Contributed by Lukas
This episode of "TAI快报" dives into five frontier AI papers, covering AI in mathematics, audio generation, economic analysis, data filtering, and distributed training...
AI Frontiers: From Thermodynamics to Evolution
17 May 2025
Contributed by Lukas
This episode of "TAI快报" dives into key insights from five frontier AI papers: "Neural Thermodynamic Laws for Large Language Model Training" proposes neural...
AI Frontiers: From Self-Rewarding to Breakthroughs in Causal Reasoning
16 May 2025
Contributed by Lukas
This episode of "TAI快报" dives into five frontier AI papers, covering language models' autonomous learning, online learning for neural networks, context-processing mechanisms, machine...
AI Frontiers: From Perplexity to Reasoning, Unlocking Language Models' Secrets
15 May 2025
Contributed by Lukas
This episode of "TAI快报" dives into five frontier AI papers, revealing the latest on large language models' probabilistic consistency, reasoning ability, efficiency optimization, and alignment mechanisms...