This September 2025 paper describes THOR (Tool-Integrated Hierarchical Optimization via RL), an approach designed to enhance the mathematical reasoning and code generation capabilities of Large Language Models (LLMs) by integrating external code-execution tools. The methodology introduces TIRGen, a pipeline for creating high-quality Tool-Integrated Reasoning (TIR) data, which is crucial for training the model with a hierarchical reinforcement learning (RL) strategy. This RL framework combines trajectory-level optimization, which targets overall problem-solving ability, with step-level optimization, which corrects code generation errors, and so addresses the sparse reward problem common in long reasoning tasks. Experimental results demonstrate that THOR achieves state-of-the-art (SOTA) performance across various mathematical and code benchmarks for both reasoning and non-reasoning models. Finally, the system uses code execution feedback in a self-correction mechanism at inference time, which further improves performance, especially on more challenging problems.
Source: https://arxiv.org/pdf/2509.13761