
AI Breakdown

Learning When to Plan: Efficiently Allocating Test-Time Compute for LLM Agents

08 Sep 2025

Description

In this episode, we discuss "Learning When to Plan: Efficiently Allocating Test-Time Compute for LLM Agents" by Davide Paglieri, Bartłomiej Cupiał, Jonathan Cook, Ulyana Piterbarg, Jens Tuyls, Edward Grefenstette, Jakob Nicolaus Foerster, Jack Parker-Holder, and Tim Rocktäschel. The paper introduces a framework that lets large language model agents decide dynamically when to plan during task execution, improving both efficiency and performance. The authors propose a two-stage training process that combines supervised fine-tuning with reinforcement learning to develop this capability. Their experiments show that agents trained to plan dynamically are more sample-efficient, are better at achieving complex goals, and can be guided by human-written plans.


