NeurIPS 2023 - Evaluating Cognitive Maps and Planning in Large Language Models with CogEval - AI Breakdown

Description

In this episode, we discuss "Evaluating Cognitive Maps and Planning in Large Language Models with CogEval" by Ida Momennejad, Hosein Hasanbeig, Felipe Vieira, Hiteshi Sharma, Robert Osazuwa Ness, Nebojsa Jojic, Hamid Palangi, and Jonathan Larson. The paper presents CogEval, a protocol for systematically evaluating the cognitive abilities of large language models (LLMs). The authors note that previous studies claiming human-level cognitive abilities in LLMs lacked rigorous evaluation, and they propose CogEval as a framework to address that gap. Applying CogEval to the cognitive maps and planning skills of eight LLMs, they find that the models perform well on simpler planning tasks but show significant failure modes, such as hallucinations and getting trapped in loops, which indicates a lack of understanding of the underlying cognitive structures.
