AI Breakdown

NeurIPS 2023 - Evaluating Cognitive Maps and Planning in Large Language Models with CogEval

07 Oct 2023

Description

In this episode, we discuss "Evaluating Cognitive Maps and Planning in Large Language Models with CogEval" by Ida Momennejad, Hosein Hasanbeig, Felipe Vieira, Hiteshi Sharma, Robert Osazuwa Ness, Nebojsa Jojic, Hamid Palangi, and Jonathan Larson. The paper presents CogEval, a protocol for systematically evaluating the cognitive abilities of Large Language Models (LLMs). The authors note that previous studies claiming human-level cognitive abilities in LLMs lacked rigorous evaluation, and they propose CogEval as a framework to address that gap. They apply CogEval to assess the cognitive maps and planning skills of eight LLMs. The models perform well on simpler planning tasks but show significant failure modes, such as hallucinations and getting trapped in loops, which indicates a lack of understanding of the underlying cognitive structures.
