
AI Breakdown

Arxiv paper - ZeroBench: An Impossible Visual Benchmark for Contemporary Large Multimodal Models

04 Mar 2025

Description

In this episode, we discuss ZeroBench: An Impossible Visual Benchmark for Contemporary Large Multimodal Models by Jonathan Roberts, Mohammad Reza Taesiri, Ansh Sharma, Akash Gupta, Samuel Roberts, Ioana Croitoru, Simion-Vlad Bogolin, Jialu Tang, Florian Langer, Vyas Raina, Vatsal Raina, Hanyi Xiong, Vishaal Udandarao, Jingyi Lu, Shiyang Chen, Sam Purkis, Tianshuo Yan, Wenye Lin, Gyungin Shin, Qiaochu Yang, Anh Totti Nguyen, Kai Han, and Samuel Albanie. The paper argues that Large Multimodal Models (LMMs) struggle with image interpretation and spatial reasoning, and that they often perform worse than young children or animals on such tasks. To probe this gap, the authors introduce ZeroBench, a visual reasoning benchmark of 100 carefully designed questions that current LMMs cannot solve, accompanied by 334 subquestions. All 20 models evaluated scored 0% on ZeroBench. The benchmark is publicly released to stimulate progress in visual understanding.

