AI Breakdown

Arxiv paper - HD-EPIC: A Highly-Detailed Egocentric Video Dataset

26 Mar 2025

Description

In this episode, we discuss HD-EPIC: A Highly-Detailed Egocentric Video Dataset by Toby Perrett, Ahmad Darkhalil, Saptarshi Sinha, Omar Emara, Sam Pollard, Kranti Parida, Kaiting Liu, Prajwal Gatti, Siddhant Bansal, Kevin Flanagan, Jacob Chalk, Zhifan Zhu, Rhodri Guerrier, Fahd Abdelazim, Bin Zhu, Davide Moltisanti, Michael Wray, Hazel Doughty, Dima Damen. The paper introduces HD-EPIC, a 41-hour dataset of egocentric kitchen videos collected in diverse home environments and annotated in detail with 3D-grounded labels covering recipe steps, actions, ingredients, and audio events. It includes a challenging visual question answering benchmark of 26,000 questions, on which Gemini Pro achieves only 38.5% accuracy, underscoring both the difficulty of the dataset and the limitations of existing vision-language models. HD-EPIC also supports tasks such as action recognition and video-object segmentation, making it a useful resource for improving understanding of real-world kitchen scenarios.
