AI Breakdown
arxiv preprint - NEPTUNE: THE LONG ORBIT TO BENCHMARKING LONG VIDEO UNDERSTANDING
07 Oct 2024
In this episode, we discuss NEPTUNE: THE LONG ORBIT TO BENCHMARKING LONG VIDEO UNDERSTANDING by The authors of the paper "NEPTUNE: THE LONG ORBIT TO BENCHMARKING LONG VIDEO UNDERSTANDING" are: - Arsha Nagrani - Mingda Zhang - Ramin Mehran - Rachel Hornung - Nitesh Bharadwaj Gundavarapu - Nilpa Jha - Austin Myers - Xingyi Zhou - Boqing Gong - Cordelia Schmid - Mikhail Sirotenko - Yukun Zhu - Tobias Weyand. The paper introduces "Neptune," a semi-automatic system designed to generate complex question-answer-decoy sets from long video content to enhance comprehension tasks typically limited to short clips. Leveraging large models like Vision-Language Models and Large Language Models, Neptune creates detailed, time-aligned captions and intricate QA sets for videos up to 15 minutes long, aiming to improve annotation efficiency. The dataset emphasizes multimodal reasoning and introduces the GEM metric for evaluating responses, revealing current long video models' weaknesses in understanding temporal and state changes.
No persons identified in this episode.
This episode hasn't been transcribed yet
Help us prioritize this episode for transcription by upvoting it.
Popular episodes get transcribed faster
Other recent transcribed episodes
Transcribed and ready to explore now
SpaceX Said to Pursue 2026 IPO
10 Dec 2025
Bloomberg Tech
Don’t Call It a Comeback
10 Dec 2025
Motley Fool Money
Japan Claims AGI, Pentagon Adopts Gemini, and MIT Designs New Medicines
10 Dec 2025
The Daily AI Show
Eric Larsen on the emergence and potential of AI in healthcare
10 Dec 2025
McKinsey on Healthcare
What it will take for AI to scale (energy, compute, talent)
10 Dec 2025
Azeem Azhar's Exponential View
Reducing Burnout and Boosting Revenue in ASCs
10 Dec 2025
Becker’s Healthcare -- Spine and Orthopedic Podcast