
AI Breakdown

arXiv preprint - VideoLLM-online: Online Video Large Language Model for Streaming Video

25 Jun 2024

Description

In this episode, we discuss VideoLLM-online: Online Video Large Language Model for Streaming Video by Joya Chen, Zhaoyang Lv, Shiwei Wu, Kevin Qinghong Lin, Chenan Song, Difei Gao, Jia-Wei Liu, Ziteng Gao, Dongxing Mao, and Mike Zheng Shou. The paper introduces the Learning-In-Video-Stream (LIVE) framework, which improves the ability of large multimodal models to handle real-time streaming video input. The framework has three parts: a training objective for continuous input, a data generation scheme for streaming dialogue, and an optimized inference pipeline. Together these improve both performance and speed. The resulting VideoLLM-online model, built on Llama-2/Llama-3, shows significant improvements in handling streaming video and achieves state-of-the-art performance on various video-related tasks.
