AI Breakdown

arxiv preprint - Infini-gram: Scaling Unbounded n-gram Language Models to a Trillion Tokens

01 Feb 2024

Description

In this episode, we discuss Infini-gram: Scaling Unbounded n-gram Language Models to a Trillion Tokens by Jiacheng Liu, Sewon Min, Luke Zettlemoyer, Yejin Choi, and Hannaneh Hajishirzi. The paper introduces "Infini-gram," an n-gram language model that scales to 1.4 trillion tokens and can use n-grams of arbitrary length. To make this practical, the authors build a suffix-array-powered engine, also called infini-gram, that computes probabilities for these long n-grams quickly at query time instead of pre-computing count tables. The authors show the framework's utility in two ways: it improves the performance of neural large language models, and it reveals limitations in machine-generated text. The engine is available as an open-source tool for further research.
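To make the suffix-array idea concrete, here is a minimal Python sketch of counting n-grams on the fly and backing off to the longest context that appears in the corpus. It is a toy illustration, not the authors' engine: the function names (`build_suffix_array`, `count_occurrences`, `infgram_prob`), the naive construction, and the exact backoff rule are assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def build_suffix_array(tokens: Sequence[str]) -> List[int]:
    """Return start positions of all suffixes, sorted lexicographically.

    Naive construction (sorts full suffix copies); fine for a toy corpus only.
    """
    tokens = list(tokens)
    return sorted(range(len(tokens)), key=lambda i: tokens[i:])


def count_occurrences(tokens: Sequence[str], sa: List[int], pattern: Sequence[str]) -> int:
    """Count occurrences of `pattern` with two binary searches over the suffix array."""
    tokens, pattern = list(tokens), list(pattern)
    m = len(pattern)
    if m == 0:
        return len(tokens)  # the empty pattern matches at every position

    # First suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if tokens[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # First suffix whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if tokens[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return lo - start


def infgram_prob(
    tokens: Sequence[str], sa: List[int], context: Sequence[str], token: str
) -> Tuple[float, int]:
    """Estimate P(token | context) using the longest context suffix found in the corpus.

    Returns (probability, length of the context suffix actually used).
    Assumed backoff rule for this sketch: drop tokens from the left of the
    context until the remaining suffix has a nonzero count.
    """
    if len(tokens) == 0:
        raise ValueError("corpus must not be empty")
    context = list(context)
    for start in range(len(context) + 1):
        suffix = context[start:]
        denom = count_occurrences(tokens, sa, suffix)
        if denom > 0:
            numer = count_occurrences(tokens, sa, suffix + [token])
            return numer / denom, len(suffix)
    raise AssertionError("unreachable: the empty suffix always matches")


corpus = "the cat sat on the mat the cat ran".split()
sa = build_suffix_array(corpus)
print(infgram_prob(corpus, sa, ["on", "the", "cat"], "sat"))  # (0.5, 2)
```

In the example, "on the cat" never occurs in the corpus, so the sketch backs off to "the cat", which occurs twice and is followed by "sat" once. No count table is stored; every count comes from a binary search over the sorted suffix positions, which is the property the description highlights.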
