AI: post transformers

RetNet: Retentive Network, a Successor to the Transformer for Large Language Models

02 Nov 2025

Description

The August 9, 2023 paper introduces the **Retentive Network (RetNet)**, a proposed foundation architecture for large language models intended to succeed the **Transformer**. RetNet aims to overcome the Transformer's inefficiency at inference time by simultaneously achieving **training parallelism**, **low-cost inference**, and **strong performance**, a combination previously considered an "impossible triangle." The core of RetNet is the **retention mechanism**, which supports three computation paradigms (**parallel, recurrent, and chunkwise recurrent**), enabling efficient training and constant-time, O(1)-per-token inference. This yields significant reductions in GPU memory and latency and higher throughput compared to the Transformer. Experimental results across various model sizes and tasks show that RetNet is competitive in performance and more efficient than the Transformer in both training and deployment.

Source: https://arxiv.org/pdf/2307.08621
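
To make the parallel/recurrent equivalence concrete, here is a minimal NumPy sketch of a single, heavily simplified retention head: a scalar decay `gamma`, one head, and none of the paper's xPos-style rotation, group normalization, gating, or multi-scale decay. The function names are illustrative, not taken from the paper's code.

```python
import numpy as np

def parallel_retention(Q, K, V, gamma):
    # Parallel form (used for training): (Q K^T ⊙ D) V, where D is a
    # causal decay mask with D[n, m] = gamma**(n - m) for n >= m, else 0.
    T = Q.shape[0]
    n = np.arange(T)[:, None]
    m = np.arange(T)[None, :]
    D = np.where(n >= m, gamma ** (n - m), 0.0)
    return (Q @ K.T * D) @ V

def recurrent_retention(Q, K, V, gamma):
    # Recurrent form (used for inference): the state is updated as
    # S_n = gamma * S_{n-1} + K_n^T V_n and the output is Q_n S_n,
    # so each new token costs O(1) regardless of sequence length.
    d_k, d_v = Q.shape[1], V.shape[1]
    S = np.zeros((d_k, d_v))
    out = []
    for n in range(Q.shape[0]):
        S = gamma * S + np.outer(K[n], V[n])
        out.append(Q[n] @ S)
    return np.stack(out)

# Both paradigms produce the same output (up to floating-point error).
rng = np.random.default_rng(0)
T, d = 8, 4
Q, K, V = rng.standard_normal((3, T, d))
gamma = 0.9
assert np.allclose(parallel_retention(Q, K, V, gamma),
                   recurrent_retention(Q, K, V, gamma))
```

The parallel form keeps training as parallelizable as standard attention, while the recurrent form carries only a fixed-size state `S` from token to token, which is where the constant-memory, O(1)-per-token inference claimed in the paper comes from.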
